Here's the definition of PKCS7 padding, from RFC 2315, section 10.3, note #2:
2. Some content-encryption algorithms assume the
input length is a multiple of k octets, where k > 1, and
let the application define a method for handling inputs
whose lengths are not a multiple of k octets. For such
algorithms, the method shall be to pad the input at the
trailing end with k - (l mod k) octets all having value k -
(l mod k), where l is the length of the input. In other
words, the input is padded at the trailing end with one of
the following strings:
             01 -- if l mod k = k-1
          02 02 -- if l mod k = k-2
              .
              .
              .
    k k ... k k -- if l mod k = 0
The padding can be removed unambiguously since all input is
padded and no padding string is a suffix of another. This
padding method is well-defined if and only if k < 256;
methods for larger k are an open issue for further study.
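To make the rule concrete, here is a minimal sketch that computes just the padding string for a given input length. The helper name `padding_for` and the choice of k = 8 are assumptions made for illustration; neither comes from the RFC.

```python
def padding_for(length, k=8):
    """Return the padding octets for an input of `length` octets."""
    if not 1 < k < 256:
        raise ValueError("k must be between 2 and 255, inclusive")
    n = k - (length % k)  # always in 1..k, never 0
    return bytes([n]) * n  # n octets, each with value n

# The three cases listed in the RFC text, for k = 8:
for length in (7, 6, 8):
    print(length, padding_for(length).hex())
# 7 01
# 6 0202
# 8 0808080808080808
```

Note the last case: an input that is already a multiple of k still gets a full block of padding, which is what makes removal unambiguous.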
And here's how to implement it in Python:
class PKCS7Encoder():
    """
    Technique for padding a string as defined in RFC 2315, section 10.3,
    note #2
    """
    class InvalidBlockSizeError(Exception):
        """Raised for invalid block sizes"""
        pass

    def __init__(self, block_size=16):
        if block_size < 2 or block_size > 255:
            raise PKCS7Encoder.InvalidBlockSizeError('The block size must be '
                    'between 2 and 255, inclusive')
        self.block_size = block_size

    def encode(self, text):
        text_length = len(text)
        # Always between 1 and block_size, never 0: an input that already
        # fills a whole number of blocks gets one extra block of padding.
        amount_to_pad = self.block_size - (text_length % self.block_size)
        pad = chr(amount_to_pad)
        return text + pad * amount_to_pad

    def decode(self, text):
        # The last character says how many padding characters to strip.
        pad = ord(text[-1])
        return text[:-pad]
Example use:
>>> # basic use
>>> encoder = PKCS7Encoder()
>>> padded_value = encoder.encode('hi')
>>> padded_value
'hi\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e'
>>> len(padded_value)
16
>>> encoder.decode(padded_value)
'hi'
>>> # empty string
>>> padded_value = encoder.encode('')
>>> padded_value
'\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10'
>>> len(padded_value)
16
>>> encoder.decode(padded_value)
''
>>> # string that is longer than a single block
>>> padded_value = encoder.encode('this string is long enough to span blocks')
>>> padded_value
'this string is long enough to span blocks\x07\x07\x07\x07\x07\x07\x07'
>>> len(padded_value)
48
>>> len(padded_value) % 16
0
>>> encoder.decode(padded_value)
'this string is long enough to span blocks'
>>> # using the max block size
>>> encoder = PKCS7Encoder(255)
>>> padded_value = encoder.encode('hi')
>>> len(padded_value)
255
>>> encoder.decode(padded_value)
'hi'
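
The class above works on `str` values and its `decode` trusts whatever the last character says. If you are padding `bytes` (as you typically would before encrypting in Python 3), a stricter variant might look like the sketch below. The function names `pkcs7_pad` and `pkcs7_unpad` are hypothetical, and the checks in `pkcs7_unpad` are one reasonable reading of the RFC text, not something the RFC spells out.

```python
def _check_block_size(block_size):
    if not 2 <= block_size <= 255:
        raise ValueError("block size must be between 2 and 255, inclusive")


def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Append n octets of value n, where n = block_size - (len % block_size)."""
    _check_block_size(block_size)
    n = block_size - (len(data) % block_size)
    return data + bytes([n]) * n


def pkcs7_unpad(padded: bytes, block_size: int = 16) -> bytes:
    """Strip the padding, rejecting input that is not validly padded."""
    _check_block_size(block_size)
    if not padded or len(padded) % block_size != 0:
        raise ValueError("input must be a non-empty multiple of the block size")
    n = padded[-1]  # indexing bytes yields an int in Python 3
    if n < 1 or n > block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid PKCS7 padding")
    return padded[:-n]


# Round trip, including the empty input
assert pkcs7_unpad(pkcs7_pad(b"hi")) == b"hi"
assert pkcs7_unpad(pkcs7_pad(b"")) == b""
```

Checking every padding octet, not just the last one, means corrupted or truncated input raises an error instead of silently returning the wrong data.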