RFC 2315, section 10.3, note #2:
2. Some content-encryption algorithms assume the
input length is a multiple of k octets, where k > 1, and
let the application define a method for handling inputs
whose lengths are not a multiple of k octets. For such
algorithms, the method shall be to pad the input at the
trailing end with k - (l mod k) octets all having value k -
(l mod k), where l is the length of the input. In other
words, the input is padded at the trailing end with one of
the following strings:
             01 -- if l mod k = k-1
          02 02 -- if l mod k = k-2
                      .
                      .
                      .
    k k ... k k -- if l mod k = 0
The padding can be removed unambiguously since all input is
padded and no padding string is a suffix of another. This
padding method is well-defined if and only if k < 256;
methods for larger k are an open issue for further study.
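The arithmetic in the note can be checked with a few lines of Python 3. This is only a sketch of the rule as quoted; the helper names `pad_length` and `padding_string` are made up for illustration and do not come from the RFC.

```python
def pad_length(l, k):
    """Number of padding octets for an input of l octets and block size k."""
    # The note says the method is well-defined only for k < 256, and k > 1 is assumed.
    if not 1 < k < 256:
        raise ValueError("k must satisfy 1 < k < 256")
    return k - (l % k)


def padding_string(l, k):
    """The padding itself: pad_length octets, each holding the value pad_length."""
    n = pad_length(l, k)
    return bytes([n]) * n


# Print the padding (as hex) for a few input lengths with k = 8.
k = 8
for l in (7, 6, 8, 0):
    print(l, padding_string(l, k).hex())
```

Note that an input whose length is already a multiple of k still gets a full block of padding, which is what makes removal unambiguous.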
And here's how to implement it in Python:
    class PKCS7Encoder():
        """
        Technique for padding a string as defined in RFC 2315, section 10.3,
        note #2
        """
        class InvalidBlockSizeError(Exception):
            """Raised for invalid block sizes"""
            pass

        def __init__(self, block_size=16):
            if block_size < 2 or block_size > 255:
                raise PKCS7Encoder.InvalidBlockSizeError('The block size must be ' \
                        'between 2 and 255, inclusive')
            self.block_size = block_size

        def encode(self, text):
            text_length = len(text)
            amount_to_pad = self.block_size - (text_length % self.block_size)
            if amount_to_pad == 0:
                amount_to_pad = self.block_size
            pad = chr(amount_to_pad)
            return text + pad * amount_to_pad

        def decode(self, text):
            pad = ord(text[-1])
            return text[:-pad]
Example use:
    >>> # basic use
    >>> encoder = PKCS7Encoder()
    >>> padded_value = encoder.encode('hi')
    >>> padded_value
    'hi\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e'
    >>> len(padded_value)
    16
    >>> encoder.decode(padded_value)
    'hi'

    >>> # empty string
    >>> padded_value = encoder.encode('')
    >>> padded_value
    '\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10'
    >>> len(padded_value)
    16
    >>> encoder.decode(padded_value)
    ''

    >>> # string that is longer than a single block
    >>> padded_value = encoder.encode('this string is long enough to span blocks')
    >>> padded_value
    'this string is long enough to span blocks\x07\x07\x07\x07\x07\x07\x07'
    >>> len(padded_value)
    48
    >>> len(padded_value) % 16
    0
    >>> encoder.decode(padded_value)
    'this string is long enough to span blocks'

    >>> # using the max block size
    >>> encoder = PKCS7Encoder(255)
    >>> padded_value = encoder.encode('hi')
    >>> len(padded_value)
    255
    >>> encoder.decode(padded_value)
    'hi'
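The class above works on text strings. In Python 3, data headed for a cipher is normally `bytes`, so here is a rough sketch of the same rule for `bytes` input. The function names `pkcs7_pad` and `pkcs7_unpad` are hypothetical, and the checks in `pkcs7_unpad` are my own assumption: the RFC note only describes how padding is added, not how a malformed pad should be rejected.

```python
def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Append n octets of value n so the result is a multiple of block_size."""
    if not 2 <= block_size <= 255:
        raise ValueError("block size must be between 2 and 255, inclusive")
    n = block_size - (len(data) % block_size)  # always in 1..block_size
    return data + bytes([n]) * n


def pkcs7_unpad(padded: bytes, block_size: int = 16) -> bytes:
    """Strip the padding, rejecting input that does not end in a valid pad."""
    # Assumption: padded input must be non-empty and block-aligned.
    if not padded or len(padded) % block_size != 0:
        raise ValueError("padded length must be a positive multiple of the block size")
    n = padded[-1]  # indexing bytes yields an int in Python 3
    # Assumption: every one of the last n octets must equal n.
    if not 1 <= n <= block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]
```

Usage mirrors the class: `pkcs7_unpad(pkcs7_pad(b'hi'))` gives back `b'hi'`.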
Please fix your code; it pads with integer byte blocks instead of hex as specified in the RFC.
    from binascii import hexlify, unhexlify

    def encode(self, text):
        ...
        pad = unhexlify('%02x' % amount_to_pad)
        ...

    def decode(self, text):
        ...
        pad = int(hexlify(text[-1]), 16)
        ...
I agree with Bojan's comment. Code compatible with the RFC:
    def encode(self, text):
        text_length = len(text)
        amount_to_pad = self.block_size - (text_length % self.block_size)
        if amount_to_pad == 0:
            amount_to_pad = self.block_size
        #pad = unhexlify('%02d' % amount_to_pad)
        pad = chr(amount_to_pad)
        return text + pad * amount_to_pad

    def decode(self, text):
        #pad = int(hexlify(text[-1]))
        pad = ord(text[-1])
        return text[:-pad]
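For anyone comparing the `chr`/`ord` and `hexlify`/`unhexlify` spellings from these comments, here is a quick Python 3 check that, as far as I can tell, they describe the same single byte. The helper names are made up for this sketch, and the `latin-1` encode step is only there because `chr()` returns a `str` in Python 3.

```python
from binascii import hexlify, unhexlify


def pad_byte_via_chr(n):
    # latin-1 maps code points 0-255 one-to-one onto single bytes.
    return chr(n).encode('latin-1')


def pad_byte_via_hex(n):
    # '%02x' formats n as two hex digits, which unhexlify turns into one byte.
    return unhexlify('%02x' % n)


# Both routes give the same byte for every legal pad value, and both
# ways of reading the value back agree.
for n in range(1, 256):
    assert pad_byte_via_chr(n) == pad_byte_via_hex(n) == bytes([n])
    assert int(hexlify(pad_byte_via_hex(n)), 16) == ord(chr(n)) == n

# '%02d' formats in decimal, so 14 becomes the digits "14", i.e. byte 0x14 (20).
assert unhexlify('%02d' % 14) == b'\x14'
```

The last assertion shows why the format string has to be `'%02x'` rather than `'%02d'` if the `unhexlify` route is used.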
Thank you, I have updated the code and provided some examples.