pyutils.compress package

This subpackage includes code related to data compression.

Submodules

pyutils.compress.letter_compress module

This is a simple toy compression scheme that uses a custom alphabet of 32 characters, each of which can be represented in five bits instead of eight. It therefore losslessly reduces the size of data composed of only those characters by about 37.5% (to 5/8 of the original size, rounded up to a whole byte).
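The packing idea can be sketched as follows: 32 distinct values fit in five bits, so each character is written as a five-bit code and the bit stream is zero-padded to a byte boundary. This is a minimal illustration, not the module's actual implementation. The helper name `pack5` is hypothetical, and the code table (`a` through `z` as 1 to 26, space as 27, 0 reserved for padding) is an assumption inferred from the documented outputs further down; the other special characters are left out.

```python
import binascii


def pack5(text: str) -> bytes:
    """Pack lowercase letters and spaces into five bits per character.

    Assumed code table (not taken from the module source):
    'a'..'z' -> 1..26, ' ' -> 27, and 0 is used only as padding.
    """
    bits = ""
    for ch in text:
        if "a" <= ch <= "z":
            code = ord(ch) - ord("a") + 1
        elif ch == " ":
            code = 27
        else:
            raise ValueError(f"illegal character: {ch!r}")
        bits += format(code, "05b")  # five-bit, zero-padded binary string

    # Zero-pad the bit stream up to the next multiple of eight bits.
    bits += "0" * (-len(bits) % 8)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


print(binascii.hexlify(pack5("scot")))  # b'98df40'
```

Under these assumptions the sketch reproduces the hex strings shown in the `compress` examples on this page.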

pyutils.compress.letter_compress.compress(uncompressed: str) → bytes

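Compress a word sequence into a stream of bytes. The compressed form is about 5/8 the size of the original: each character takes five bits, and the result is zero-padded to a whole number of bytes. Words may contain only lowercase letters and the module's special_characters.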

Parameters:

uncompressed (str) – the uncompressed string to be compressed

Returns:

the compressed bytes

Raises:

ValueError – if the uncompressed text contains an illegal character

Return type:

bytes

>>> import binascii
>>> binascii.hexlify(compress('this is a test'))
b'a2133da67b0ee859d0'
>>> binascii.hexlify(compress('scot'))
b'98df40'
>>> binascii.hexlify(compress('scott'))  # Note the last byte
b'98df4a00'
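
The output lengths in these examples follow from five bits per character, rounded up to a whole byte. A small sketch of that arithmetic, using a hypothetical helper `packed_size` that is not part of the module:

```python
def packed_size(num_chars: int) -> int:
    """Bytes needed for num_chars symbols at five bits each, rounded up."""
    return (5 * num_chars + 7) // 8


for text in ("this is a test", "scot", "scott"):
    # Prints the input, its length in characters, and the packed length in bytes.
    print(repr(text), len(text), packed_size(len(text)))
```

`'scot'` needs 20 bits and fits in three bytes, while `'scott'` needs 25 bits, one more than three bytes can hold, which is why its output gains a fourth byte.
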
pyutils.compress.letter_compress.decompress(compressed: bytes) → str

Decompress a previously compressed stream of bytes back into its original form.

Parameters:

compressed (bytes) – the compressed data to decompress

Returns:

The decompressed string

Return type:

str

>>> import binascii
>>> decompress(binascii.unhexlify(b'a2133da67b0ee859d0'))
'this is a test'
>>> decompress(binascii.unhexlify(b'98df4a00'))
'scott'
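
The reverse direction can be sketched in the same spirit: read the bytes as one bit stream, take five bits at a time, and stop at the zero padding. The helper name `unpack5` is hypothetical, and the code table (`a` through `z` as 1 to 26, space as 27, 0 as padding) is an assumption inferred from the examples above rather than taken from the module source; the other special characters are left out.

```python
import binascii


def unpack5(data: bytes) -> str:
    """Unpack five-bit codes back into lowercase letters and spaces.

    Assumed code table (not taken from the module source):
    1..26 -> 'a'..'z', 27 -> ' ', and 0 marks trailing padding.
    """
    bits = "".join(format(byte, "08b") for byte in data)
    chars = []
    # Walk over complete five-bit groups; leftover bits are padding.
    for i in range(0, len(bits) - 4, 5):
        code = int(bits[i:i + 5], 2)
        if code == 0:  # a zero group can only be padding
            break
        if 1 <= code <= 26:
            chars.append(chr(ord("a") + code - 1))
        elif code == 27:
            chars.append(" ")
        else:
            raise ValueError(f"unsupported code in this sketch: {code}")
    return "".join(chars)


print(unpack5(binascii.unhexlify(b"98df4a00")))  # scott
```

Treating code 0 as padding is what lets the all-zero final byte of `b'98df4a00'` decode to nothing instead of an extra character.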

Module contents