Incompatibility with LZOP #67

Open
DeusExLibris opened this issue Feb 9, 2023 · 1 comment

@DeusExLibris

I am trying to decompress data that was compressed using the LZOP utility, but python-lzo does not seem to understand the file's header. In my testing, the incompatibility goes both ways: LZOP cannot read files written by python-lzo either.

To test this, I took a sample text file and compressed it using LZOP:
lzop -c -5 -o file.lzo file.txt

I then tried to decompress this using python-lzo:
import lzo
with open('file.lzo', 'rb') as file:
    data = file.read()
a = lzo.decompress(data)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
lzo.error: Header error - invalid compressed data

I then tried the reverse:
import lzo
with open('file.txt', 'rb') as file:  # read raw bytes rather than decoded text
    data = file.read()
b = lzo.compress(data, 5, 1)  # compression level 5, include header
with open('file.lzo', 'wb') as newFile:
    newFile.write(b)

I then tried to decompress this file with LZOP:
lzop -d file.lzo

lzop: file.lzo: not a lzop file

When I look at the two compressed files using hexdump, I see that the file compressed by python-lzo has a very limited header (7 bytes) that does not match anything in the header of the file compressed by LZOP. The LZOP file's header, on the other hand, matches the header definition found here - https://gist.github.com/jledet/1333896.

python-lzo header:
00000000 f0 01 f1 66 f2 00 02 |...f...|

LZOP header:
00000000 89 4c 5a 4f 00 0d 0a 1a 0a 10 40 20 a0 09 40 01 |.LZO......@ ..@.|
00000010 05 03 00 00 01 00 00 81 a4 63 e4 37 fd 00 00 00 |.........c.7....|
00000020 00 08 66 69 6c 65 2e 74 78 74 73 21 08 3a 00 04 |..file.txts!.:..|
00000030 00 00 00 01 04 b7 b8 02 e3 a6 00 02 |............|
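
For reference, here is a minimal sketch that tells the two formats apart from their first bytes. It assumes python-lzo's header is a single marker byte (0xf0 or 0xf1) followed by a 4-byte big-endian uncompressed length, which is my reading of its source rather than documented behaviour. sniff_lzo_format is a hypothetical helper name, not part of either tool.

```python
import struct

LZOP_MAGIC = b"\x89LZO\x00\r\n\x1a\n"  # first 9 bytes of the LZOP dump above


def sniff_lzo_format(head):
    """Guess the container format from the first bytes of a file."""
    if head.startswith(LZOP_MAGIC):
        return ("lzop", None)
    # Assumption: python-lzo writes a marker byte, then the uncompressed
    # length as a big-endian 32-bit integer.
    if len(head) >= 5 and head[0] in (0xF0, 0xF1):
        (length,) = struct.unpack(">I", head[1:5])
        return ("python-lzo", length)
    return ("unknown", None)
```

Under that assumption, the python-lzo dump above would decode as marker f0 plus length 0x01f166f2, and the trailing 00 02 would already be compressed data (the LZOP dump ends in the same two bytes).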

When I look in the source for python-lzo, I see that it has the code to process the type of header used by LZOP but I can't make it read or write files that have that header.

@DeusExLibris

For anyone looking at this and hoping for a solution: I was able to decompress a file compressed with LZOP by combining python-lzo with some code from python3_lzo_indexer - https://github.com/Orhideous/python3_lzo_indexer.

You have to use a modified form of the code in get_lzo_blocks() to extract a block:

Use read(compressed_blocksize) to read the compressed data block. This block can then be passed to lzo.decompress(block, 0, decompressed_blocksize) to obtain the decompressed block that can be written out or used in whatever way you need.
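
A rough sketch of that approach follows. This is not the code from python3_lzo_indexer. It is a simplified reader based on my understanding of the LZOP layout (the flag values and field order are assumptions taken from the header definition linked above and the lzop source), and read_lzop_header, iter_lzop_blocks and decompress_lzop are hypothetical names. Checksums are skipped rather than verified.

```python
import struct

LZOP_MAGIC = b"\x89LZO\x00\r\n\x1a\n"

# Flag bits as I understand them from the lzop source (assumptions).
F_ADLER32_D = 0x00000001  # adler32 of the uncompressed block is present
F_ADLER32_C = 0x00000002  # adler32 of the compressed block is present
F_H_EXTRA_FIELD = 0x00000040
F_CRC32_D = 0x00000100
F_CRC32_C = 0x00000200
F_H_FILTER = 0x00000800


def read_lzop_header(stream):
    """Consume the LZOP file header and return its flags word."""
    if stream.read(9) != LZOP_MAGIC:
        raise ValueError("not an lzop file")
    version, _lib_version = struct.unpack(">HH", stream.read(4))
    new_style = version >= 0x0940
    if new_style:
        stream.read(2)  # version needed to extract
    stream.read(1)  # method
    if new_style:
        stream.read(1)  # level
    (flags,) = struct.unpack(">I", stream.read(4))
    if flags & F_H_FILTER:
        stream.read(4)
    stream.read(8)  # mode, mtime (low)
    if new_style:
        stream.read(4)  # mtime (high)
    name_len = stream.read(1)[0]
    stream.read(name_len + 4)  # file name, header checksum
    if flags & F_H_EXTRA_FIELD:
        (extra_len,) = struct.unpack(">I", stream.read(4))
        stream.read(extra_len + 4)  # extra field, its checksum
    return flags


def iter_lzop_blocks(stream, flags, decompress):
    """Yield each decompressed block; `decompress` is e.g. lzo.decompress."""
    while True:
        (dst_len,) = struct.unpack(">I", stream.read(4))
        if dst_len == 0:  # a zero length marks the end of the data
            return
        (src_len,) = struct.unpack(">I", stream.read(4))
        compressed = src_len < dst_len
        skip = 4 * bool(flags & F_ADLER32_D) + 4 * bool(flags & F_CRC32_D)
        if compressed:
            skip += 4 * bool(flags & F_ADLER32_C) + 4 * bool(flags & F_CRC32_C)
        stream.read(skip)  # checksums are skipped, not verified
        block = stream.read(src_len)
        # Blocks that did not shrink are stored as-is.
        yield decompress(block, 0, dst_len) if compressed else block


def decompress_lzop(path):
    import lzo  # python-lzo, imported lazily so the parser works without it

    with open(path, "rb") as f:
        flags = read_lzop_header(f)
        return b"".join(iter_lzop_blocks(f, flags, lzo.decompress))
```

With this, decompress_lzop('file.lzo') should return the original contents of file.txt, provided the assumptions about the layout hold for your lzop version.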
