XM data corruption #29

Open
KirillKryukov opened this issue Jun 19, 2019 · 0 comments

The XM compressor still has a data corruption issue: compressing certain input and then decompressing it produces output that differs from the original file.

Test data size: 30,244 bytes
Test data link: http://kirill.med.u-tokai.ac.jp/data/temp/xm-repro-4-input.zip

Commands to reproduce:

Compress:
jsa.xm.compress --hashSize=11 --context=15 --limit=200 --threshold=0.15 --chance=20 --real=archive.xm original.fasta

Decompress:
jsa.xm.compress --hashSize=11 --context=15 --limit=200 --threshold=0.15 --chance=20 --decode=archive.xm --output=decompressed.fasta

Compare:
cmp original.fasta decompressed.fasta

This produces: original.fasta decompressed.fasta differ: byte 27512, line 274
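
Since `cmp` reports only the first mismatch, it may help to see how far the corruption extends. Below is a minimal sketch, not part of XM or jsa: the helper names `diff_offsets` and `line_of` are made up for illustration. It assumes both files have the same size and lists every differing byte using the same 1-based numbering that `cmp` prints.

```python
import sys


def diff_offsets(a: bytes, b: bytes) -> list[int]:
    """Return 1-based byte offsets (cmp-style) at which a and b differ.

    Only the common prefix length is compared; equal sizes are assumed.
    """
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


def line_of(data: bytes, offset: int) -> int:
    """Return the 1-based line number containing the given 1-based byte offset."""
    # Count the newlines that occur strictly before the byte at `offset`.
    return data.count(b"\n", 0, offset - 1) + 1


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name): python diffbytes.py original.fasta decompressed.fasta
    with open(sys.argv[1], "rb") as f:
        original = f.read()
    with open(sys.argv[2], "rb") as f:
        decoded = f.read()

    offsets = diff_offsets(original, decoded)
    print(f"{len(offsets)} differing byte(s)")
    for off in offsets:
        print(f"byte {off}, line {line_of(original, off)}: "
              f"{original[off - 1:off]!r} -> {decoded[off - 1:off]!r}")
```

Run against the two files above, the first reported offset should match the one from `cmp`; the remaining lines show whether the damage is a single byte or a longer run.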

The decompressed file has the correct size, but its sequence data is corrupted. The problem was found during testing for the Sequence Compression Benchmark.

Let me know if you need any additional information or help.
