Using bgzf_read_small() breaks subsequent bgzf_useek() calls
#1798
Comments
I'd checked that the
Oops. I was sure I'd got both of those incremented, but I confess I had many alternative versions while benchmarking and I guess I lost something along the way. Sorry. I'll try and get a test that fails before applying the fix. Thanks for reporting it.
I spotted this while changing
bgzf_read_small(…, buf1, 5); // read five characters
bgzf_useek(…, 10); // seek to offset 10
bgzf_read_small(…, buf2, 3); // read another few characters; this one could be bgzf_read()
assert(buf2 is the three characters at offset 10…12);
This direct test case will fail because it will have read three characters from a different offset.
This bug crept in with samtools#1772 which was added since last release, so there is no regression. Fixes samtools#1798 with thanks to John Marshall
I was already working on something similar in test_bgzf.c, which I've now added to verify I could trigger the bug, and then also applied the trivial one-line fix. Thanks.
PR #1772 added a simplified inline version of bgzf_read() that uses inline code when the request can be satisfied directly from the buffer, otherwise punts to the real bgzf_read(). Very laudable. However I encountered seek problems when using it.

I rederived the inline function by starting with the code of the full bgzf_read(), assuming the invariant that length < fp->block_length - fp->block_offset, and simplifying the code accordingly. I ended up with a function that is fairly similar to bgzf_read_small() as added to htslib/bgzf.h but with some additions:

Because bgzf_read_small() does not currently update fp->uncompressed_address, subsequent bgzf_useek() calls may jump to the wrong location. And probably other functions use fp->uncompressed_address too and are affected.

The if block looks harder to deal with, because it calls bgzf_htell() which is private within bgzf.c. However it turns out that the invariant implies that this if will never be true, so in fact I should have simplified it away to nothing too. Phew.

So bgzf_read_small() just needs fp->uncompressed_address += length; added to it to make it equivalent to bgzf_read().

(I checked and believe there are no similar problems in bgzf_write_small().)
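To make the failure mode concrete, here is a minimal toy model of it. It is not htslib code: the struct toy_bgzf and the functions toy_read_small(), toy_useek() and toy_demo() are hypothetical names invented for illustration. The in-block seek logic is an assumption about how a seek might reuse the tracked uncompressed position. The sketch only shows why a fast-path read that advances block_offset without also advancing uncompressed_address can send a later seek to the wrong place.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the buffered state of a BGZF handle (hypothetical, not htslib's struct). */
typedef struct {
    const char *block;          /* current uncompressed block */
    int block_length;           /* number of valid bytes in block */
    int block_offset;           /* read position within block */
    long uncompressed_address;  /* read position within the uncompressed stream */
} toy_bgzf;

/* Fast-path read. update_address selects the fixed (1) or the buggy (0) behaviour. */
void toy_read_small(toy_bgzf *fp, char *dst, int length, int update_address)
{
    /* The invariant from the issue: the request fits inside the buffer. */
    assert(length < fp->block_length - fp->block_offset);
    memcpy(dst, fp->block + fp->block_offset, (size_t)length);
    fp->block_offset += length;
    if (update_address)
        fp->uncompressed_address += length;  /* the one-line fix */
}

/* Assumed in-block seek: trusts uncompressed_address to locate the block start.
 * Returns 0 on success, -1 if the target lies outside the buffered block. */
int toy_useek(toy_bgzf *fp, long uoffset)
{
    long block_start = fp->uncompressed_address - fp->block_offset;
    if (uoffset < block_start || uoffset >= block_start + fp->block_length)
        return -1;
    fp->block_offset += (int)(uoffset - fp->uncompressed_address);
    fp->uncompressed_address = uoffset;
    return 0;
}

/* Read 5 bytes, seek to offset 10, then read 3 bytes into out (NUL-terminated). */
void toy_demo(int update_address, char out[4])
{
    toy_bgzf fp = { "abcdefghijklmnopqrstuvwxyz", 26, 0, 0 };
    char skip[5];
    toy_read_small(&fp, skip, 5, update_address);
    toy_useek(&fp, 10);
    toy_read_small(&fp, out, 3, update_address);
    out[3] = '\0';
}
```

In this toy model, toy_demo(1, out) leaves "klm" in out, the three characters at offsets 10 to 12. toy_demo(0, out) leaves "pqr", because the stale address makes the seek land five bytes too far.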