fix: lazy vcf overreading into next record #224
Merged
Hi, while using the lazy VCF reader, I think I found an issue with how fields are parsed. When a record does not contain all fields, `read_field` reads into the next line, so `record.buf` ends up containing the record in question as it was parsed, followed by the "raw" next record (because of the final `read_line` in `read_lazy_record`).

For example, if you modify the `test_read_lazy_record` test to have two records, `&b"sq0\t1\t.\tA\t.\t.\tPASS\t.\nsq0\t1\t.\tA\t.\t.\tPASS\t."[..]`, the test fails because `record.buf` is actually `sq01.A..PASS.\nsq01\t.\tA\t.\t.\tPASS\t.` (the samples field contains the next record).

To attempt a fix, I took perhaps the dumbest and least idiomatic approach :), but I thought I'd open it up to see whether you agree there's an issue and, if so, to get your thoughts on the best way to fix it.
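To illustrate the general idea (not necessarily what this PR or the library does), here is a minimal standalone sketch in which each field read reports whether it stopped at a tab or at the end of the line, so the caller never reads past the current record. The names `FieldEnd`, `FIXED_FIELDS`, and the signatures of `read_field` and `read_record` below are assumptions made up for this example; they only loosely mirror the functions mentioned above, and `\r\n` line endings are not handled.

```rust
use std::io::{self, BufRead};

/// How a field ended. Hypothetical type, used only in this sketch.
#[derive(Debug, PartialEq)]
enum FieldEnd {
    Tab,
    EndOfRecord, // newline or end of input
}

/// Appends one field to `dst`, consuming its delimiter but nothing after it.
fn read_field<R: BufRead>(reader: &mut R, dst: &mut Vec<u8>) -> io::Result<FieldEnd> {
    loop {
        let buf = reader.fill_buf()?;
        if buf.is_empty() {
            return Ok(FieldEnd::EndOfRecord);
        }
        match buf.iter().position(|&b| b == b'\t' || b == b'\n') {
            Some(i) => {
                let end = if buf[i] == b'\t' { FieldEnd::Tab } else { FieldEnd::EndOfRecord };
                dst.extend_from_slice(&buf[..i]);
                reader.consume(i + 1);
                return Ok(end);
            }
            None => {
                // No delimiter in the buffered data yet: take it all and keep going.
                let n = buf.len();
                dst.extend_from_slice(buf);
                reader.consume(n);
            }
        }
    }
}

/// Number of tab-delimited fields before the samples (assumed for this sketch).
const FIXED_FIELDS: usize = 8;

/// Reads one record into `buf` (fields concatenated) and returns the field count.
/// Returns 0 at end of input.
fn read_record<R: BufRead>(reader: &mut R, buf: &mut Vec<u8>) -> io::Result<usize> {
    buf.clear();
    let mut fields = 0;
    for _ in 0..FIXED_FIELDS {
        let start = buf.len();
        if read_field(reader, buf)? == FieldEnd::EndOfRecord {
            // The line ended early: stop here instead of reading the next line.
            if buf.len() > start || fields > 0 {
                fields += 1;
            }
            return Ok(fields);
        }
        fields += 1;
    }
    // Every fixed field ended with a tab, so the rest of this line is the samples.
    reader.read_until(b'\n', buf)?;
    if buf.last() == Some(&b'\n') {
        buf.pop();
    }
    Ok(fields + 1)
}
```

With the two-record input from the example above, each call to `read_record` in this sketch leaves only `sq01.A..PASS.` in `buf`, because the trailing read of the rest of the line happens only when the last fixed field was terminated by a tab.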
Thanks,