Add "uncompressed" in hts_format_description() where appropriate #1656

Merged 1 commit on Jul 31, 2023

jmarshall
Member

In samtools/samtools#1884 we saw a BAM file downloaded via a web browser and ungzipped. htsfile reported this file as follows:

$ htsfile wgEncode-browser.bam
wgEncode-browser.bam:	BAM version 1 sequence data

In isolation it's not obvious that this is reporting that the BAM file is not BGZF-compressed as normal. (For usual BAM files, htsfile reports BAM version 1 compressed sequence data, but if you don't have one handy to compare…)

This PR adds “uncompressed” for uncompressed files in formats, like BAM and BCF, that are normally compressed, to make this clear. Thus:

$ htsfile wg*.bam
wgEncode-browser.bam:	BAM version 1 uncompressed sequence data
wgEncode-curl.bam:	BAM version 1 compressed sequence data
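
For readers without htslib to hand, the distinction being reported can be sketched in a few lines of Python. This is not htslib's implementation: `describe_bam` and `is_bgzf` are hypothetical helpers, the output strings simply copy the htsfile lines above, and the BGZF check assumes the `BC` extra subfield comes first in the gzip header (the usual layout, but not guaranteed).

```python
import zlib

BGZF_MAGIC = b"\x1f\x8b\x08\x04"  # gzip magic, deflate method, FEXTRA flag set
BAM_MAGIC = b"BAM\x01"            # first four bytes of a BAM stream


def is_bgzf(head: bytes) -> bool:
    """Rough BGZF test: gzip header whose first extra subfield is 'BC', length 2.

    Assumption: the 'BC' subfield is the first one in the extra field.
    """
    return head[:4] == BGZF_MAGIC and head[12:16] == b"BC\x02\x00"


def describe_bam(data: bytes) -> str:
    """Hypothetical helper: classify the start of a file, htsfile-style."""
    if data[:4] == BAM_MAGIC:
        # Raw BAM with no BGZF wrapper, e.g. after a browser ungzipped it.
        return "BAM version 1 uncompressed sequence data"
    if is_bgzf(data):
        try:
            # wbits=31 tells zlib to expect a gzip wrapper; only the first
            # BGZF block is inflated, which is enough to see the BAM magic.
            payload = zlib.decompressobj(31).decompress(data)
        except zlib.error:
            return "unknown"
        if payload[:4] == BAM_MAGIC:
            return "BAM version 1 compressed sequence data"
    return "unknown"


print(describe_bam(b"BAM\x01"))  # BAM version 1 uncompressed sequence data
```

Passing the first few kilobytes of each file (for example `open(path, "rb").read(65536)`) should be enough for this check, since a BGZF block is at most 64 KiB.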

For formats like BAM and BCF that are normally compressed, report
it explicitly when encountering such a file in raw uncompressed form.
See samtools/samtools#1884 for motivation.
whitwham merged commit 5098983 into samtools:develop on Jul 31, 2023
jmarshall deleted the highlight-uncompressed-bam branch on July 31, 2023, 20:42