Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] compression streams writing wrong format #2458

Merged
merged 3 commits into from
Mar 24, 2021

Conversation

eseiler
Copy link
Member

@eseiler eseiler commented Mar 21, 2021

Documentation will be follow-up PR

@eseiler eseiler requested review from a team and simonsasse and removed request for a team March 21, 2021 16:18
@vercel
Copy link

vercel bot commented Mar 21, 2021

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/seqan/seqan3/2LHgxV53uxvANUxuLoUpzk9ebJmK
✅ Preview: https://seqan3-git-fork-eseiler-fix-gzbgzf-seqan.vercel.app

@codecov
Copy link

codecov bot commented Mar 21, 2021

Codecov Report

Merging #2458 (c39b046) into master (d7de7a5) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head c39b046 differs from pull request most recent head 217cdb1. Consider uploading reports for the commit 217cdb1 to get more accurate results
Impacted file tree graph

@@           Coverage Diff           @@
##           master    #2458   +/-   ##
=======================================
  Coverage   98.30%   98.30%           
=======================================
  Files         267      267           
  Lines       11063    11065    +2     
=======================================
+ Hits        10875    10877    +2     
  Misses        188      188           
Impacted Files Coverage Δ
include/seqan3/io/detail/misc_input.hpp 96.22% <ø> (ø)
include/seqan3/io/sam_file/output.hpp 100.00% <ø> (ø)
include/seqan3/io/detail/magic_header.hpp 100.00% <100.00%> (ø)
include/seqan3/io/detail/misc_output.hpp 94.11% <100.00%> (+0.78%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d7de7a5...217cdb1. Read the comment docs.

@simonsasse simonsasse linked an issue Mar 22, 2021 that may be closed by this pull request
Copy link
Member

@simonsasse simonsasse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good :)

@@ -70,3 +74,52 @@ TEST(misc, valid_compression_extensions)
EXPECT_TRUE(std::find(valid_compression.begin(), valid_compression.end(), "zst") != valid_compression.end());
#endif
}

template <typename compression_t>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally would find it better, if this would be in a new test test/unit/io/detail/misc_output_test.cpp

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

@@ -70,3 +74,52 @@ TEST(misc, valid_compression_extensions)
EXPECT_TRUE(std::find(valid_compression.begin(), valid_compression.end(), "zst") != valid_compression.end());
#endif
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the above seem like they belong to test/unit/io/detail/misc_input_test.cpp

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left it in misc_test (didnt rename it), because it does not use misc_input.hpp

Comment on lines 96 to 98
using compression_types = ::testing::Types<seqan3::detail::gz_compression,
seqan3::detail::bgzf_compression,
seqan3::detail::bz2_compression>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you double-check whether all of this compiles if gz, bgzf and bz2 aren't defined?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It did not but now it does

EXPECT_TRUE(seqan3::detail::starts_with(magic_header, this->expected_magic_header));
EXPECT_FALSE(seqan3::detail::bgzf_compression::validate_header(std::span{magic_header}));
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be easier to understand if this would be 3 test cases instead of 1 typed test as they have something in common, but are also quite different.

I mean the original problem was only gzip <=> bgzf

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's better when it's split.
I did some more convoluted stuff before, and it did save a lot of code to have a fixture.
Then I managed to simplify everything and did not re-evaluate the usage of a fixture


std::ifstream filestream{this->file_path()};
std::array<char, this->biggest_magic_header_size> magic_header{};
std::copy_n(std::istreambuf_iterator{filestream}, this->biggest_magic_header_size, magic_header.begin());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't something along the lines

std::vector<char> file_content{std::istreambuf_iterator{filestream}, std::istreambuf_iterator{}};

easier to understand?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup.

I needed to fix some stuff in our span implementation, mainly also stripping cv qualifiers in addition to ref when checking concepts and the deduction guide for ranges.

@eseiler eseiler force-pushed the fix/gz_bgzf branch 2 times, most recently from c3cb612 to d2071a9 Compare March 23, 2021 11:03
Copy link
Member

@marehr marehr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very Nice! Looks much better, I didn't double-check whether the span changes do line-up with the standard.

@marehr marehr merged commit c3883c5 into seqan:master Mar 24, 2021
eseiler added a commit to eseiler/seqan3 that referenced this pull request Mar 24, 2021
marehr pushed a commit that referenced this pull request Mar 25, 2021
* [DOC] Add changelog entry for #2458

* Update CHANGELOG.md

Co-authored-by: Lydia Buntrock <[email protected]>

Co-authored-by: Lydia Buntrock <[email protected]>
@eseiler eseiler deleted the fix/gz_bgzf branch May 14, 2021 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

gz output writes bgzf
3 participants