-
Notifications
You must be signed in to change notification settings - Fork 82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
inline naive_kmer_hash and naive_miniser_hash #2967
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
Codecov Report
@@ Coverage Diff @@
## master #2967 +/- ##
=======================================
Coverage 98.22% 98.22%
=======================================
Files 267 267
Lines 11511 11511
=======================================
Hits 11307 11307
Misses 204 204 Continue to review full report at Codecov.
|
How do the benchmark timings change with this patch? |
The benchmarks are very similar, for some reason there is more change in current Version:
PR version
|
There doesn't seem to be too much change? We also do Lines 23 to 26 in e5c63f9
The value could also be In general, microbenchmarks are quite finicky and might even change when recompiling :) google benchmark also offers some options, here is an example I used for I/O:
|
auto subRangeEnd = subRangeBegin; | ||
for (size_t i{1}; i < k; ++i && subRangeEnd != end(seq)) | ||
++subRangeEnd; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
auto subRangeEnd = subRangeBegin; | |
for (size_t i{1}; i < k; ++i && subRangeEnd != end(seq)) | |
++subRangeEnd; | |
auto subRangeEnd = std::ranges::next(subRangeBegin, k - 1, end(seq)); |
Should be equivalent. Copies subRangeBegin
, increments it k-1
times, and uses end
as boundary.
The only thing I'm not sure about is the k-1
:D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, much nicer solution!
@@ -104,8 +103,22 @@ static void naive_kmer_hash(benchmark::State & state) | |||
|
|||
for (auto _ : state) | |||
{ | |||
for (auto h : seq | seqan3::views::naive_kmer_hash(k)) | |||
// creates a initial subrange covering exactly 'k-1' characters | |||
auto subRangeBegin = begin(seq); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming should be subrange_begin
, subrange_end
, I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! much better!
The actual goal is to remove the usage of ranges::views::sliding. This is easiest achieved by removing complete of naive_kmer_hash_fn. -A life Without Range-v3-
c90e78b
to
6772123
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am currently mainly working for the FONDA project, but I have time so I am happy to review some PRs.
So you have to live with some possibly uninformed questions. ;)
The way I understand it is this:
We want to get rid of range-v3
-> we want to get rid of ranges::views::sliding
-> the only place where it is used is in the naive_kmer_hash_fn
-> the naive_kmer_hash_fn
is only used in a single benchmark, so we reimplement its use there and delete it.
My questions:
- Is this really the only place where
ranges::views::sliding
is used in SeqAn3? I have no idea, but it seems to me like this functionality might be actually used a lot. If that's the case, wouldn't it be better to reimplement the sliding view and leave everything here as is? - Are we really sure that we will never need the naive kmer hash function for any other benchmarks? We could also keep the
naive_kmer_hash_fn
and replace the sliding view there with our implementation.
The overall goal is to remove the usage of ranges::views::sliding. This commit removes naive_minimiser_hash
There was a second place where Yes, you understand it correctly. This is the only place where This also removes only the "naive" approaches. If you actually want to use the k-mer hashes you would use one of the other views provided by seqan. Those are a lot faster. It still is possible to implement a sliding-view in the future. |
Just adding to what @SGSSGene said:
If you ever are busy with FONDA, you can just remove your request and reassign the team.
Don't worry, there are no stupid questions and your questions are perfectly valid :)
👍
It's the only place we use the range-v3 |
The actual goal is to remove the usage of ranges::views::sliding. This
is easiest achieved by removing complete of
naive_kmer_hash
andnaive_minimizer_hash
This is part of seqan/product_backlog#124
-A life Without Range-v3-