Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alignment configuration free end gaps #2032

Conversation

smehringer
Copy link
Member

@smehringer smehringer commented Aug 13, 2020

Resolves seqan/product_backlog/issues/96

The alignment_configurator adaption is probably not the smartest move but it is part of the old alignment and will be deleted anyway.

@smehringer smehringer requested review from a team and marehr and removed request for a team August 13, 2020 14:49
@smehringer smehringer changed the base branch from master to release-3.0.2 August 13, 2020 15:00
@smehringer smehringer force-pushed the alignment_configuration_free_end_gaps branch 3 times, most recently from 8fcd6e4 to c4186c5 Compare August 13, 2020 15:04
Copy link
Member

@marehr marehr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥇 For this very Ren[n]e like PR 🦺 👷‍♂️

This is just the first round, I have some problem to find the important changes. Even some https://www.youtube.com/watch?v=DU2bgWOln64 couldn't help me xD
I'll go commit-by-commit again, tomorrow.

Is this a follow-up PR of other changes that I'm maybe unaware of?

include/seqan3/alignment/all.hpp Outdated Show resolved Hide resolved
doc/tutorial/pairwise_alignment/index.md Outdated Show resolved Hide resolved
Comment on lines 113 to 116
The global alignment can be further refined by initialising the configuration element with
the free end gap specifiers. They specify whether or not gaps at the end of the sequences are penalised.
In SeqAn you can configure this behaviour for every end, namely
for leading and trailing gaps of the first and second sequence.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you reflow this section to make use of the 120 chars?

doc/tutorial/pairwise_alignment/index.md Outdated Show resolved Hide resolved
doc/tutorial/pairwise_alignment/index.md Outdated Show resolved Hide resolved
@@ -299,19 +277,26 @@ To make the configuration easier, we added a shortcut called seqan3::align_cfg::
\snippet doc/tutorial/pairwise_alignment/configurations.cpp include_edit
\snippet doc/tutorial/pairwise_alignment/configurations.cpp edit

The `edit_scheme` still has to be combined with an alignment method. When combining it
with the seqan3::align_cfg::method_global configuration element, the edit distance algorithm
can be further refined with free end gaps (see section `Global and semi-global alignment`).
Copy link
Member

@marehr marehr Aug 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there away to inter-reference this with doxygen?

Comment on lines +289 to +291
Using any other free end gap configuration will
disable the edit distance and fall back to the standard pairwise alignment and will not use the fast bitvector
algorithm.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reflow this section

### Refine edit distance

The edit distance can be further refined using seqan3::align_cfg::aligned_ends to also compute a semi-global alignment
and the seqan3::align_cfg::max_error configuration to give an upper limit of the allowed number of edits. If the
The edit distance can be further refined using the seqan3::align_cfg::max_error configuration to give an upper limit of the allowed number of edits. If the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really long all of a sudden?

!(method_global_cfg.free_end_gaps_sequence2_leading.get() ||
method_global_cfg.free_end_gaps_sequence2_trailing.get()) &&
(method_global_cfg.free_end_gaps_sequence1_leading.get() ==
method_global_cfg.free_end_gaps_sequence1_trailing.get()))
Copy link
Member

@marehr marehr Aug 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused, why is it .get in method_global_cfg.free_end_gaps_sequence1_leading.get()?

We said in the google docs that we want to have:

global.free_end_gaps_sequence1_leading = true/ false;
global.free_end_gaps_sequence2_leading = true/ false;
global.free_end_gaps_sequence1_trailing = true/ false;
global.free_end_gaps_sequence2_trailing = true/ false;

https://docs.google.com/document/d/1Lv7zxtDUgAeHGcnJ6IWFzgIbJ6rFJlUt2linAOjlIBg/edit#heading=h.6t469bw1p6zr

I don't find where you added these members in this PR, I must be blind.

EDIT:
AH these are strong-types, thus the .get(). Ah I thought that we internally only use the bools, since the member names already convey the meaning. But we didn't specify clearly what type global.free_end_gaps_sequence1_leading should be.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh you are right. Um, I just assumed to also use strong types but I have no strong feelings. Shall I change them to boolean?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this should be changed.

{
invalid_band |= (upper_diagonal < 0 || lower_diagonal > 0);
error_cause += " The first cell of the matrix is not enclosed by the band.";
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

@@ -100,23 +100,20 @@ class policy_alignment_matrix

bool invalid_band = upper_diagonal < lower_diagonal;
std::string error_cause = (invalid_band) ? " The upper diagonal is smaller than the lower diagonal." : "";
if constexpr (traits_t::with_free_end_gaps)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

@marehr
Copy link
Member

marehr commented Aug 14, 2020

Is every single commit "buildable"? If so could we split this PR into multiple ones, this would make it easier on the reviewers (I would review all of them, because I now know somehow what this big PR is about).

@smehringer
Copy link
Member Author

smehringer commented Aug 14, 2020

1st_place_medal For this very Ren[n]e like PR safety_vest construction_worker_man

I did my best.

This is just the first round, I have some problem to find the important changes. Even some https://www.youtube.com/watch?v=DU2bgWOln64 couldn't help me xD
I'll go commit-by-commit again, tomorrow.

Is this a follow-up PR of other changes that I'm maybe unaware of?

Yes sorry, it is the second part of seqan/product_backlog/issues/96, following PR #1938

Is every single commit "buildable"?

Yes

If so could we split this PR into multiple ones, this would make it easier on the reviewers (I would review all of them, because I now know somehow what this big PR is about).

Actually the commits do belong together quite strongly. In the first part in PR #1938
I just introduced the method_global configuration without using it and without initializing its free end gap members.
In this part I do (by commit):

  • Add the free and gap initialisation wherever it is needed in a method_global configuration
  • Adapt the "new alignment code" to actually use method_global instead of aligned_ends
  • Adapt the "old alignment code" to actually use method_global instead of aligned_ends
  • Delete all now unnecessary usages of the aligned_ends config
  • Documentation
  • Moving aligned_ends to detail::aligned_ends because it is still needed in the "old alignment code" (this probably fits better after adapt the old alignment code.. I can change that)

So I already tried to split this in commits although it is basically the single task of "replace aligned_ends with method_global".
I thought in single PRs the changes won't have much meaning but I can make single PR's if you like.

@smehringer smehringer requested a review from marehr August 14, 2020 13:01
@smehringer smehringer marked this pull request as draft August 18, 2020 05:46
@smehringer
Copy link
Member Author

superseeded by separate PRs

@smehringer smehringer closed this Sep 23, 2020
@smehringer smehringer deleted the alignment_configuration_free_end_gaps branch September 24, 2020 13:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Adapt the free end gap configuration of the alignment
2 participants