Alignment configuration free end gaps #2032
Closed
smehringer wants to merge 9 commits into seqan:release-3.0.2 from smehringer:alignment_configuration_free_end_gaps
Commits (9):
20abb40 [MISC] Change member variables of method_global to booleans. (smehringer)
445bf42 [MISC] Add free end gap configuration to method_global calls. (smehringer)
e32caab [MISC] Use method_global free_end gap config in policies. (smehringer)
10adfc3 rebase: change strong type usage to boolean in policies. (smehringer)
ae5f6a1 [MISC] Adapt the alignment_configurator to use the method_global free… (smehringer)
3f226d4 rebase: change strong type usage to boolean in configurator. (smehringer)
257b444 [MISC] Delete all usage of seqan3::align_cfg::aligned_ends. (smehringer)
365e905 [DOC] Adapt documentation of the new way to specify free end gaps. (smehringer)
e8ed941 [MISC] Move aligned_ends to align_cg::detail. (smehringer)
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -110,45 +110,23 @@ seqan3::align_cfg::method_global. | |
\remark The method configuration must be given by the user as it strongly depends on the application context. | ||
It would be wrong for us to assume what the intended default behaviour should be. | ||
|
||
The global alignment can be further refined by setting the seqan3::align_cfg::aligned_ends option. | ||
The seqan3::align_cfg::aligned_ends class specifies whether or not gaps at the end of the sequences are penalised. | ||
In SeqAn you can configure this behaviour for every end (front and back of the first sequence and second sequence) | ||
separately using the seqan3::end_gaps class. | ||
This class is constructed with up to 4 end gap specifiers (one for every end): | ||
|
||
- seqan3::front_end_first - aligning front of first sequence with a gap. | ||
- seqan3::back_end_first - aligning back of first sequence with a gap. | ||
- seqan3::front_end_second - aligning front of second sequence with a gap. | ||
- seqan3::back_end_second - aligning back of second sequence with a gap. | ||
|
||
These classes can be constructed with either a constant boolean (std::true_type or std::false_type) or a regular `bool` | ||
argument. The former enables static configuration of the respective features in the alignment algorithm. The | ||
latter allows to configure these features at runtime. This makes setting these values from runtime dependent parameters, | ||
e.g. user input, much easier. The following code snippet demonstrates the different use cases: | ||
|
||
\snippet doc/tutorial/pairwise_alignment/configurations.cpp include_aligned_ends | ||
\snippet doc/tutorial/pairwise_alignment/configurations.cpp aligned_ends | ||
|
||
The `cfg_1` and the `cfg_2` will result in the exact same configuration of the alignment where aligning the front of | ||
either sequence with gaps is not penalised while the back of both sequences is. The order of the arguments is | ||
irrelevant. Specifiers initialised with constant booleans can be mixed with those initialised with `bool` values. | ||
If a specifier for a particular sequence end is not given, it defaults to the specifier initialised with | ||
`std::false_type`. | ||
|
||
\note You should always prefer initialising the end-gaps specifiers using the boolean constants if possible | ||
as it reduces the compile time. The reason for this is that the runtime information is converted into static types | ||
for the alignment algorithm. For every end-gap specifier the compiler will generate two versions for the `true` and the | ||
`false` case. This adds up to 16 different paths the compiler needs to instantiate. | ||
|
||
SeqAn also offers \ref predefined_end_gap_configurations "predefined" seqan3::end_gaps configurations that | ||
cover the typical use cases. | ||
|
||
| Entity | Meaning | | ||
| -------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------| | ||
| \ref seqan3::end_gaps::free_ends_none "free_ends_none" | Enables the typical global alignment. | | ||
| \ref seqan3::end_gaps::free_ends_all "free_ends_all" | Enables overlap alignment, where the end of one sequence can overlap the end of the other sequence. | | ||
| \ref seqan3::end_gaps::free_ends_first "free_ends_first" | Enables semi global alignment, where the second sequence is aligned as an infix of the first sequence. | | ||
| \ref seqan3::end_gaps::free_ends_second "free_ends_second" | Enables semi global alignment, where the first sequence is aligned as an infix of the second sequence. | | ||
The global alignment can be further refined by initialising the configuration element with | ||
the free end gap specifiers. They specify whether gaps at the end of the sequences are penalised. | ||
In SeqAn you can configure this behaviour for every end, namely | ||
for leading and trailing gaps of the first and second sequence. | ||
seqan3::align_cfg::method_global is constructed with 4 free end gap specifiers (one for every end): | ||
|
||
- seqan3::align_cfg::free_end_gaps_sequence1_leading - If set to true, aligning leading gaps in first sequence is not penalised. | ||
- seqan3::align_cfg::free_end_gaps_sequence2_leading - If set to true, aligning leading gaps in second sequence is not penalised. | ||
- seqan3::align_cfg::free_end_gaps_sequence1_trailing - If set to true, aligning trailing gaps in first sequence is not penalised. | ||
- seqan3::align_cfg::free_end_gaps_sequence2_trailing - If set to true, aligning trailing gaps in second sequence is not penalised. | ||
|
||
The following code snippet demonstrates the different use cases: | ||
|
||
\snippet doc/tutorial/pairwise_alignment/configurations.cpp include_method | ||
\snippet doc/tutorial/pairwise_alignment/configurations.cpp method_global_free_end_gaps | ||
|
||
The order of arguments is fixed and must always be as shown in the example. | ||
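
To illustrate what the four free end gap flags mean for the dynamic programming matrix, here is a minimal, self-contained sketch. It does not use SeqAn: the `free_end_gaps` struct, the `align_score` function, and the scoring values (match 0, mismatch -1, linear gap -1) are assumptions made only for this example. A free leading gap initialises the corresponding matrix border with zero, and a free trailing gap allows the best score to be taken from anywhere in the last row or column.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper type (not part of SeqAn): one flag per sequence end,
// listed in the same order as the specifiers described above.
struct free_end_gaps
{
    bool sequence1_leading;
    bool sequence2_leading;
    bool sequence1_trailing;
    bool sequence2_trailing;
};

// Hypothetical function: returns the best alignment score of seq1 and seq2
// under an assumed scheme of match = 0, mismatch = -1, gap = -1.
int align_score(std::string const & seq1, std::string const & seq2, free_end_gaps const cfg)
{
    int const match = 0, mismatch = -1, gap = -1;
    std::size_t const n = seq1.size();
    std::size_t const m = seq2.size();
    std::vector<std::vector<int>> dp(n + 1, std::vector<int>(m + 1, 0));

    // Leading gaps in sequence 1: a prefix of seq2 is aligned against gaps.
    for (std::size_t j = 1; j <= m; ++j)
        dp[0][j] = cfg.sequence1_leading ? 0 : static_cast<int>(j) * gap;

    // Leading gaps in sequence 2: a prefix of seq1 is aligned against gaps.
    for (std::size_t i = 1; i <= n; ++i)
        dp[i][0] = cfg.sequence2_leading ? 0 : static_cast<int>(i) * gap;

    for (std::size_t i = 1; i <= n; ++i)
    {
        for (std::size_t j = 1; j <= m; ++j)
        {
            int const diagonal = dp[i - 1][j - 1] + (seq1[i - 1] == seq2[j - 1] ? match : mismatch);
            int const gap_in_seq2 = dp[i - 1][j] + gap;
            int const gap_in_seq1 = dp[i][j - 1] + gap;
            dp[i][j] = std::max({diagonal, gap_in_seq2, gap_in_seq1});
        }
    }

    int best = dp[n][m];

    // Trailing gaps in sequence 1: seq1 is consumed, the rest of seq2 is free.
    if (cfg.sequence1_trailing)
        for (std::size_t j = 0; j <= m; ++j)
            best = std::max(best, dp[n][j]);

    // Trailing gaps in sequence 2: seq2 is consumed, the rest of seq1 is free.
    if (cfg.sequence2_trailing)
        for (std::size_t i = 0; i <= n; ++i)
            best = std::max(best, dp[i][m]);

    return best;
}
```

With this sketch, aligning `"ACGT"` against `"TTACGTTT"` scores -4 when all flags are `false` (four penalised gaps), 0 when both flags for sequence 1 are `true` (the first sequence is aligned as an infix of the second), and -2 when only the leading flag for sequence 1 is `true`.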
|
||
\assignment{Assignment 2} | ||
|
||
|
@@ -161,8 +139,8 @@ would be aligned as an infix of the second sequence. | |
|
||
\include doc/tutorial/pairwise_alignment/pairwise_alignment_solution_2.cpp | ||
|
||
To accomplish our goal we simply add the align_cfg::aligned_ends option initialised with `free_ends_first` to the | ||
existing configuration. | ||
To accomplish our goal we initialise the `method_global` option with the free end specifiers | ||
for sequence 1 set to `true`, and those for sequence 2 with `false`. | ||
|
||
\endsolution | ||
|
||
|
@@ -299,19 +277,26 @@ To make the configuration easier, we added a shortcut called seqan3::align_cfg:: | |
\snippet doc/tutorial/pairwise_alignment/configurations.cpp include_edit | ||
\snippet doc/tutorial/pairwise_alignment/configurations.cpp edit | ||
|
||
The `edit_scheme` still has to be combined with an alignment method. When combining it | ||
with the seqan3::align_cfg::method_global configuration element, the edit distance algorithm | ||
can be further refined with free end gaps (see section `Global and semi-global alignment`). | ||
|
||
\attention Only the following free end gap configurations are supported for the | ||
global alignment configuration with the edit scheme: | ||
- no free end gaps (all free end gap specifiers are set to `false`) | ||
- free end gaps for the first sequence (free end gaps are set to `true` for the first and | ||
to `false` for the second sequence) | ||
Using any other free end gap configuration will | ||
disable the edit distance and fall back to the standard pairwise alignment and will not use the fast bitvector | ||
algorithm. | ||
Review comment on lines +289 to +291: reflow this section
||
|
||
### Refine edit distance | ||
|
||
The edit distance can be further refined using seqan3::align_cfg::aligned_ends to also compute a semi-global alignment | ||
and the seqan3::align_cfg::max_error configuration to give an upper limit of the allowed number of edits. If the | ||
The edit distance can be further refined using the seqan3::align_cfg::max_error configuration to give an upper limit of the allowed number of edits. If the | ||
Review comment: This is really long all of a sudden?
||
respective alignment could not find a solution within the given error bound, the resulting score is infinity | ||
(corresponds to std::numeric_limits::max). Also the alignment and the front and back coordinates can be computed using | ||
the align_cfg::result option. | ||
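
The effect of an error bound can be sketched without SeqAn as follows. This is only an illustration: `bounded_edit_distance` is a hypothetical helper, it reports the distance as a non-negative edit count (the library may report scores differently), and it uses the plain quadratic dynamic programming recurrence rather than the fast bitvector algorithm. If the distance exceeds the bound, it returns `std::numeric_limits<int>::max()` to stand in for "infinity".

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Hypothetical helper (not a SeqAn function): global edit distance between
// seq1 and seq2, or std::numeric_limits<int>::max() if it exceeds max_error.
int bounded_edit_distance(std::string const & seq1, std::string const & seq2, int const max_error)
{
    std::vector<int> previous(seq2.size() + 1);
    std::vector<int> current(seq2.size() + 1);

    // First row: aligning the empty prefix of seq1 costs one edit per character.
    for (std::size_t j = 0; j <= seq2.size(); ++j)
        previous[j] = static_cast<int>(j);

    for (std::size_t i = 1; i <= seq1.size(); ++i)
    {
        current[0] = static_cast<int>(i);
        for (std::size_t j = 1; j <= seq2.size(); ++j)
        {
            int const substitution = previous[j - 1] + (seq1[i - 1] == seq2[j - 1] ? 0 : 1);
            current[j] = std::min({substitution, previous[j] + 1, current[j - 1] + 1});
        }
        std::swap(previous, current);
    }

    int const distance = previous[seq2.size()];
    return distance <= max_error ? distance : std::numeric_limits<int>::max();
}
```

For example, `"kitten"` and `"sitting"` have edit distance 3, so a bound of 3 yields 3, while a bound of 2 yields the maximum value.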
|
||
\attention Only the options seqan3::free_ends_none and seqan3::free_ends_first | ||
are supported for the aligned ends configuration with the edit distance. Using any other aligned ends configuration will | ||
disable the edit distance and fall back to the standard pairwise alignment and will not use the fast bitvector | ||
algorithm. | ||
|
||
\assignment{Assignment 6} | ||
|
||
Compute all pairwise alignments from the assignment 1 (only the scores). Only allow at most 7 errors and | ||
|
Review comment: Is there a way to inter-reference this with doxygen?