- Allow users to keep the original soft-mask with the new option `--keepOriginalMask`.
- Recommend the use of https://github.com/nf-core/pairgenomealign.
- Add a new `--lowmem` option to revert `-S2` to `-S1`.
- Paint poly-N regions in dotplots. These are usually contig boundaries.
- Experimental protein mode triggered when the seed is `PSEUDO`. Implementation may change in the future.
- Ignore failures of `last-train` instead of crashing. The failures are usually caused by the lack of similarity between target and query genomes, and it is better to let the other alignments proceed to the end.
- The suffix of the trained parameter files is changed from `.par` to `.train`.
- New experimental one-to-many mode (`--o2m`), which may be useful when all possible ways of alignment are needed. Note that the names of the output files may change in the future.
- Use the default `-D` value (total input length) for `last-train`. Keep `-D1e9` for `lastal`.
- Upgrade LAST to version 1541. This introduces seed-specific defaults for the maximum repeat unit length when the target genome is soft-masked by `lastdb`.
- Rename the `--seeding_scheme` option to just `--seed`, which is shorter and easier to remember.
- Rename the `--skip_m2m` (default `true`) option to just `--m2m` (default `false`).
- Rename the files and options to clearly indicate `m2m`, `m2o` and `o2o`.
- Output a text-formatted trace file to profile resource usage.
- Reduce the number of CPUs of `last-split` tasks to 2.
- Update to LAST 1522 to allow for `RY128` seeds.
- `--skip_m2m` now defaults to `true`.
- Add a `--one_to_one_only` option to prevent copying the `lastal` alignment to the results folder, thus saving disk space.
- Add a `--lastal_extra_args` option to pass `lastal` arguments that are not recognised by `last-train`.
- Change the suffix of the parameter file from `par` to `00.par` for better sorting of the file names.
- Stop providing a copy of the LAST index in the results folder.
- Index both strands to speed up computation (at the expense of memory usage).
- Add a `--last_split_mismap` option and revert the default to `1e-5`.
- Update LAST to version 1519 and `windowmasker` to version 2.15.0.
- Default to soft-mask lowercased letters (option `-c` of `lastdb`), and make the postmask step optional.
- Replace the `-E0.05` option (“Maximum expected alignments per square giga”) with `-D1e9` (“Report alignments that are expected by chance at most once per LENGTH query letters”) to match the tutorials more closely. Both options should have similar effects, but `-D` is easier to explain.
- New `--read_align` option to utilise the pipeline for mapping query reads to a target genome.
- New `--skip_m2m` option to skip the generation of the many-to-many alignment, which consumes a large amount of time and disk space.
- Change the `-m` value of `last-split` to the tool's own default (`-m1` at the moment) and add a new option `--last_split_args` to allow setting other values (such as `-m1e-5`, which was used previously).
- New `--targetName` option to include target genome names in the output files.
- Guess the index file name by searching for `prj` files and selecting the shortest base name. The previous method failed when the indexed genome was large enough to cause the generation of multiple `prj` (or `des`) files. Version `5.2.1` attempted to solve the problem but failed.
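
The "shortest base name" heuristic described in the entry above can be sketched as follows. This is only an illustration in Python, not the pipeline's actual code: the function name `guess_index_name`, the alphabetical tie-break, and the example file names (`genome.prj`, `genome0.prj`, `genome1.prj`) are assumptions made for the sketch.

```python
from pathlib import Path
import tempfile


def guess_index_name(index_dir):
    """Return the shortest base name among the *.prj files in index_dir.

    Hypothetical helper: it mirrors the heuristic described in the
    changelog entry, not the pipeline's real implementation.
    """
    # Path.stem strips only the final ".prj" suffix.
    base_names = [p.stem for p in Path(index_dir).glob("*.prj")]
    if not base_names:
        raise FileNotFoundError(f"no .prj file found in {index_dir}")
    # Ties on length are broken alphabetically (an assumption of this
    # sketch) so that the result is deterministic.
    return min(base_names, key=lambda name: (len(name), name))


# Usage example with made-up file names for a multi-volume index.
with tempfile.TemporaryDirectory() as tmp:
    for file_name in ("genome.prj", "genome0.prj", "genome1.prj", "genome0.des"):
        (Path(tmp) / file_name).touch()
    print(guess_index_name(tmp))  # prints "genome"
```

Under the assumed naming, the shortest base name wins because the per-volume files carry an extra suffix on top of the index name.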
- New `--dotplot_options` option to modify the dot plots. New default sort and orientation of the query genome (to match the alignment to the target genome). Query genome sequence names are now written horizontally.
- In the README's examples, reversed the role of the target and query sequences to better demonstrate the new dotplot defaults.
- New `--with_windowmasker` option to soft-mask the genome with the `windowmasker` tool of the BLAST suite.
- Move the postmask step to the end of the workflow, so that `last-split` has more information.
- Use the new `--reverse` option of `last-split` so that the use of `maf-swap` (and the files it generates) can be avoided.
- Pass `fMAF+` to the first call of `last-split` and `-m1e-5` to the second call, as in the upstream cookbook.
- Update LAST to version 1250.
- Use `YASS` as default seed and advertise `RY32` in the README.
- New options `--skip_dotplot_1`, `_2`, and `_3` to skip plots that are computationally expensive and not always useful.
- Correct version number in `nextflow.config` and brush up documentation.
- Optionally pass a single alignment parameter file with a new `--lastal_params` option. Doing so skips `last-train`.
- Re-implement the correction of 4.0.0 in a way that complies with nf-core, using a join operation.
- Important bug fix ensuring that the right trained parameter set is used with the right genome. (2c05b2de2da69864020fc4203f15d2fa14350d9c)
- Update LAST modules to the version accepted in nf-core.
- New `--query` option that saves the effort of creating a sample sheet when there is only one query genome.
- Force the score matrix to be symmetric (pass `--revsym` to `last-train`).
- Allow passing common arguments to `last-train` and `lastal` with the `--lastal_args` option, defaulting to `-E0.05 -C2`.
- Correct a bug that caused index names to not be detected properly when seeding schemes such as `MAM8` are used.
- New `--seeding_scheme` option that defaults to `NEAR`. In previous versions the `lastdb` command did not receive a seeding parameter and defaulted to `YASS`.
- Solve a channel bug that prevented processing more than one sample.
- Add dotplots.
- Set computation resources for process labels.
- Initial version.