TAMA GO: Sequence Cleanup
This set of tools in TAMA-GO is used to clean up sequences. Currently it contains only one tool; more will be added later.
tama_flnc_polya_cleanup.py
To remove poly-A tail sequences from FLNC fasta files, use tama_flnc_polya_cleanup.py. The tool removes the poly-A tails left in the FLNC fasta files after running IsoSeq3 Refine without the "--require-polya" parameter. If your Iso-Seq data were generated from cDNA libraries prepared with the Teloprime kit, you should not use the "--require-polya" parameter: it will remove many reads due to an issue with the Teloprime 3' primer sequence and the way LIMA works. Instead, run Refine with default settings and then clean up the remaining poly-A tails using this tool.
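The exact trimming rules used by tama_flnc_polya_cleanup.py are not described on this page. As a rough illustration of the general idea only, the sketch below strips a trailing run of A's from a single sequence. The function name trim_polya_tail and the min_tail parameter are hypothetical and are not taken from TAMA.

```python
def trim_polya_tail(seq, min_tail=1):
    """Strip a trailing run of A's from seq.

    Illustrative sketch only: this is NOT the algorithm used by
    tama_flnc_polya_cleanup.py. The name and the min_tail threshold
    are assumptions made for this example.

    Returns a tuple (trimmed_sequence, number_of_bases_removed).
    """
    # rstrip removes every trailing 'A' or 'a' and stops at the first other base
    stripped = seq.rstrip("Aa")
    tail_len = len(seq) - len(stripped)
    # Leave the sequence untouched if the tail is shorter than the threshold
    if tail_len < min_tail:
        return seq, 0
    return stripped, tail_len


if __name__ == "__main__":
    trimmed, removed = trim_polya_tail("ACGTACGTAAAAAA")
    print(trimmed, removed)
```

A real cleanup tool may also need to handle sequencing errors inside the tail, which this simple suffix strip does not attempt.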
To convert the FLNC BAM file into a fasta file, you can use this command:
bamtools convert -format fasta -in bam_file > fasta_file
Note: bamtools is a separate tool and is not part of TAMA.
usage: tama_flnc_polya_cleanup.py [-h] [-f F] [-p P]
optional arguments:
-h, --help  show this help message and exit
-f F        FLNC fasta file
-p P        Prefix for output file
Default command would look like this:
python tama_flnc_polya_cleanup.py -f flnc.fa -p prefix
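If the cleanup step is run from a Python pipeline rather than typed by hand, the command above could be wrapped as in the following minimal sketch. The helper names build_cleanup_command and run_cleanup are hypothetical. The sketch assumes that "python" on the PATH is the interpreter TAMA is meant to run under and that the script path is given relative to the working directory. The output file names follow the prefix.fa and prefix_polya_flnc_report.txt pattern documented on this page.

```python
import subprocess


def build_cleanup_command(flnc_fasta, prefix,
                          script="tama_flnc_polya_cleanup.py",
                          python="python"):
    """Build the argument list for the cleanup command (hypothetical helper)."""
    return [python, script, "-f", flnc_fasta, "-p", prefix]


def run_cleanup(flnc_fasta, prefix, **kwargs):
    """Run the cleanup tool and return the two expected output paths.

    check=True makes subprocess.run raise CalledProcessError if the
    tool exits with a non-zero status.
    """
    cmd = build_cleanup_command(flnc_fasta, prefix, **kwargs)
    subprocess.run(cmd, check=True)
    return prefix + ".fa", prefix + "_polya_flnc_report.txt"
```

For example, run_cleanup("flnc.fa", "prefix") would execute the default command shown above and return ("prefix.fa", "prefix_polya_flnc_report.txt").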
Detailed explanation of arguments:
-f F
The FLNC fasta file is the output from running IsoSeq3 Refine and then the BAM to Fasta conversion.
-p P
This is the prefix used for the file naming of all the output files.
Outputs:
prefix.fa
prefix_polya_flnc_report.txt
Detailed explanation:
prefix.fa
This is the cleaned up FLNC fasta file.
prefix_polya_flnc_report.txt
This is a report file showing a table of the number of sequences with different counts of poly-A's.
polya_num  polya_num_count
0          40676
1          46986
2          63718
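To get a quick overview of a report like the one above, the table can be loaded and converted to fractions of the total. This is a minimal sketch that assumes the report is a whitespace-delimited (tab or space) two-column table with a single header line, as the example suggests. The function names read_polya_report and polya_fractions are hypothetical and are not part of TAMA.

```python
def read_polya_report(path):
    """Parse a two-column report into {polya_num: polya_num_count}.

    Assumes whitespace-delimited columns. Lines that are not made up of
    exactly two integer fields (such as the header) are skipped.
    """
    counts = {}
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) != 2 or not all(f.isdigit() for f in fields):
                continue  # header, blank, or unexpected line
            counts[int(fields[0])] = int(fields[1])
    return counts


def polya_fractions(counts):
    """Return each polya_num count as a fraction of all sequences."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {num: count / float(total) for num, count in counts.items()}
```

With the example table above, the counts sum to 151380 sequences, and the polya_num 0 row corresponds to roughly 27% of them.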