I am very sorry if my question is wrong/incorrect, maybe someone can just point me in the right direction. I am using cutadapt for trimming some PCR primers from my sequencing results. Our samples look something like this:
CGTCCATAGCGCAAATCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCTTCACTGGGCTTGTCA
Using the primers you see, I add on Illumina sequencing adaptors, followed by Indexing/Universal Primers for sequencing. I then sequence using paired-end reads of 151 nt. Correspondingly, I have 2 fastq files, containing the R1 and R2 reads. I now want to cut the specific primers you see here from the R1 read, as well as use the paired-end trimming to make sure that each sequence got read twice during sequencing. I am using Cutadapt 4.3 on Ubuntu.
The primers I want to trim:
5': CGTCCATAGCGCAAATC
3': CTTCACTGGGCTTGTCA
The command I think I have to use:
cutadapt -g CGTCCATAGCGCAAATC -G TGACAAGCCCAGTGAAG -a CTTCACTGGGCTTGTCA -A GATTTGCGCTATGGACG --pair-filter=both -o 1.fastq -p 2.fastq Input1 Input2
For some reason, my output for R1 always looks like this:
CGTCCATAGCGCAAATCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
and output of R2 like this:
TGACAAGCCCAGTGAAGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
Even if I try to only do single-end trimming with one of the files, I still get these results (for example only input R1 with the same -a and -g flags as above). I also tried this:
cutadapt -a CTTCACTGGGCTTGTCA -A GATTTGCGCTATGGACG --pair-filter=both -o 1.fastq -p 2.fastq Input1 Input2
which yields a similar result to the command that contains the -g and -G flags. Can anyone help me solve this? I am confident I could technically solve this by just manually deleting the leading 17 nt of each read, but I feel like there must be a way to do this with cutadapt.
Best,
Corbin
You don’t need the --pair-filter option. This is only relevant when you use an option that filters the reads such as --discard-untrimmed. And if you use --discard-untrimmed, you should not use --pair-filter=both, but leave it at the default (which is the same as --pair-filter=any) because you want the entire pair to be discarded if any of the two reads was untrimmed (that is, the pair is only kept if both primers were found).
Happy to help further if the above didn’t help, but please read that section first.
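For illustration, here is a minimal sketch of how this advice could be combined with Cutadapt's linked-adapter syntax (`-a ADAPTER1...ADAPTER2`), which removes a 5' and a 3' primer from the same read in one pass. Whether this is exactly what the referenced section recommends is an assumption. The primers and file names come from the question; the `revcomp` helper and the variable names are made up for this example, and the script only prints the command instead of running it.

```shell
#!/usr/bin/env bash
# Primers as they appear in R1 (5'->3'), taken from the question.
fwd=CGTCCATAGCGCAAATC        # start of R1
end_site=CTTCACTGGGCTTGTCA   # 3' end of the amplicon as seen in R1

# Hypothetical helper: reverse the string, then swap A<->T and C<->G.
revcomp() { printf '%s\n' "$1" | rev | tr ACGT TGCA; }

fwd_rc=$(revcomp "$fwd")            # expected at the 3' end of R2
end_site_rc=$(revcomp "$end_site")  # expected at the start of R2

# No --pair-filter: with --discard-untrimmed the default (any) already
# drops the whole pair when either read is untrimmed.
cmd="cutadapt -a ${fwd}...${end_site} -A ${end_site_rc}...${fwd_rc} --discard-untrimmed -o 1.fastq -p 2.fastq Input1 Input2"

# Print the command for inspection rather than executing it.
echo "$cmd"
```

The two computed reverse complements match the sequences already passed to `-A` and `-G` in the original command, so only the way the primers are combined changes here.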
Thank you so much. I was only searching in the User guide. This (kind of) worked. For some reason, specifically my R2 reads have more errors in the PCR sequence, so they are not as cleanly removed as the R1 read primers. Still, I was able to remove most of them from both reads using -e 0.2 in addition to the method described in the link above.
Once again, thank you for linking me to the correct documentation, this solved my problem (and finally stopped me from trimming each read individually, twice)
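To put the `-e 0.2` setting in perspective, here is a rough sketch of the arithmetic, assuming that Cutadapt allows floor(error rate × adapter length) errors for a full-length match and that the default rate is 0.1. The `max_errors` function is a hypothetical helper written for this example, not part of Cutadapt.

```shell
#!/usr/bin/env bash
# Hypothetical helper: errors allowed for a full-length primer match,
# computed as floor(primer_length * error_rate).
max_errors() { awk -v len="$1" -v e="$2" 'BEGIN { print int(len * e) }'; }

primer_len=17   # both primers in this thread are 17 nt long

for rate in 0.1 0.2; do
  echo "-e $rate: up to $(max_errors "$primer_len" "$rate") error(s) in a $primer_len nt primer"
done
```

Under these assumptions, the default tolerates a single error in a 17 nt primer, while `-e 0.2` tolerates three. That would explain why more of the error-prone R2 primers were found and removed after raising the rate.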