Trimming Suggestion for 16S V3 #1791

Closed
Jiayonglai123 opened this issue Aug 2, 2023 · 4 comments

Comments

@Jiayonglai123

Hi, I'm new to the dada2 pipeline.
My samples are 16S V3 reads from an Illumina MiSeq 100, in CASAVA single-end format, so no merging is required.
Primers were removed with cutadapt before the reads went into dada2 for quality checking and trimming.
I read through some of the issues posted on this GitHub. Could I get some suggestions on my results?

  1. Should I choose the truncation length separately for each sample, or find one that works best for all of them?
  2. What do the black/grey areas in the plot mean?
    image
  3. I tried truncating at 160 bp, but the error rates do not look good. Should I try a lower value (around 135 bp)?
  4. For plotQualityProfile(), is there something like an interactive plotly version, so that I can pick the best cut point for all samples?
    Thanks in advance.
@benjjneb
Owner

benjjneb commented Aug 3, 2023

  1. Same for all
  2. It is the heatmap of the quality scores. The plots you posted strongly suggest binned quality scores. How sure are you that this sequencing was done on a MiSeq (vs. a MiniSeq, NovaSeq, etc.)?
  3. truncLen will throw away all reads that don't reach the truncation length, and your pre-processing has left a majority of reads shorter than 160 nt (see the red line, the cumulative distribution of read lengths). Given your preprocessing, only a truncLen shorter than the first drop, around 135, seems appropriate.
  4. We don't have interactive plots implemented in the package.
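
To make point 3 concrete, here is a minimal sketch of the truncLen rule described above: reads shorter than the truncation length are discarded, and the rest are cut to that length. This is plain Python for illustration, not dada2 code. The function names and the read lengths are made up, so treat it as a way to reason about the red cumulative line, not as a replacement for the real filtering step.

```python
def fraction_retained(read_lengths, trunc_len):
    """Fraction of reads that are at least trunc_len long.

    Reads shorter than trunc_len would be thrown away by the truncation rule.
    """
    if not read_lengths:
        return 0.0
    kept = sum(1 for n in read_lengths if n >= trunc_len)
    return kept / len(read_lengths)


def truncate_reads(reads, trunc_len):
    """Drop reads shorter than trunc_len and cut the remaining ones to trunc_len."""
    return [r[:trunc_len] for r in reads if len(r) >= trunc_len]


# Hypothetical read lengths after primer removal (illustrative numbers only).
lengths = [135] * 6 + [150] * 3 + [160] * 1

for candidate in (135, 150, 160):
    print(candidate, fraction_retained(lengths, candidate))
```

With these made-up lengths, a cutoff of 160 keeps only one read in ten, while 135 keeps them all. That is the same trade-off the cumulative read-length line in the quality profile shows.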

@Jiayonglai123
Author

Jiayonglai123 commented Aug 4, 2023

Thanks for your prompt reply and good insights.

  1. About the heatmap: does it mean that the darker the map is, the higher the quality of the reads I got?
  2. I just checked with my sequencing provider: it's Illumina's iSeq 100 (thanks for pointing out my mistake).
  3. I really appreciate your time and effort in answering all the questions in the forum so quickly.
  4. I just tried the error rate estimation. While it is not as nice as shown in the tutorial, with a truncation length of 135 there are fewer points falling out at the bottom of the plot. Do you think it is OK to proceed to the next step with this plot?
    image

@benjjneb
Owner

benjjneb commented Aug 4, 2023

> the darker the map is the higher the quality of reads I got?

The darker the cell in the heatmap, the more reads had that quality at that position.
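
To illustrate what a heatmap cell counts, here is a small Python sketch that tallies how many reads have each quality score at each position. It assumes Phred+33 encoded quality strings. The function name and the three quality strings are hypothetical, and this is not how the package itself draws the plot.

```python
from collections import Counter


def quality_heatmap_counts(quality_strings, offset=33):
    """Count reads per (position, quality score) cell.

    Assumes Phred+offset ASCII encoding (offset=33 is an assumption here).
    Positions are 1-based.
    """
    counts = Counter()
    for qual in quality_strings:
        for pos, ch in enumerate(qual, start=1):
            counts[(pos, ord(ch) - offset)] += 1
    return counts


# Hypothetical quality strings: 'I' = Q40, 'F' = Q37, '#' = Q2 under Phred+33.
quals = ["IIF", "IIF", "I#F"]
counts = quality_heatmap_counts(quals)

# A larger count means a darker cell: it says how many reads had that score
# at that position, not how good the score is.
print(counts[(2, 40)], counts[(2, 2)])

# Binned quality scores show up as only a handful of distinct values.
print(sorted({q for _, q in counts}))
```

In this toy input, position 2 has two reads at Q40 and one at Q2, so the Q40 cell would be drawn darker simply because more reads fall into it.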

> I just tried out the error rate estimation, while it is not as nice as shown in the tutorial, with truncate of 135 it has less error out at the bottom of the plot, do you think it is good to proceed with the next step with this plot?

You are OK to proceed. The weird quality score fitting with binned Q scores is a known issue, but doesn't seem to affect operation of dada2 much. You can read more: #791

@Jiayonglai123
Author

Thanks for your help and support.
Sincere gratitude for your prompt reply.
