When running Porechop, I got unexpected output from the barcode identification step, which uses a sample of 10k reads.
It seems to take the first 10k reads. However, I had concatenated the Albacore output after demultiplexing, so the reads were ordered barcode01 -> barcode12 -> unclassified.
So the first 10k reads were all barcode01.
When I shuffled the file and ran Porechop on the shuffled version, it detected all the correct barcodes (better than Albacore, I might add) and seems to be running smoothly.
A feature request would be to modify the barcode detection function to randomly sample the ingested FASTQ. Otherwise, a note in the docs would do :)
Cheers.
Yes, this one could be solved either by specifying barcodes (#42) or by randomly subsampling the input reads.
In the meantime, don't forget that if you give Porechop a directory as input, it will look for all read files in that directory, and then it samples from each of them to avoid this issue. And as a bonus, if the directory looks like an Albacore directory with demultiplexing, Porechop will note the Albacore barcode and put reads in the 'none' bin if it and Albacore disagree. I find this useful for reducing mis-binned reads.
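For anyone curious what the random-subsampling option might look like, here is a minimal sketch using reservoir sampling, which draws a uniform sample in one pass without loading the whole file. This is not Porechop's code: the function names `read_fastq` and `reservoir_sample` are made up for illustration, and it assumes plain (uncompressed) FASTQ with exactly four lines per record.

```python
import random
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as 4-line tuples (assumes unwrapped 4-line records)."""
    while True:
        record = tuple(islice(handle, 4))
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        yield record


def reservoir_sample(records, k, seed=None):
    """Return up to k records chosen uniformly at random in a single pass."""
    rng = random.Random(seed)
    sample = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Keep the new record with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                sample[j] = record
    return sample
```

Usage would be something like `reservoir_sample(read_fastq(open("reads.fastq")), 10000)`, where `reads.fastq` is a placeholder file name. The sample comes from the whole file, so a file sorted by barcode would no longer yield only barcode01 reads.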
Ahh thanks for that.
I was trying to compare the two algorithms independently, without either one being aware of the other's results.
So probably a low priority fix :) My shuffle script fixes the issue for now.
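For reference, the shuffle workaround can be sketched in a few lines of Python 3. This is an illustration rather than the script mentioned in this thread: the function name `shuffle_fastq` and its parameters are assumptions, it expects uncompressed FASTQ with four lines per record, and it holds all reads in memory, so it only suits files that fit in RAM.

```python
import random
from itertools import islice


def shuffle_fastq(in_path, out_path, seed=None):
    """Write the records of in_path to out_path in random order; return the count."""
    records = []
    with open(in_path) as handle:
        while True:
            record = list(islice(handle, 4))
            if not record:
                break
            if len(record) < 4:
                raise ValueError("truncated FASTQ record")
            # Make sure every line ends with a newline so that a final record
            # without one is not glued to the next record after shuffling.
            records.append([l if l.endswith("\n") else l + "\n" for l in record])
    random.Random(seed).shuffle(records)
    with open(out_path, "w") as out:
        for record in records:
            out.writelines(record)
    return len(records)
```

Calling `shuffle_fastq("all_reads.fastq", "shuffled.fastq")` (placeholder file names) would produce a file whose first 10k reads are a mix of all barcodes rather than barcode01 only.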
I wrote a quick script to shuffle a FASTQ file (Python 2.7-ish), shuffle_fastq.py.
See here: https://github.com/Psy-Fer/bioinf_tools