High NDR values when using a specific annotation #459

Open
N-Hoffmann opened this issue Jan 24, 2025 · 1 comment
Comments

@N-Hoffmann

Hello Bambu team, I hope that you are doing well in Singapore :)

I am currently working on a Nextflow pipeline that uses Bambu. Our goal is to extend user-provided reference annotations with novel genes and transcripts identified by Bambu. In one of the pipeline steps, we filter novel Bambu transcripts by their NDR; by default, we keep only transcripts with an NDR < 0.2.
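For readers following along, the filtering step described above could be sketched roughly as follows. This is only an illustration, not the pipeline's actual code: it assumes the novel transcripts have been exported to a tab-separated table with hypothetical columns `transcript_id` and `NDR`, and that rows without an NDR are marked `NA` or left empty.

```python
import csv
import io


def filter_novel_by_ndr(tsv_text, threshold=0.2):
    """Return IDs of transcripts whose NDR is strictly below the threshold.

    Assumes a tab-separated table with hypothetical columns
    'transcript_id' and 'NDR'; this layout is an assumption, not Bambu's format.
    """
    kept = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        ndr = row["NDR"]
        # Skip rows with no NDR recorded (assumed convention: empty or "NA").
        if ndr in ("", "NA"):
            continue
        if float(ndr) < threshold:
            kept.append(row["transcript_id"])
    return kept


# Made-up example rows, for illustration only.
example = "transcript_id\tNDR\ntx1\t0.0022\ntx2\t0.51\ntx3\tNA\ntx4\t0.2\n"
print(filter_novel_by_ndr(example))
```

With a strict `<` comparison, a transcript whose NDR is exactly 0.2 is dropped; whether the boundary should be inclusive is a design choice for the pipeline.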

To test the pipeline with our Nanopore data (8 canine samples and 2 human samples), I am running it with different genome assemblies (human and canine) and different reference annotations for each genome. The pipeline runs smoothly for every genome and annotation combination except one. In that combination, I use a canine Ensembl annotation (https://ftp.ensembl.org/pub/release-113/gtf/canis_lupus_familiarisgsd/Canis_lupus_familiarisgsd.UU_Cfam_GSD_1.0.113.chr.gtf.gz), and the lowest NDR I get is 0.51 (out of 198027 transcripts). Using the same reference genome with a RefSeq annotation, the lowest NDR I get is 0.0022, and 1583 of 177385 transcripts have an NDR below 0.2. Every other combination yields transcripts with an NDR below this threshold.

We checked the annotation file itself but didn't find anything unusual. I also ran my pipeline with two Ensembl annotations for other genome assembly versions, and it works as expected. I remapped my samples to be sure, but that didn't change the NDR values. I also checked the Bambu output, and the run itself completes fine. I am using Bambu v3.4.1.

I was wondering if you had an idea why I get these higher NDR values for this specific annotation?
Thanks for any help!
Nicolaï

@cying111
Collaborator

Hi @N-Hoffmann,

Thanks for reporting the issue, and sorry for the late reply.

I had a look at the annotations, and they all look good to me. The results do seem a bit suspicious, but this can sometimes happen when reads are of low quality or low depth. I know you have checked already, but what are the number of reads and the mapping quality for the sample aligned with this particular annotation? When you say it runs fine with RefSeq, did you use the same sample?
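As a rough illustration of the check asked about above, the read count and mean mapping quality can be summarised from SAM-formatted text (for example, the text output of `samtools view`). This is a minimal sketch, not part of Bambu: the function name and the example records are made up, and it counts only primary mapped alignments, using the standard SAM flag bits for unmapped (0x4), secondary (0x100), and supplementary (0x800) records.

```python
def alignment_summary(sam_lines):
    """Count primary mapped alignments and compute their mean MAPQ.

    Expects an iterable of SAM text lines; header lines ('@...') are skipped.
    """
    n_primary = 0
    mapq_sum = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2: bitwise FLAG
        mapq = int(fields[4])  # column 5: MAPQ
        # Skip unmapped, secondary, and supplementary records.
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        n_primary += 1
        mapq_sum += mapq
    mean_mapq = mapq_sum / n_primary if n_primary else None
    return {"primary_mapped_reads": n_primary, "mean_mapq": mean_mapq}


# Made-up records, for illustration only.
example_sam = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "r2\t16\tchr1\t200\t20\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t*",
    "r1\t2048\tchr2\t300\t60\t4M\t*\t0\t0\tACGT\t*",
]
print(alignment_summary(example_sam))
```

Comparing these two numbers between the Ensembl and RefSeq runs for the same sample would show whether the input alignments differ at all.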

If you don't mind, you can also send me your data (bam file, genome fasta) for this particular sample, and I will investigate further for you.

Thank you
Warm regards,
Ying
