The isoform classification is not accurate for Genic Intron and Intergenic #255
Comments
Hi @dolittle007, Could you provide an example of this bug? Or highlight/do a pull request of the part of the code that is causing this bug? Thanks, |
Example:
Hi @alexpan00, I am providing the example data in the example.zip file.
According to the Thanks a lot. |
Hi @dolittle007, Thanks for your detailed example. I am looking at this right now and will let you know as soon as possible. Alejandro |
Hi @dolittle007, We have identified the cause of the problem and are working on a fix. However, the third isoform in your example would still be classified as genic intron, because it does not overlap any exon of the transcript on the other strand. Overlapping such an exon is required for an isoform to be classified as antisense. Thank you, |
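To make the rule in the comment above concrete, here is a minimal sketch of the decision it describes. This is not the code in sqanti3_qc.py: the `Transcript` class and the `overlaps` and `classify_without_same_strand_hit` helpers are hypothetical names invented for illustration, and the sketch assumes the isoform has already been found to overlap no exon on its own strand.

```python
from dataclasses import dataclass
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based and inclusive as in GTF


@dataclass
class Transcript:
    """Hypothetical minimal transcript model (not a SQANTI3 class)."""
    strand: str
    exons: List[Interval]

    @property
    def span(self) -> Interval:
        # Genomic span from the first exon start to the last exon end.
        return (min(s for s, _ in self.exons), max(e for _, e in self.exons))


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two inclusive intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def classify_without_same_strand_hit(isoform: Transcript,
                                     references: List[Transcript]) -> str:
    """Simplified decision for an isoform with no same-strand exon overlap.

    Assumption: same-strand categories were ruled out before this call.
    """
    # Antisense requires overlapping an exon of a transcript on the other strand.
    for ref in references:
        if ref.strand != isoform.strand and any(
            overlaps(x, y) for x in isoform.exons for y in ref.exons
        ):
            return "antisense"

    # Genic intron: inside a reference transcript's span, touching none of its exons.
    iso_start, iso_end = isoform.span
    for ref in references:
        ref_start, ref_end = ref.span
        contained = ref_start <= iso_start and iso_end <= ref_end
        touches_exon = any(
            overlaps(x, y) for x in isoform.exons for y in ref.exons
        )
        if contained and not touches_exon:
            return "genic_intron"

    return "intergenic"


# Usage: one reference on the minus strand with a single intron (201-499).
reference = [Transcript("-", [(100, 200), (500, 600)])]
in_intron = Transcript("+", [(250, 300), (350, 400)])
over_exon = Transcript("+", [(150, 260)])
far_away = Transcript("+", [(1000, 1100)])
results = [classify_without_same_strand_hit(t, reference)
           for t in (in_intron, over_exon, far_away)]
```

Under these assumptions, an isoform sitting entirely inside the intron of an opposite-strand transcript comes out as genic intron rather than antisense, which matches the behaviour described for the third isoform.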
Hi @alexpan00, |
Hi, @alexpan00 |
Hi @Tang-pro Is the reference shorter than 200bp? Alejandro |
Hi, @alexpan00 |
Hi @Tang-pro In the example you are showing, the reference transcript seems to be quite short, probably under 200 bp, so it is discarded while parsing the reference annotation under the previous --min_ref_len default of 200. If the issue persists, please open a new issue. |
Hi,@alexpan00 |
Sorry, @Tang-pro, but I am not sure I fully understood your answer. Are you saying that some isoforms in your gtf overlap a reference transcript longer than 200 bp and are still classified as intergenic? If that is the case, please open a new issue and provide some examples (in gtf format if possible). The 200 bp cutoff was introduced to avoid hits with families of small RNAs. After receiving a couple of issues reporting isoforms incorrectly classified as intergenic (when the reference is shorter than 200 bp), we decided it would be better to set the default value to 0 and give the user full control of the cutoff. |
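As an illustration of why a length cutoff can turn an overlapping isoform into an intergenic one, here is a small sketch. It does not reproduce SQANTI3's parser: `transcript_length`, `filter_reference` and `has_reference_hit` are hypothetical helpers, the coordinates are made up, and whether the real comparison is strict or inclusive is an assumption.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based and inclusive as in GTF


def transcript_length(exons: List[Interval]) -> int:
    """Spliced length: sum of exon lengths."""
    return sum(end - start + 1 for start, end in exons)


def filter_reference(refs: Dict[str, List[Interval]],
                     min_ref_len: int = 0) -> Dict[str, List[Interval]]:
    """Keep reference transcripts at least min_ref_len long (assumed >=)."""
    return {name: exons for name, exons in refs.items()
            if transcript_length(exons) >= min_ref_len}


def has_reference_hit(isoform: List[Interval],
                      refs: Dict[str, List[Interval]]) -> bool:
    """True if any isoform exon overlaps any exon of a kept reference."""
    return any(
        a_start <= b_end and b_start <= a_end
        for a_start, a_end in isoform
        for exons in refs.values()
        for b_start, b_end in exons
    )


# Made-up reference: one 151 bp transcript and one 501 bp transcript.
reference = {"short_ref": [(100, 250)], "long_ref": [(1000, 1500)]}
isoform = [(120, 240)]  # overlaps only short_ref

# Old default of 200: short_ref is dropped, so the isoform has no hit left.
hit_with_200 = has_reference_hit(isoform, filter_reference(reference, 200))
# New default of 0: short_ref is kept and the overlap is found.
hit_with_0 = has_reference_hit(isoform, filter_reference(reference, 0))
```

With the cutoff at 200 the only overlapping reference is removed before classification, so the isoform looks intergenic; with the cutoff at 0 the overlap is detected.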
Hi, @alexpan00 |
Fixed a bug on classification of genic_intron (issue #255)
This genic intron issue has been fixed in the latest commit, #283 |
Hi, I want to ask whether this bug is fixed in the latest SQANTI3 version, 5.2.1. I have updated from 5.2 to 5.2.1, but the bug still persists. |
Hi, Alejandro |
Hi @MoMoTPark, This change has not been included in a release yet. You will need to clone the repo to get it. Sorry for the inconvenience, |
Hi Alejandro, I see, thank you very much for the prompt reply. I'll do that then. Many thanks, |
Hi SQANTI3 developers,
I found that SQANTI3 often misclassifies genic intron isoforms as intergenic. Could you please fix this issue?
SQANTI3/sqanti3_qc.py
Lines 1335 to 1381 in a9b6604
Thanks a lot.