Small number of GAF annotations have wrong aspect (BP, MF, CC) #288
Comments
Thanks. Yes, where should I report changes in annotation format or other anomalies seen while running the GOATOOLS test suite?
The helpdesk tracker is a good place to add those issues, we can transfer them to the right tracker as appropriate.
Thanks, Pascale
Sorry, my comment was about the obsolete terms. @kltm Can we add a rule to repair bad ontology aspects in the GAF files we export?
@pgaudet That's a possibility, and possibly easy, assuming that we have the closures already on hand. Tagging @dougli1sqrd to see if I'm correct.
We actually have a Repair rule in place already GORULE:0000028, and this is running in our pipeline performing repairs as of May 2019. I found a place where this rule is working, even in the release you're referencing, @dvklopfenstein, here: http://release.geneontology.org/2019-07-01/reports/gramene_oryza-report.html#gorule-0000028. (Allow a few moments for the file to fully load, as that report is large.) So there's a minor mystery as to why it wouldn't be repairing in this case. The pipeline is looking at the
@kltm We do have a function in ontobio as well that can compute the aspect of a term from its place in the ontology by computing the ancestor closure, but it's not being used, in favor of the metadata strategy above.
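To illustrate the ancestor-closure idea mentioned here, the following is a minimal sketch in plain Python. It does not use ontobio's actual function; `aspect_from_ancestors` and `TOY_PARENTS` are hypothetical names, and the parent map is a hand-written toy for illustration rather than something read from go-basic.obo. The only fixed facts it relies on are the three GO root terms.

```python
# The three GO root terms and the GAF aspect letter each one implies.
ROOT_ASPECT = {
    "GO:0008150": "P",  # biological_process
    "GO:0003674": "F",  # molecular_function
    "GO:0005575": "C",  # cellular_component
}


def aspect_from_ancestors(term_id, parents):
    """Walk up the parent links from term_id and return the aspect letter
    of the single root it reaches. `parents` maps a term ID to its direct
    parent IDs (hypothetical input shape for this sketch)."""
    seen = set()
    stack = [term_id]
    found = set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in ROOT_ASPECT:
            found.add(ROOT_ASPECT[current])
        stack.extend(parents.get(current, ()))
    if len(found) != 1:
        raise ValueError(f"{term_id} reaches {len(found)} roots, expected 1")
    return found.pop()


# Hand-written toy parent map (assumption, for illustration only).
TOY_PARENTS = {
    "GO:0030247": ["GO:0030246"],
    "GO:0030246": ["GO:0005488"],
    "GO:0005488": ["GO:0003674"],
}

print(aspect_from_ancestors("GO:0030247", TOY_PARENTS))
```

The trade-off versus the metadata strategy is that this needs the ontology graph loaded, while reading the `namespace` tag only needs the term record itself.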
Is this repair taking place at the parsing step? Because I think the problem is with the 'predictions' files (for goa_human-prediction.gaf); are those re-parsed and repaired? My understanding was that the 'prediction' software was passing on the original GO aspect rather than looking up the new one. Thanks, Pascale
@pgaudet oh interesting, okay. Yeah, I'm less familiar with the predictions process, as owltools does that still. But no, they are not re-parsed and repaired. So in the original comment above, the listing of incorrect aspect based on file, those are actually the prediction versions of those files?
Actually maybe that's not right. I am not sure we export the predictions?
Plus P01903 has an IDA to polysaccharide binding, so it would not be a prediction. (I think)
It seems a bit suspicious that gorule-0000028 has 0 errors: We need to start implementing the examples again to make sure the rules are working. Pascale
It hasn't always had zero errors. Previous releases for gramene_oryza had gorule-0000028 errors.
Since these are GOA GAFs, maybe this is an EBI issue @alexsign ?
@ValWood Hi Val, can you please give me a real example you see anywhere in GOA files? We do not store aspect information in our database at all. Each annotation gets assigned its aspect (F, P or C) from the GO term itself on the unload. We can only have them all right or all wrong.
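The approach described here, deriving the aspect from the GO term at export time rather than storing it, can be sketched as below. This is not GOA's or the GO pipeline's actual code; `assign_aspect` and the `term_namespace` mapping are hypothetical, and the example row is synthetic. It assumes the standard GAF layout, where column 5 is the GO ID and column 9 is the aspect.

```python
NAMESPACE_TO_ASPECT = {
    "biological_process": "P",
    "molecular_function": "F",
    "cellular_component": "C",
}


def assign_aspect(gaf_line, term_namespace):
    """Return the GAF line with column 9 (aspect) set from the namespace
    of the GO term in column 5. Comment lines, short lines, and terms
    missing from term_namespace are returned unchanged."""
    line = gaf_line.rstrip("\n")
    if line.startswith("!"):
        return line
    cols = line.split("\t")
    if len(cols) < 9:
        return line
    namespace = term_namespace.get(cols[4])
    if namespace is None:
        return line  # unknown or obsolete term: leave for a separate check
    cols[8] = NAMESPACE_TO_ASPECT[namespace]
    return "\t".join(cols)


# Synthetic row for illustration: GO:0030247 wrongly marked as P.
row = "\t".join([
    "UniProtKB", "X00001", "SYM1", "", "GO:0030247", "PMID:0", "IDA", "",
    "P", "", "", "protein", "taxon:9913", "20190701", "EXAMPLE", "", "",
])
fixed = assign_aspect(row, {"GO:0030247": "molecular_function"})
print(fixed.split("\t")[8])
```

Because the aspect is looked up per term, every annotation to a given term ends up with the same aspect, which matches the "all right or all wrong" behavior described above.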
OK - I don't know what happened in the 2019-07 release. But in the current release, at least the example here Q30309
In any case, this is not an issue as of today. @dvklopfenstein hopefully you can use a newer version of the GO data. Thanks, Pascale
@alexsign I think this is out of date - whatever bug seems fixed.
OK maybe I'm wrong. It seemed odd that some of the annotations in the txt file listed above originate from GOA. This means, I assume, that the namespace must get munged later?
Thank you for GO and the annotations. It is crucial to be able to write scripts to manage gene products based on GO.
I am seeing a few annotations in the GAF files which have incorrect aspects (biological_process, molecular_function, and cellular_component).
For example, I only see one mismatch in goa_cow.gaf, Date Generated by GOC: 2019-07-01, on line 106,182 where the aspect for GO:0030247 is P, meaning biological_process, but the namespace in go-basic.obo (data-version: releases/2019-07-01) is molecular_function:
The attached file shows the other annotations with mismarked namespaces. The table below shows the quantity of mismatches per file.
namespace_errors.txt
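A check of the kind described in this report can be sketched as follows. This is not the GOATOOLS test itself; `read_obo_namespaces` and `find_aspect_mismatches` are hypothetical helper names, and the sample OBO and GAF text is synthetic, built around the GO:0030247 case above. It assumes the standard GAF layout (column 5 is the GO ID, column 9 is the aspect) and OBO `[Term]` stanzas with `id:` and `namespace:` tags.

```python
ASPECT_TO_NAMESPACE = {
    "P": "biological_process",
    "F": "molecular_function",
    "C": "cellular_component",
}


def read_obo_namespaces(lines):
    """Map each GO ID to its namespace, reading only [Term] stanzas."""
    namespaces = {}
    in_term = False
    term_id = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):
            in_term = line == "[Term]"
            term_id = None
        elif in_term and line.startswith("id: "):
            term_id = line[len("id: "):]
        elif in_term and term_id and line.startswith("namespace: "):
            namespaces[term_id] = line[len("namespace: "):]
    return namespaces


def find_aspect_mismatches(gaf_lines, namespaces):
    """Return (line_number, go_id, gaf_aspect, obo_namespace) for every
    annotation whose aspect disagrees with the ontology namespace."""
    mismatches = []
    for lineno, raw in enumerate(gaf_lines, start=1):
        if raw.startswith("!") or not raw.strip():
            continue
        cols = raw.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue
        go_id, aspect = cols[4], cols[8]
        expected = namespaces.get(go_id)
        if expected is None:
            continue  # term not in the ontology file; not an aspect error
        if ASPECT_TO_NAMESPACE.get(aspect) != expected:
            mismatches.append((lineno, go_id, aspect, expected))
    return mismatches


# Synthetic sample data for illustration.
OBO_SAMPLE = """[Term]
id: GO:0030247
namespace: molecular_function

[Term]
id: GO:0008150
namespace: biological_process
"""


def make_row(go_id, aspect):
    return "\t".join([
        "UniProtKB", "X00001", "SYM1", "", go_id, "PMID:0", "IDA", "",
        aspect, "", "", "protein", "taxon:9913", "20190701", "EXAMPLE", "", "",
    ])


GAF_SAMPLE = [
    "!gaf-version: 2.1",
    make_row("GO:0030247", "P"),
    make_row("GO:0008150", "P"),
]

namespaces = read_obo_namespaces(OBO_SAMPLE.splitlines())
mismatches = find_aspect_mismatches(GAF_SAMPLE, namespaces)
print(mismatches)
```

Run against a real GAF and the go-basic.obo of the same release, the returned line numbers can be grouped per file to produce a count like the one summarized in this report.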