Support looking up terms by OBO Id #46
BioPortal doesn't have an OBO version of NCIT at the moment, and the OWL version uses |
Please could we use In general the pattern is {OntologyAcronym}:{TermCode}. The ontology with {OntologyAcronym} as its acronym should always be the top hit, ignoring all other metrics. The more I work with these kinds of terms and search for them, the more critical I think this is for RADx. It is also useful for BioPortal in general and for anyone working with OBO ontologies. |
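To make the request concrete, here is a minimal sketch of how a query of the form {OntologyAcronym}:{TermCode} could be recognised and how the matching ontology could be forced to the top of the results. The character classes in the pattern, the function names, and the result keys `ontology` and `score` are all assumptions made for illustration; they are not BioPortal's API.

```python
import re
from typing import NamedTuple, Optional


class ShortId(NamedTuple):
    acronym: str
    term_code: str


# Assumed shape: acronym (letters, digits, '-', '_'), a colon, then a term code.
# The term code excludes '/', so full IRIs such as "http://..." are not matched.
_SHORT_ID = re.compile(r"^([A-Za-z][A-Za-z0-9_-]*):([A-Za-z0-9._-]+)$")


def parse_short_id(query: str) -> Optional[ShortId]:
    """Return the (acronym, term code) pair, or None if the query is not a short ID."""
    match = _SHORT_ID.match(query.strip())
    if match is None:
        return None
    return ShortId(match.group(1).upper(), match.group(2))


def rank_results(query: str, results: list) -> list:
    """Sort results by score, but put hits from the named ontology first.

    Each result is a dict with hypothetical keys 'ontology' and 'score'.
    """
    short_id = parse_short_id(query)
    if short_id is None:
        return sorted(results, key=lambda r: -r["score"])
    return sorted(
        results,
        # False sorts before True, so the matching ontology wins over any score.
        key=lambda r: (r["ontology"].upper() != short_id.acronym, -r["score"]),
    )
```

With this ordering, a low-scoring hit from NCIT would still come before a high-scoring hit from another ontology when the query is NCIT:C16670.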
More details (in place here)... We are looking for terms that have this IRI pattern: |
@mdorf Could you please break down this issue into a list of low-level tasks, and provide an estimate of the time needed for completion? |
Is there a generic algorithm that would apply to any ontology ID? Say, {OntologyAcronym}:{Last Fragment of ID}? |
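One possible reading of {OntologyAcronym}:{Last Fragment of ID}, sketched under the assumption that "last fragment" means whatever follows the final `#`, `/`, or `:` in the term ID. The function names and the example IRI are hypothetical.

```python
import re


def last_fragment(term_id: str) -> str:
    """Return the final non-empty piece of an ID split on '#', '/', or ':'."""
    pieces = [piece for piece in re.split(r"[#/:]", term_id) if piece]
    return pieces[-1] if pieces else ""


def generic_short_id(acronym: str, term_id: str) -> str:
    """Join the ontology acronym and the last fragment of the term ID."""
    return f"{acronym}:{last_fragment(term_id)}"


# Illustrative IRI only; example.org is a placeholder, not a real ontology namespace.
print(generic_short_id("NCIT", "http://example.org/Thesaurus.owl#C16670"))
```

A rule this simple leaves any prefix embedded in the fragment in place (for example, a fragment of the form PREFIX_123 stays as it is), so it may need refining for ontologies whose local IDs already carry a prefix.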
Probably something like this:
|
I think this could be true. |
Just a note... because multiple ontologies can reuse terms something like
|
@matthewhorridge, below are some example terms from different ontologies that we went over in the meeting. Would it be possible for you to fill in the results for each example, showing what the short ID would look like? It would also be great if you could document the general rules for extracting these short IDs. Ideally, the solution should be generic enough to handle ALL variations of ontologies in our system.
Acronym: NCIT
Acronym: LOINC
Acronym: DRON
Acronym: RXNORM
Acronym: ONS
Acronym: BAO
Acronym: GFO
Acronym: UNITSONT
Acronym: ICF
Acronym: EDAM
Acronym: PMA
Acronym: NDF-RT |
@mdorf I have updated your post with results. The basic algorithm is:
This regex might be a little conservative, but I'd prefer to stick with it for now. I think the name of the field should be OboId (i.e. where we currently have result). If the Note that NCIT is a special case here because we don't have the OBO version of it in BioPortal. |
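As an illustration of regex-based extraction of this kind, the sketch below assumes the conventional OBO PURL shape, `http://purl.obolibrary.org/obo/<PREFIX>_<LOCALID>`, and builds the short ID from the two captured groups. The pattern, the function name `obo_id_from_iri`, and the strictness of the character classes are assumptions; the regex discussed in this thread may differ.

```python
import re
from typing import Optional

# Assumption: conventional OBO PURL form, with an alphanumeric prefix and local ID
# separated by a single underscore. Deliberately conservative.
OBO_PURL = re.compile(
    r"^https?://purl\.obolibrary\.org/obo/([A-Za-z][A-Za-z0-9]*)_([A-Za-z0-9]+)$"
)


def obo_id_from_iri(term_iri: str) -> Optional[str]:
    """Return 'PREFIX:LOCALID' for an OBO-style PURL, or None if it does not match."""
    match = OBO_PURL.match(term_iri)
    if match is None:
        return None
    # The prefix comes from the IRI itself (group 1), not from the ontology acronym.
    return f"{match.group(1)}:{match.group(2)}"
```

An IRI that does not follow this shape returns None here, which is the case a separate fallback rule would have to cover.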
Thank you, @matthewhorridge for documenting these. It seems NCIT is a special case for both rules 1. and 2., correct? It does not match the regex AND the prefix of the OBO ID is formed using the ontology acronym instead of the $1 match. |
Yes, that's right. One possibility is that, if the above rules fail to match the regex, then if the term ID starts with the ontology IRI, (1) remove the part that matches the ontology IRI, (2) next remove the first character of the remaining part ( Given, (1) Remove |
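A minimal sketch of this fallback: steps (1) and (2) follow the description above, while the final step, prefixing the result with the ontology acronym, is an assumption based on the earlier remark that the NCIT prefix comes from the acronym. The function name and the example.org IRIs are hypothetical.

```python
from typing import Optional


def fallback_short_id(acronym: str, ontology_iri: str, term_iri: str) -> Optional[str]:
    """Derive a short ID from a term IRI that extends the ontology IRI.

    Returns None when the term IRI does not start with the ontology IRI,
    or when nothing is left after removing the separator character.
    """
    if not term_iri.startswith(ontology_iri):
        return None
    remainder = term_iri[len(ontology_iri):]  # (1) remove the ontology IRI part
    local_id = remainder[1:]                  # (2) drop the first character, e.g. '#' or '/'
    if not local_id:
        return None
    # Assumed final step: prefix with the ontology acronym.
    return f"{acronym}:{local_id}"


# Placeholder IRIs for illustration only.
print(fallback_short_id(
    "NCIT",
    "http://example.org/Thesaurus.owl",
    "http://example.org/Thesaurus.owl#C16670",
))
```

This rule depends on knowing the ontology IRI for each ontology, so it can only be applied where that metadata is available.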
The only issue with that is that we don't expose the ontology IRI via our API or store it as metadata at the moment. |
These rules require implementing additional functionality in BioPortal that would allow retrieving the ontology IRIs. Timeline adjusted accordingly... |
As a first step, would it be possible to implement the functionality that does not require looking at the ontology IRI (i.e. do this as a two-step implementation)? |
@matthewhorridge, we actually have an existing pull request from AgroPortal that implements the ability to retrieve the ontology IRI. The only issue is this PR is two years old, and some code has diverged from its original implementation, which requires a bit of manual work during the merge. I don't expect it to be a huge undertaking, so my recommendation is to roll it in as part of this development. It's also a very important and useful metadata attribute to be exposed via the BioPortal API. |
Short ID search enhancement has been deployed in BioPortal |
The RADx Data Dictionary Specification allows ontology terms to be provided for Data Elements and Enumerations. These terms can be provided in a short form as OBO Ids, for example, NCIT:C16670.
When I search for terms in BioPortal I get no results, e.g.
This should return the exact match of the term with the id in first position.