Missing lots of triples caused by overly selective query #41
Comments
Hi @fpriyatna, I am skeptical about the "legitimacy" of these CUIs, i.e. the ones found in MRREL and involved in a relation, compared to the ones in MRCONSO. Typically, what information would you have for these CUIs if they are not in MRCONSO: the label (STR), language (LAT), code (CODE), or definition in MRDEF? UMLS2RDF is made to provide such information about a class. Can you provide a few examples of CUIs found in MRREL but not present in MRCONSO, along with their label, language, and other info?
Thank you for your quick reply, @jonquet! For example, I need to find some drug interaction information, which I can get from the MRREL table with this query: However, once we know the CUI of the drug, we don't really need to get the concepts from the same datasource. Thus, in order to get the drug information, we can do something like What do you think?
Indeed, C0000294 is not in MED-RT. Another way to say this: in the original MED-RT, there is no definition of the relation you have found for this drug in the MRREL table, so it is normal that it is not exported.
But UMLS is a meta-thesaurus. It provides relationships between concepts in different source systems. I don't think anyone would be upset if they used this tool, exported MED-RT as RDF, and got triples that refer to concepts in other source systems. In fact, I think many users are looking for that. At the very least, I think the behavior in the referenced PR should be made available through a configuration item.
I understand your point, @justin2004. Maybe the system should indeed have been named differently ;) The idea NCBO had when developing this tool was to consider UMLS as a "unified source" of different biomedical terminologies that could be loaded one by one into the NCBO BioPortal, a project that has taken a "per-ontology approach", not a "meta approach" as the UMLS did. This was debated several times years ago; @mamusen can attest to that. Therefore, it is very important that, although the source information is extracted from the UMLS Metathesaurus, only what was in the original source is included in the output of umls2rdf. I do agree that some knowledge, coming from the tremendous amount of work done by the Metathesaurus, is lost. To avoid confusion, whenever an RDF source file exists outside of the UMLS Metathesaurus (for a source that is also in the UMLS), BioPortal will usually load this file rather than use umls2rdf to extract it from UMLS. This is the case in BioPortal for NCIT or GO.
Also, just to clarify: the aforementioned concepts are in UMLS, but they are in the relation table (MRREL), not in the concept table (MRCONSO).
Background: The MED-RT datasource has records of drug interactions, as shown by this query: SELECT * FROM MRREL WHERE SAB = 'MED-RT' AND RELA = 'has_contraindicated_drug';
Issue: The main table that stores the UMLS concepts, MRCONSO, does not store a MED-RT code for the resources returned by the Background query.
Cause: Some UMLS properties do not appear in the triples because, although those properties exist in the relationship table (MRREL), the associated concepts (CUI column) are not stored in the concept table (MRCONSO) for the corresponding datasource (SAB column).
The above example applies to other datasources as well, not just MED-RT.
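To check how many concepts are affected for a given datasource, something like the following could help. It is only a minimal sketch, not what umls2rdf does internally: it uses an in-memory SQLite database with heavily reduced MRCONSO and MRREL tables (only a few of the real columns) and made-up toy rows, and the helper names `build_toy_db`, `missing_concepts` and `labels_any_source` are hypothetical. The first helper lists CUIs that appear in MRREL for a SAB but have no MRCONSO row with that same SAB; the second looks up labels for a CUI regardless of source.

```python
import sqlite3


def build_toy_db():
    """Create reduced MRCONSO/MRREL tables with made-up rows (not real UMLS data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE MRCONSO (CUI TEXT, SAB TEXT, CODE TEXT, STR TEXT);
        CREATE TABLE MRREL (CUI1 TEXT, RELA TEXT, CUI2 TEXT, SAB TEXT);
        INSERT INTO MRCONSO VALUES
            ('C0000001', 'MED-RT', 'N0000000001', 'toy condition'),
            ('C0000002', 'RXNORM', '0001', 'toy drug');
        INSERT INTO MRREL VALUES
            ('C0000001', 'has_contraindicated_drug', 'C0000002', 'MED-RT');
        """
    )
    return conn


def missing_concepts(conn, sab):
    """CUIs used in MRREL rows of `sab` that have no MRCONSO row with the same SAB."""
    rows = conn.execute(
        """
        SELECT DISTINCT r.cui
        FROM (SELECT CUI1 AS cui, SAB FROM MRREL WHERE SAB = ?
              UNION
              SELECT CUI2 AS cui, SAB FROM MRREL WHERE SAB = ?) AS r
        LEFT JOIN MRCONSO AS c ON c.CUI = r.cui AND c.SAB = r.SAB
        WHERE c.CUI IS NULL
        ORDER BY r.cui
        """,
        (sab, sab),
    ).fetchall()
    return [cui for (cui,) in rows]


def labels_any_source(conn, cui):
    """Look up (SAB, STR) pairs for a CUI without restricting to one datasource."""
    return conn.execute(
        "SELECT SAB, STR FROM MRCONSO WHERE CUI = ? ORDER BY SAB, STR", (cui,)
    ).fetchall()


if __name__ == "__main__":
    conn = build_toy_db()
    print(missing_concepts(conn, "MED-RT"))
    print(labels_any_source(conn, "C0000002"))
```

On the toy data, the relation is recorded under MED-RT, but one of its CUIs only has a label under another source, which is exactly the situation in which a strict same-SAB join drops the triple.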