Normalizing NAACCR: deduplicating #125

Closed
mgurley opened this issue Oct 7, 2019 · 8 comments

Comments

@mgurley
Collaborator

mgurley commented Oct 7, 2019

Proposal: map all duplicate concepts to new OMOP-generated concepts. Ultimately, they should be mapped to standard concepts, but we have not determined those yet. See #9.

This process was started in #51. It should be revisited, incorporating that prior work.

@mgurley
Collaborator Author

mgurley commented Oct 8, 2019

Drive this based on use cases.

@denyskaduk
Collaborator

denyskaduk commented Oct 16, 2019

Hello! Continuing the dedup process, I want to propose deduplicating the variables that contain staging information:

| concept_code | concept_name | standard_code | standard_name |
|---|---|---|---|
| 1022 | AJCC TNM Post Therapy N | 815 | Derived EOD 2018 N |
| 2960 | Derived AJCC-6 N | 815 | Derived EOD 2018 N |
| 3410 | Derived AJCC-7 N | 815 | Derived EOD 2018 N |
| 3482 | Derived PostRx-7 N | 815 | Derived EOD 2018 N |
| 3450 | Derived PreRx-7 N | 815 | Derived EOD 2018 N |
| 815 | Derived EOD 2018 N | 815 | Derived EOD 2018 N |

This raises a question: is it appropriate to dedup concepts that carry treatment-related information, such as 'AJCC TNM Post Therapy N'?
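
For illustration, the proposed dedup could be expressed as OMOP-style "Maps to" rows. This is only a sketch of the proposal, not an agreed mapping: the codes and names come from the table above, while `build_maps_to` and the tuple layout are hypothetical names made up for this example.

```python
# Rows from the proposal above: (concept_code, concept_name).
N_STAGE_ITEMS = [
    ("1022", "AJCC TNM Post Therapy N"),
    ("2960", "Derived AJCC-6 N"),
    ("3410", "Derived AJCC-7 N"),
    ("3482", "Derived PostRx-7 N"),
    ("3450", "Derived PreRx-7 N"),
    ("815", "Derived EOD 2018 N"),
]

# Proposed target for all of the items above (still under discussion).
STANDARD_CODE = "815"


def build_maps_to(items, standard_code):
    """Return (source_code, relationship_id, target_code) rows.

    Hypothetical helper: every item, including the target itself,
    gets one "Maps to" row pointing at the chosen standard code.
    """
    codes = {code for code, _ in items}
    if standard_code not in codes:
        raise ValueError(f"standard code {standard_code!r} is not in the item list")
    return [(code, "Maps to", standard_code) for code, _ in items]


rows = build_maps_to(N_STAGE_ITEMS, STANDARD_CODE)
for source, relationship, target in rows:
    print(source, relationship, target)
```

The target maps to itself here, which mirrors how standard concepts are usually handled in OMOP; whether 815 is the right target is exactly what the question above is about.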

@cgreich
Collaborator

cgreich commented Oct 17, 2019

@denyskaduk:

Why not just "Derived N" or "N"?

As for your question: this is a typical problem. A concept contains not just a fact (derived N) but the fact plus some longitudinal context, similar to ICD-10 codes such as "some condition, first encounter". We usually strip this out. So the fact that there was a therapy is irrelevant to defining the N.
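
A rough sketch of what "stripping out" the longitudinal context could look like at the name level. The token list (`Post Therapy`, `PostRx`, `PreRx`) and the choice to drop a trailing edition number along with it are assumptions made for this example, and `strip_context` is a hypothetical helper, not part of any OMOP tooling.

```python
import re

# Assumed markers of therapy timing in NAACCR item names. An optional
# "-<digits>" suffix (e.g. "PostRx-7") is removed together with the marker.
CONTEXT_PATTERN = re.compile(r"\b(?:Post Therapy|PostRx|PreRx)(?:-\d+)?\s*")


def strip_context(name):
    """Remove therapy-timing tokens and normalize whitespace."""
    stripped = CONTEXT_PATTERN.sub("", name)
    return " ".join(stripped.split())


for name in [
    "AJCC TNM Post Therapy N",
    "Derived PostRx-7 N",
    "Derived PreRx-7 N",
    "Derived EOD 2018 N",
]:
    print(f"{name!r} -> {strip_context(name)!r}")
```

Under these assumptions the PreRx and PostRx items both reduce to "Derived N", which matches the simpler name suggested above. Names without a timing token pass through unchanged.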

@dimshitc
Collaborator

So, we map it to LOINC: "Regional lymph nodes.clinical [Class] Cancer" or "Regional lymph nodes.pathology [Class] Cancer", right?
BTW, @denyskaduk, is there any way to tell whether the given concepts describe clinical or pathological staging?

@denyskaduk
Collaborator

@dimshitc, unfortunately NAACCR doesn't provide that information for these items. @cgreich, this vocabulary also doesn't give us concepts with non-specific names. We could create them, but as I remember from the previous discussion, that isn't a good choice. Or can we create a new Standard concept in this case?

@denyskaduk
Collaborator

Since we have already started talking about mapping to our standard vocabularies, I want to suggest one more mapping case:

| concept_code | concept_name | vocabulary_id | standard_code | standard_name | standard_vocabulary |
|---|---|---|---|---|---|
| testis@2868 | Post-Orchiectomy Human Chorionic Gonadotropin (hCG) Lab Value | NAACCR | 67900009 | Human chorionic gonadotropin measurement | SNOMED |
| testis@2862 | Pre-Orchiectomy Human Chorionic Gonadotropin (hCG) Lab Value | NAACCR | 67900009 | Human chorionic gonadotropin measurement | SNOMED |
| testis@2890 | OBSOLETE - Human Chorionic Gonadotropin (hCG) | NAACCR | 67900009 | Human chorionic gonadotropin measurement | SNOMED |
| 3846 | hCG Post-orchiectomy Lab Value | NAACCR | 67900009 | Human chorionic gonadotropin measurement | SNOMED |
| 3848 | hCG Pre-orchiectomy Lab Value | NAACCR | 67900009 | Human chorionic gonadotropin measurement | SNOMED |
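
To review a many-to-one mapping like this, it can help to invert it and list every NAACCR source per standard target. A minimal sketch using the rows from the table above; `group_by_target` and the tuple layout are hypothetical names for this example.

```python
from collections import defaultdict

# Rows from the table above: (concept_code, standard_vocabulary, standard_code).
HCG_MAPPINGS = [
    ("testis@2868", "SNOMED", "67900009"),
    ("testis@2862", "SNOMED", "67900009"),
    ("testis@2890", "SNOMED", "67900009"),
    ("3846", "SNOMED", "67900009"),
    ("3848", "SNOMED", "67900009"),
]


def group_by_target(mappings):
    """Collect source codes under each (vocabulary, standard_code) target."""
    groups = defaultdict(list)
    for source_code, vocabulary, target_code in mappings:
        groups[(vocabulary, target_code)].append(source_code)
    return dict(groups)


groups = group_by_target(HCG_MAPPINGS)
for (vocabulary, code), sources in groups.items():
    print(f"{vocabulary} {code}: {len(sources)} NAACCR sources -> {sources}")
```

The number of keys in `groups` is the number of distinct standard targets the mapping relies on, which gives a quick count when the same approach is applied to a larger set of items.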

@cgreich
Collaborator

cgreich commented Oct 19, 2019

@denyskaduk: I understand the problem. This is not just a matter of picking a useful Standard; we would have to create it. And then we are in the business of vocab production. Urgh. Well, let's take a look at how many of these we would need. Do you have an idea?

@sratwani
Collaborator

Closing. Several other tasks have been created and assigned the label 'De-duplicating NAACCR'.

5 participants