Releases: ccb-hms/NHANES-metadata

v3.11.1

16 May 15:38

Change Log

Internal changes:

  • Fix issue where the oral health mappings table was missing a newline at EOF, which caused the DB import to fail (closes #35).
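
A missing final newline can be caught before import with a small check. The following is a minimal sketch, not the project's actual fix; the helper name `ensure_trailing_newline` is hypothetical.

```python
from pathlib import Path


def ensure_trailing_newline(path: Path) -> bool:
    """Append a newline if a non-empty file does not already end with one.

    Returns True when the file was modified, False otherwise.
    """
    data = path.read_bytes()
    if not data or data.endswith(b"\n"):
        return False
    path.write_bytes(data + b"\n")
    return True
```

Running such a check over every generated table before building the database would flag this class of problem early.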

v3.11.0

05 Apr 17:28

Change Log

Internal changes:

  • Decouple ontology mapping details from the nhanes_variables.tsv table into a separate table called nhanes_variables_processed.tsv, which is used downstream for ontology mapping (closes #32).
    • Specifically, the columns ProcessedText and Tags have been removed from nhanes_variables.tsv.
  • Add the expert-verified oral health phenotype mappings to the nhanes_variables_mappings.tsv table with the tag "human verified" (closes #34).

v3.10.0

08 Feb 20:12

Change Log

Internal changes:

  • Modify blocklist to include all variables that match patterns identified in phonto's vignettes (VariableClassification.md) (closes #31).
  • Fix bug where unmapped terms were duplicated in the output mappings table (closes #25).
  • Investigate whether replacing the labels of oral health variables with their synonyms and remapping them would yield more accurate mappings; it did not (closes #26).
  • Add the oral health, ontology-mapped variables to blocklist.csv such that they no longer need to be mapped (closes #33).
  • Fix issue where the value for Use Constraints contained newline characters or was empty when it should be "None".
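
The Use Constraints fix amounts to a small normalization step. The sketch below illustrates one way to do it under stated assumptions: the function name is hypothetical, and collapsing embedded newlines into single spaces is an assumption, since the release note does not say how they are handled.

```python
def normalize_use_constraints(value):
    """Collapse all whitespace runs (including newlines) into single spaces
    and map missing or empty values to the string "None"."""
    cleaned = " ".join((value or "").split())
    return cleaned if cleaned else "None"
```

For example, `normalize_use_constraints("None\n")` and `normalize_use_constraints("")` both return `"None"`.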

v3.9.1

13 Jan 00:38

Change Log

User-facing changes:

  • Modify the metadata pipeline to output nhanes_variables.tsv with capitalized TRUE/FALSE in the IsPhenotype and OntologyMapped columns (closes #29).
  • Update the metadata README to describe in more detail the metadata acquisition process and the additional processing done on the metadata.
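
To illustrate the capitalized TRUE/FALSE output in the IsPhenotype and OntologyMapped columns, here is a minimal sketch. It is not the pipeline's own code; the `Variable` column name, the example row, and both function names are hypothetical.

```python
import csv
import io

# Boolean columns named in the release note.
FLAG_COLUMNS = ("IsPhenotype", "OntologyMapped")


def format_flag(value: bool) -> str:
    """Render a boolean as the capitalized literal TRUE or FALSE."""
    return "TRUE" if value else "FALSE"


def render_variables_tsv(rows):
    """Serialize rows to TSV text with capitalized boolean flags."""
    fieldnames = ["Variable", *FLAG_COLUMNS]  # "Variable" is an assumed column name
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        out = dict(row)
        for column in FLAG_COLUMNS:
            out[column] = format_flag(bool(row[column]))
        writer.writerow(out)
    return buffer.getvalue()
```

Note that Python's default string form of booleans is `True`/`False`, so an explicit formatter like this is needed to get the capitalized form.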

v3.9.0

05 Jan 00:04

Change Log

User-facing changes:

  • Add a column to the nhanes_variables.tsv table called IsPhenotype that specifies if a variable is a phenotype or not (closes #15).
  • Add oral health variable mappings and synonyms contributed by the Harvard School of Dental Medicine (closes #1).
    • Add table metadata/synonym_table.tsv of 'variable label synonyms' containing alternate descriptions of variables in oral health tables.
    • Add table ontology-mappings/nhanes_oral_health_mappings.tsv of human-verified mappings for variables in oral health tables.

Internal changes:

  • Add non-phenotype templates in phonto vignettes to our (non-phenotype) blocklist (#15).
  • Add new patterns of non-phenotypes found in low-scored mappings (#16).
  • Add variables with only EnglishInstructions (e.g., "check item" variables or instructions-only variables) to the non-phenotype "blocklist" (closes #23).

v3.8.2

18 Dec 21:27

Change Log

Internal changes:

  • Quote values of variables metadata table (closes #24).
  • Drop the intermediate VariableID column, which was not meant to be output.
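
Both changes can be sketched with Python's standard `csv` module. This is an illustration only, not the project's implementation: the function name, the `Variable` and `Description` column names, and the example values are hypothetical. Quoting every field keeps embedded tabs or quotes from breaking a row, though the release note does not state the motivation.

```python
import csv
import io


def write_quoted_tsv(rows, fieldnames):
    """Write rows as TSV text with every value quoted.

    Keys not listed in fieldnames (such as an intermediate VariableID)
    are silently dropped from the output.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        delimiter="\t",
        quoting=csv.QUOTE_ALL,    # quote every field, not only those that need it
        lineterminator="\n",
        extrasaction="ignore",    # drop columns that are not meant to be output
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Reading the result back with `csv.reader(..., delimiter="\t")` recovers the original values, including any embedded tabs or double quotes.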

v3.8.1

06 Dec 16:05
dfc735a

Change Log

User-facing changes:

  • Add data groups for tables that lack that detail when extracted through nhanesA (closes #19).

v3.8.0

05 Dec 00:32

Change Log

User-facing changes:

  • Update Metadata tables to contain all Variable and Table IDs in uppercase (closes #20).
  • Add EnglishInstructions column to nhanes_variables.tsv table.
    • Some variables do not have a name or description but contain some instructions.
  • Update ontology versions to EFO v3.60.0 and NCI Thesaurus v23.09d.

Internal changes:

  • Use nhanesA::nhanesManifest() to obtain all NHANES tables (closes #9 and #11).
    • Drop use of slow nhanesSearchTableNames().
  • Remove duplicates from variable codebooks table (closes #18).
  • Fix issue where empty codebooks were saved for some variables (closes #17).

v3.7.0

03 Nov 19:44

Change Log

User-facing changes:

  • Add pandemic (P_) tables.
  • Add NHANES tables metadata table to nhanes_metadata.db.
  • Add CC-BY 4.0 License.

Internal changes:

  • Use nhanesA from its latest state on GitHub.

v3.6.0

19 Sep 19:03

Change Log

User-facing changes:

  • Update the variable codebooks table nhanes_variables_codebooks.tsv with additional codebooks that could not previously be obtained using nhanesA and that can now be retrieved with nhanesA v0.8.0.
  • Add disease locations associated with NCIT and FOODON terms.
    • Also includes locations expressed in universal restrictions (e.g., 'pancreas disease' disease_has_location only 'pancreas').
  • Update ontology set to use EFO v3.57.0, which results in updated ontology mappings (or just updated scores).
  • Include Disease Ontology (DOID) terms as potential mapping targets.
  • Modify prototype search module nhanes_metadata_search.py to allow queries with multiple search terms.

Internal changes:

  • Use optimized nhanesA v0.8.0 for faster codebook retrieval.
  • Add dedicated module to build a sqlite DB of the metadata, which outputs a (xz compressed) sqlite database file containing all the generated tables.
    • This facilitates browsing and testing the generated tables.
    • The resulting database is used with nhanes_metadata_search.py.
  • Use SemSQL gzip database distributions, which provide more up to date ontology versions.
  • Add argument to include (or not) disease locations associated with ontology terms.
  • Update the conditions used to flag whether variables are mapped, since the output table now includes all variables, even those that have not been mapped (these get a mapping score of 0).
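
The database build step described above (TSV tables loaded into a sqlite file that is then xz compressed) could look roughly like the following. This is a minimal sketch using only the Python standard library, not the project's module; the function names are hypothetical, and it assumes each TSV has a header row and that every row has as many fields as the header.

```python
import csv
import lzma
import shutil
import sqlite3
from pathlib import Path


def quote_ident(name: str) -> str:
    """Quote a table or column name for safe use as a SQL identifier."""
    return '"' + name.replace('"', '""') + '"'


def build_database(tsv_paths, db_path):
    """Load each TSV file into its own table, named after the file stem."""
    conn = sqlite3.connect(db_path)
    try:
        for tsv in tsv_paths:
            with open(tsv, newline="", encoding="utf-8") as handle:
                reader = csv.reader(handle, delimiter="\t")
                header = next(reader)
                table = quote_ident(Path(tsv).stem)
                columns = ", ".join(quote_ident(c) for c in header)
                placeholders = ", ".join("?" for _ in header)
                conn.execute(f"CREATE TABLE {table} ({columns})")
                conn.executemany(
                    f"INSERT INTO {table} VALUES ({placeholders})", reader
                )
        conn.commit()
    finally:
        conn.close()


def compress_xz(db_path):
    """Write an xz-compressed copy of the database next to the original."""
    xz_path = Path(str(db_path) + ".xz")
    with open(db_path, "rb") as src, lzma.open(xz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return xz_path
```

A single compressed file is convenient to attach to a release, and the uncompressed database can be queried directly when browsing or testing the generated tables.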