Releases: ccb-hms/NHANES-metadata
v3.5.0
Change Log
User-facing changes:
- Add an additional table ontology_synonyms.tsv containing synonyms of all ontology terms.
  - The table format follows other tables and looks like so: Subject | Object | Ontology, where Subject is the ontology term (CURIE), Object is the synonym string, and Ontology is the source ontology of the synonyms.
- Add a column named DiseaseLocation to the ontology_labels.tsv table containing disease locations associated with each ontology term.
- Add columns DocFile, DataFile and DatePublished to the nhanes_tables.tsv table, containing the URLs of the documentation and data files (respectively), and the date when the table was published or updated.
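As a rough illustration of how a table with the Subject, Object and Ontology columns might be consumed, the sketch below groups synonym strings by ontology term. The sample rows, the EX prefix and the helper name load_synonyms are made up for this example; a real run would open ontology_synonyms.tsv instead of the in-memory string.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the real data lives in ontology_synonyms.tsv.
SAMPLE_TSV = (
    "Subject\tObject\tOntology\n"
    "EX:0000001\tfirst synonym\tEX\n"
    "EX:0000001\tsecond synonym\tEX\n"
    "EX:0000002\tother synonym\tEX\n"
)


def load_synonyms(handle):
    """Group synonym strings (Object) by ontology term CURIE (Subject)."""
    synonyms = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        synonyms[row["Subject"]].append(row["Object"])
    return dict(synonyms)


synonyms = load_synonyms(io.StringIO(SAMPLE_TSV))
print(synonyms["EX:0000001"])
```

Reading through csv.DictReader keeps the lookup tied to the column names rather than to column positions.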
Internal changes:
- Refactor to use text2term v3.0.x which includes unmapped terms in the mappings tables and other improvements.
- Better support for handling potentially multiple mappings per trait.
v3.4.0
Change Log:
- Update ontology table generator to use (and print) corrected SemSQL ontology download URLs.
- Fixes a 404 error observed in some systems.
- Add functionality to indicate when variables are (or are not) ontology-mapped.
  - Adds a new column called OntologyMapped to the table nhanes_variables.tsv.
- Updated GitHub Action to post an issue in NHANES repo when a new metadata-tools release is available.
- Add a parameter to specify whether generate_ontology_tables.py outputs a single table for all ontologies, or one table per ontology. For example, to choose whether the labels of all ontologies should be in a single table, or in separate tables on a per-ontology basis.
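The choice between one combined table and one table per ontology can be sketched as below. This is only an illustration of the idea: the function group_rows, the flag name single_table, the output table names and the sample rows are all hypothetical and are not taken from generate_ontology_tables.py.

```python
from collections import defaultdict


def group_rows(rows, single_table=True):
    """Return {table_name: rows}: one combined table, or one per ontology.

    The flag and the table names are assumptions made for this sketch.
    """
    if single_table:
        return {"ontology_labels": list(rows)}
    tables = defaultdict(list)
    for row in rows:
        # Derive a per-ontology table name from the source-ontology column.
        tables[f"{row['Ontology'].lower()}_labels"].append(row)
    return dict(tables)


# Hypothetical rows; the Ontology column names the source ontology.
rows = [
    {"Subject": "EX:0000001", "Object": "example label", "Ontology": "EX"},
    {"Subject": "DEMO:0000002", "Object": "another label", "Ontology": "DEMO"},
]
combined = group_rows(rows, single_table=True)
per_ontology = group_rows(rows, single_table=False)
print(sorted(combined))
print(sorted(per_ontology))
```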
v3.3.1
Change Log:
- Converted .Rmd to .R file to simplify execution using Docker.
v3.3.0
Change Log:
- Include Use Constraints of both NHANES tables and variables.
- Include BeginYear and EndYear of surveys, just as obtained from nhanesA.
- Added a log with any errors and warnings that occur during the metadata download process.
- Added a table containing the NHANES table-variable pairs for which nhanesA could not fetch codebooks.
- Automatically post issue in NHANES repository when a new metadata set is available.
- Use the latest stable version of FoodOn ontology (v0.6.0).
v3.2.0
v3.1.0
Change Log:
- Modified the metadata/nhanes_variables.tsv table to have one row for each target, such that a variable like AUQ020D in table AUX_D, which has two targets, will appear in two rows in the table, each row with a unique target (and the same variable details).
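The one-row-per-target layout can be pictured with the following sketch. The column names (Variable, Table, Targets, Target), the helper expand_targets and the placeholder target values are assumptions for illustration; only the variable and table names come from the note above.

```python
def expand_targets(variable_rows):
    """Emit one output row per (variable, target) pair."""
    expanded = []
    for row in variable_rows:
        # Everything except the target list is repeated on each output row.
        details = {key: value for key, value in row.items() if key != "Targets"}
        for target in row["Targets"]:
            expanded.append({**details, "Target": target})
    return expanded


# Placeholder targets; the actual target values are not given in the notes.
rows = [
    {"Variable": "AUQ020D", "Table": "AUX_D", "Targets": ["target A", "target B"]},
]
expanded = expand_targets(rows)
for row in expanded:
    print(row)
```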
v3.0.0
Change Log:
- Updated to a metadata set collected on June 6, 2023 from CDC (using nhanesA).
- Switched to using exclusively TSV as output format for metadata, ontology tables, etc.
- Switched to using CamelCase naming convention for all table headers, e.g., Table Name is now TableName.
- Properly distinguish between direct and inherited mappings, such that the total mapping count is the sum of direct and inherited.
- Ontology mapping scores are rounded to 3 decimal places.
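A minimal sketch of three of the conventions in this release: CamelCase headers, mapping totals as the sum of direct and inherited counts, and scores rounded to three decimals. The helper names to_camel_case, total_mappings and round_score are hypothetical, as are the sample values; the repository's own code may implement these differently.

```python
def to_camel_case(header):
    """Join whitespace-separated words, capitalizing the first letter of each."""
    return "".join(word[:1].upper() + word[1:] for word in header.split())


def total_mappings(direct, inherited):
    """Total mapping count is the sum of direct and inherited mappings."""
    return direct + inherited


def round_score(score):
    """Round a mapping score to three decimals."""
    return round(score, 3)


print(to_camel_case("Table Name"))  # TableName
print(total_mappings(3, 4))         # 7
print(round_score(0.98765))         # 0.988
```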
v2.2.0
Change Log:
- Added ontology/database cross-references table.
- Drop duplicates across all tables.
- Exclude deprecated ontology terms from the tables.
- Fix issue in composite ID expansion (i.e., Table+Variable) and in the function that computes the top mappings.
- Updated tables to have columns named using CamelCase convention.
- Added module to generate counts of direct and indirect/inferred mappings to each ontology term.
- Added counts of mappings to the ontology_labels table.
- Fix issue with conversion of term CURIEs to IRIs: OBO identifiers had to be prioritized.
- Updated mappings to those obtained with text2term v2.3.1
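One way to read the CURIE-to-IRI fix is that OBO-style prefixes are checked before any other base IRI. The sketch below shows that interpretation only. The prefix sets and the function curie_to_iri are assumptions chosen for this example and are not taken from the repository; the base IRIs follow the public OBO PURL and EFO patterns.

```python
# Assumed prefix tables, for illustration only.
OBO_BASE = "http://purl.obolibrary.org/obo/"
OBO_PREFIXES = {"NCIT", "FOODON", "MONDO", "HP"}
OTHER_BASES = {"EFO": "http://www.ebi.ac.uk/efo/EFO_"}


def curie_to_iri(curie):
    """Expand a CURIE to an IRI, trying OBO-style prefixes first."""
    prefix, local_id = curie.split(":", 1)
    if prefix.upper() in OBO_PREFIXES:
        # OBO identifiers take the form <base><PREFIX>_<local id>.
        return f"{OBO_BASE}{prefix}_{local_id}"
    base = OTHER_BASES.get(prefix.upper())
    if base is None:
        raise ValueError(f"Unknown CURIE prefix: {prefix}")
    return base + local_id


print(curie_to_iri("MONDO:0000001"))
print(curie_to_iri("EFO:0000001"))
```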
v2.1.0
Change Log:
- Modified ontology-table generator (code/generate_semql_ontology_tables.py) to output a single table containing all ontologies' relationships, rather than having one table per ontology. This simplifies querying, since one need not specify which ontology to search. The modified tables contain a column to specify the source ontology.
- Switched to using TSV format for the ontology-mappings tables and the SemanticSQL ontology-tables.
- Modified column names in the TSV files to have no spaces.
- Included mappings to HPO, MONDO and Orphanet ontology terms that are used in EFO.
- Updated the base IRIs restriction in code/resources/ontologies.csv and its handling such that multiple IRIs can be specified.
v2.0.0
Change Log:
- Added mappings of variable and table labels to ontology terms in EFO, NCIt, and FoodOn ontologies. The mappings are stored in the /ontology-mappings folder.
- Added ontology tables to facilitate ontology-based search. These tables are extracted from SemanticSQL SQLite distributions of biomedical ontologies. The tables are stored in the /ontology-tables folder.
- Improved organization of repository: better structure, documentation and READMEs.