[SCHEMA] Add metadata term files (#774)
* [SCHEMA] Add metadata term files (#762)

* Draft a handful of metadata term files.

* Add example with specific possible values.

* Match the validator schemas better.

* Fix formatting.

* Fix formatting again!

* Draft semi-functional rendering functions.

* Use unit abbreviations.

* Add more fields.

* Get macro working.

* Add terms from first table.

* Add AnatomicalLandmarkCoordinateSystem.

* fMRI task information table.

* More terms.

* More terms.

* More terms.

* EchoTime and FlipAngle

* Add tables.

* More tables.

* More terms.

* More terms.

* More terms.

* Fix spacing.

* Clean things up.

* More terms.

* More terms!

* More terms.

* Some iEEG terms.

* More iEEG terms.

* Add ASL labeling terms.

* Next batch.

* More terms.

* More terms.

* More terms.

* Fix mistakes.

* Last terms.

* Reference yamls.

* Fix mistakes.

* Change format of coordinate system files.

* Use degree in associated files.

* Some of the missing terms.

* A few more terms.

* Fix typos.

* More terms.

* More terms.

* More terms.

* More terms.

* More terms.

* More terms.

* Last terms.

* Fix link.

* Fix internal links.

* Fix links for real.

* Derivative terms.

* Fix up code link.

* Use backslashes for continued strings.

* Replace $ref with file contents.

Also support plural datatype strings and all manner of newlines in descriptions.

* Fix genetics.

* Describe the structure of metadata YAML files.

* Make metadatatype function recursive.

* Improve search function.

* Start adding PET fields.

* Add some fields.

* More terms.

* More terms.

* More terms.

* Fix mistakes.

* More terms.

* Replace InstitutionDepartmentName with existing InstitutionalDepartmentName.

* More terms.

* More terms.

* More terms.

* More terms.

* More terms.

* More terms.

* More terms.

* Last terms.

* Add unit format for strings.

Unused for now but could be useful later.

* Add dataset_relative and participant_relative string formats.

* Update READMEs.

* Fix formats in README.

* Support table-specific metadata description extensions.

* Employ description extensions with IntendedFor.

* Remove explicit defaults from YAML files.

* Replace Minimum with minimum.

* Replace inclusiveMaximum with maximum.

* Replace implicit links with explicit ones.

* Rename key_name to name.

* Rename "Unit" to "Units"

* Improve make_metadata_table docstring.

* Start addressing inconsistencies between rendered and hardcoded tables.

* Fix typos in PET metadata

From #786.

* Add metadata fields from qMRI appendix.

* Fix.

* Address duplicate datatypes.

Should address the "string or string or string or string" issue.

* Wrap example strings in code.

* Use enum for n/a instead of pattern.

It's easier to identify as a special case.

* Replace "string" with "n/a" when appropriate.

* Address some inconsistencies.

* Take a crack at SpatialReference.

The type is complicated, but I _think_ I've got it figured out.

* Apply suggestions from code review

Thanks @effigies!

Co-authored-by: Chris Markiewicz <[email protected]>

* Update tools/schemacode/schema.py

* search_structure --> dereference_yaml

* Use faster loading approach.

* Fix deprecation link.

* Apply suggestions from code review

Co-authored-by: Chris Markiewicz <[email protected]>

* Update 01-magnetic-resonance-imaging-data.md

* Add B0FieldIdentifier and B0FieldSource.

* Revert type changes and add TODOs to check them.

* Update tools/schemacode/schema.py

* Replace remaining relative links.

* Apply suggestions from code review

Co-authored-by: Chris Markiewicz <[email protected]>

* Add new coordinate systems from #775.

* Grab hack from #781 (which wasn't merged).

* Create HED.yaml

* boldify table headers

* add device info metadata

* fix table fences

* fix cell padding

* fix cell padding

this is getting old VERY quickly

* Fix up DICOM tags in metadata.

* Leverage "name" field for section-specific metadata definitions.

* composite instances --> measurements

Changes from #813.

* Fix name of HED field.

* Fix string formatting in coordinate system fields.

* Move "preferably same as" to section-specific text.

For #774 (comment)

* Standardize DICOM Tag format.

Still need to move the references out of the generic definitions.

* Move mentions of DICOM Tags out of definitions.

Only for fields that *also* appear in modalities that *don't* use DICOM.

* Apply suggestions from code review

Co-authored-by: Stefan Appelhoff <[email protected]>

* Generalize SoftwareFilters example.

* Distinguish AnatomicalLandmarkCoordinates definitions.

* Rename fmapEchoTime to match new format.

* Apply suggestions from code review

Co-authored-by: Stefan Appelhoff <[email protected]>

* Apply suggestions from code review

Co-authored-by: Stefan Appelhoff <[email protected]>

* Address review.

* Fix example manufacturer names.

* TEMPORARY: fix osipi URL (revert when osipi.org is back)

* Fix regex for identifying macros.

* Partially address review.

* Do not assume minItems is 1 for array terms.

* composite instances --> measurements (again)

* Update src/schema/metadata/MagneticFieldStrength.yaml

Co-authored-by: Stefan Appelhoff <[email protected]>

* Update description to PharmaceuticalDoseTime

#774 (comment)

* Update 09-positron-emission-tomography.md

* fix table pipe alignment

* add pharmaceuticaldosetime fix to schema

includes a bugfix to convert "should" (note casing, this was not intended
as a SHOULD) to a MUST.

* fix str examples

* Apply suggestions from code review

Co-authored-by: Stefan Appelhoff <[email protected]>

* Update src/schema/metadata/RepetitionTimeExcitation.yaml

Co-authored-by: Stefan Appelhoff <[email protected]>

* Add DoseCalibrationFactor.

* Update ScanDate definition and deprecate it.

* Remove hardcoded tables.

* Remove unused links.

* Update tools/schemacode/schema.py

Co-authored-by: Chris Markiewicz <[email protected]>

Co-authored-by: Chris Markiewicz <[email protected]>
Co-authored-by: Remi Gau <[email protected]>
Co-authored-by: Stefan Appelhoff <[email protected]>
Co-authored-by: mnoergaard <[email protected]>
5 people authored Jul 13, 2021
1 parent 99e6f48 commit 5de7cfc
Showing 322 changed files with 3,939 additions and 693 deletions.
1 change: 1 addition & 0 deletions pdf_build_src/process_markdowns.py
@@ -445,6 +445,7 @@ def process_macros(duplicated_src_dir_path):

# Replace code snippets in the text with their outputs
matches = re.findall("({{.*?}})", contents)
matches = re.findall(re.compile("({{.*?}})", re.DOTALL), contents)
for m in matches:
    # Remove macro delimiters to get *just* the function call
    function_string = m.strip("{} ")
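The one-line change above passes `re.DOTALL` so that `.` also matches newlines, which is what lets a `{{ ... }}` macro span several lines, as the multi-line `MACROS___make_metadata_table` calls elsewhere in this commit do. Going by the hunk header (6 lines become 7) and the "1 addition & 0 deletions" count, the second `re.findall` line is the addition and the first stays in place as a now-redundant assignment. A minimal sketch of the difference, where the `contents` string is a made-up stand-in for a Markdown page:

```python
import re

# Hypothetical page content: one macro on a single line, one spread over lines.
contents = (
    'Intro {{ MACROS___make_metadata_table({"Name": "REQUIRED"}) }} text\n'
    "{{ MACROS___make_metadata_table(\n"
    "   {\n"
    '      "LongName": "OPTIONAL",\n'
    "   }\n"
    ") }}\n"
)

pattern = "({{.*?}})"

# Without DOTALL, "." stops at a newline, so only the single-line macro matches.
single_line_only = re.findall(pattern, contents)

# With DOTALL, "." crosses newlines, so the multi-line macro is found too.
with_dotall = re.findall(re.compile(pattern, re.DOTALL), contents)

# Same delimiter stripping as in the diff: drop braces and spaces at both ends.
function_strings = [m.strip("{} ") for m in with_dotall]

print(len(single_line_only), len(with_dotall))
```

The lazy `.*?` still ends each match at the first `}}` it meets, so a macro whose arguments themselves contain `}}` would be cut short under either flag setting.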
27 changes: 13 additions & 14 deletions src/02-common-principles.md
@@ -517,14 +517,19 @@ Note that if a field name included in the data dictionary matches a column name
then that field MUST contain a description of the corresponding column,
using an object containing the following fields:

| **Key name** | **Requirement level** | **Data type** | **Description** |
| ------------ | --------------------- | --------------------------------------- | --------------------------------------------------------------------------------------------------------------- |
| LongName | OPTIONAL | [string][] | Long (unabbreviated) name of the column. |
| Description | RECOMMENDED | [string][] | Description of the column. |
| Levels | RECOMMENDED | [object][] of [strings][] | For categorical variables: An object of possible values (keys) and their descriptions (values). |
| Units | RECOMMENDED | [string][] | Measurement units. SI units in CMIXF formatting are RECOMMENDED (see [Units](./02-common-principles.md#units)). |
| TermURL | RECOMMENDED | [string][] | URL pointing to a formal definition of this type of data in an ontology available on the web. |
| HED | OPTIONAL | [object][] of [strings][] or [string][] | Hierarchical Event Descriptor (HED) information, see: [Appendix III](./99-appendices/03-hed.md) for details. |
{{ MACROS___make_metadata_table(
{
"LongName": "OPTIONAL",
"Description": (
"RECOMMENDED",
"The description of the column.",
),
"Levels": "RECOMMENDED",
"Units": "RECOMMENDED",
"TermURL": "RECOMMENDED",
"HED": "OPTIONAL",
}
) }}

Please note that while both `Units` and `Levels` are RECOMMENDED, typically only one
of these two fields would be specified for describing a single TSV file column.
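In this hunk the hardcoded table gives way to a macro call that maps each term to either a requirement level or a `(requirement level, extra description)` tuple. The sketch below shows one way such a mapping could be turned into a Markdown table. It is not the implementation in `tools/schemacode`: the `render_metadata_table` function, the `TERMS` dictionary (a stand-in for the per-term YAML files), and the generic description used for `Description` are all hypothetical.

```python
# Hypothetical stand-in for the per-term YAML files; the real definitions
# live in the schema and carry more detail than shown here.
TERMS = {
    "LongName": {
        "type": "string",
        "description": "Long (unabbreviated) name of the column.",
    },
    "Description": {
        "type": "string",
        "description": "Free-form natural language description.",  # assumed text
    },
}


def render_metadata_table(field_info, terms=TERMS):
    """Render a Markdown table from {term: level} or {term: (level, extra)}."""
    rows = [
        "| **Key name** | **Requirement level** | **Data type** | **Description** |",
        "| --- | --- | --- | --- |",
    ]
    for name, info in field_info.items():
        # A tuple carries a table-specific extension to the generic description.
        if isinstance(info, tuple):
            level, extra = info
        else:
            level, extra = info, ""
        term = terms[name]
        description = " ".join(p for p in (term["description"], extra) if p)
        rows.append(f"| {name} | {level} | {term['type']} | {description} |")
    return "\n".join(rows)


table = render_metadata_table(
    {
        "LongName": "OPTIONAL",
        "Description": ("RECOMMENDED", "The description of the column."),
    }
)
print(table)
```

Keeping the generic definition in one place and passing only the requirement level (plus an optional extension) per table is what lets the same term appear in several sections without the wording drifting apart.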
@@ -767,10 +772,4 @@ to suppress warnings or provide interpretations of your file names.

[derived-dataset-description]: 03-modality-agnostic-files.md#derived-dataset-and-pipeline-description

[string]: https://www.w3schools.com/js/js_json_syntax.asp

[strings]: https://www.w3schools.com/js/js_json_syntax.asp

[object]: https://www.json.org/json-en.html

[deprecated]: ./02-common-principles.md#definitions
48 changes: 22 additions & 26 deletions src/03-modality-agnostic-files.md
@@ -14,20 +14,22 @@ Templates:
The file `dataset_description.json` is a JSON file describing the dataset.
Every dataset MUST include this file with the following fields:

| **Key name** | **Requirement level** | **Data type** | **Description** |
|--------------------|-----------------------|--------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Name | REQUIRED | [string][] | Name of the dataset. |
| BIDSVersion | REQUIRED | [string][] | The version of the BIDS standard that was used. |
| HEDVersion | RECOMMENDED | [string][] | If HED tags are used: The version of the HED schema used to validate HED tags for study. |
| DatasetType | RECOMMENDED | [string][] | The interpretation of the dataset. MUST be one of `"raw"` or `"derivative"`. For backwards compatibility, the default value is `"raw"`. |
| License | RECOMMENDED | [string][] | The license for the dataset. The use of license name abbreviations is RECOMMENDED for specifying a license (see [Appendix II](./99-appendices/02-licenses.md)). The corresponding full license text MAY be specified in an additional `LICENSE` file. |
| Authors | OPTIONAL | [array][] of [strings][] | List of individuals who contributed to the creation/curation of the dataset. |
| Acknowledgements | OPTIONAL | [string][] | Text acknowledging contributions of individuals or institutions beyond those listed in Authors or Funding. |
| HowToAcknowledge | OPTIONAL | [string][] | Text containing instructions on how researchers using this dataset should acknowledge the original authors. This field can also be used to define a publication that should be cited in publications that use the dataset. |
| Funding | OPTIONAL | [array][] of [strings][] | List of sources of funding (grant numbers). |
| EthicsApprovals | OPTIONAL | [array][] of [strings][] | List of ethics committee approvals of the research protocols and/or protocol identifiers. |
| ReferencesAndLinks | OPTIONAL | [array][] of [strings][] | List of references to publications that contain information on the dataset. A reference may be textual or a [URI][uri]. |
| DatasetDOI | OPTIONAL | [string][] | The Digital Object Identifier of the dataset (not the corresponding paper). DOIs SHOULD be expressed as a valid [URI][uri]; bare DOIs such as `10.0.2.3/dfjj.10` are [DEPRECATED][deprecated]. |
{{ MACROS___make_metadata_table(
{
"Name": "REQUIRED",
"BIDSVersion": "REQUIRED",
"HEDVersion": "RECOMMENDED",
"DatasetType": "RECOMMENDED",
"License": "RECOMMENDED",
"Authors": "OPTIONAL",
"Acknowledgements": "OPTIONAL",
"HowToAcknowledge": "OPTIONAL",
"Funding": "OPTIONAL",
"EthicsApprovals": "OPTIONAL",
"ReferencesAndLinks": "OPTIONAL",
"DatasetDOI": "OPTIONAL",
}
) }}

Example:

@@ -69,10 +71,12 @@ In addition to the keys for raw BIDS datasets,
derived BIDS datasets include the following REQUIRED and RECOMMENDED
`dataset_description.json` keys:

| **Key name** | **Requirement level** | **Data type** | **Description** |
|----------------|-----------------------|--------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| GeneratedBy | REQUIRED | [array][] of [objects][] | Used to specify provenance of the derived dataset. See table below for contents of each object. |
| SourceDatasets | RECOMMENDED | [array][] of [objects][] | Used to specify the locations and relevant attributes of all source datasets. Valid keys in each object include `URL`, `DOI` (see [URI][uri]), and `Version` with [string][] values. |
{{ MACROS___make_metadata_table(
{
"GeneratedBy": "REQUIRED",
"SourceDatasets": "RECOMMENDED",
}
) }}
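The requirement levels passed to the two `dataset_description.json` macro calls above can also be read as a simple check: derived datasets add `GeneratedBy` and `SourceDatasets` on top of the raw-dataset keys, and `DatasetType` defaults to `"raw"` when absent. The sketch below illustrates that reading only. It is not the BIDS validator, the `missing_keys` helper is hypothetical, and the OPTIONAL keys and value constraints are left out.

```python
# Requirement levels copied from the macro calls in this diff (OPTIONAL keys omitted).
RAW_LEVELS = {
    "Name": "REQUIRED",
    "BIDSVersion": "REQUIRED",
    "HEDVersion": "RECOMMENDED",  # only relevant if HED tags are used
    "DatasetType": "RECOMMENDED",
    "License": "RECOMMENDED",
}
DERIVATIVE_LEVELS = {
    "GeneratedBy": "REQUIRED",
    "SourceDatasets": "RECOMMENDED",
}


def missing_keys(description):
    """Group absent keys of a dataset_description dict by requirement level."""
    levels = dict(RAW_LEVELS)
    # For backwards compatibility, a missing DatasetType is treated as "raw".
    if description.get("DatasetType", "raw") == "derivative":
        levels.update(DERIVATIVE_LEVELS)
    missing = {"REQUIRED": [], "RECOMMENDED": []}
    for key, level in levels.items():
        if key not in description:
            missing[level].append(key)
    return missing


# Made-up example description for a derived dataset.
report = missing_keys(
    {"Name": "Example", "BIDSVersion": "1.6.0", "DatasetType": "derivative"}
)
print(report)
```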

Each object in the `GeneratedBy` list includes the following REQUIRED, RECOMMENDED
and OPTIONAL keys:
@@ -394,16 +398,8 @@ code organization of these scripts at the moment.

<!-- Link Definitions -->

[objects]: https://www.json.org/json-en.html

[object]: https://www.json.org/json-en.html

[string]: https://www.w3schools.com/js/js_json_syntax.asp

[strings]: https://www.w3schools.com/js/js_json_syntax.asp

[array]: https://www.w3schools.com/js/js_json_arrays.asp

[uri]: ./02-common-principles.md#uniform-resource-indicator

[deprecated]: ./02-common-principles.md#definitions