Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inheritance: "false positive" conflict among metadata files if some entities are missing #1182

Open
yarikoptic opened this issue Aug 3, 2022 · 3 comments

Comments

@yarikoptic
Copy link
Collaborator

Originally brought up in #102 (comment) and discussed briefly in the https://github.com/bids-standard/bids-specification/pull/946/files#r771841807 with @Lestropie.

The Inheritance Principle (IP) has

2. For a given data file, any metadata file is applicable to that data file if:
...
c.The metadata filename does not include any entity absent from the data filename.
...
4. There MUST NOT be multiple metadata files applicable to a data file at one level of the directory hierarchy.

from which for IMHO a legit (didn't check with bids-validator though) case of

├─ sub-01/
│  └─ func/
│     ├─ sub-01_task-rest_acq-tricky_bold.nii.gz 
│     ├─ sub-01_task-rest_acq-tricky_bold.json
│     ├─ sub-01_task-rest_bold.nii.gz 
│     ├─ sub-01_task-rest_bold.json

we would get a violation of the rule 4 if we follow 2.c
since both sub-01_task-rest_bold.json and sub-01_task-rest_acq-tricky_bold.json would be applicable to sub-01_task-rest_acq-tricky_bold.nii.gz .

I see following possible Fixes:

  • F1: do declare such cases/datasets "non-legit" due to inheritance principle (bids-validator should catch that then)
  • adjust inheritance principle to make such conflicts avoidable. Possible way I see:
    • F2: Add clarification to 4 (e.g. 4.a) that "- metadata files at the level of the data file and matching data file exactly in all entities and suffix, but only different in extension, are not to be considered as metadata files for other data files" . I.e in the example above sub-01_task-rest_bold.json is not considered for sub-01_task-rest_acq-tricky_bold.nii.gz since there is sub-01_task-rest_bold.nii.gz.
      • cons: implementation might be tricky since we do not per se have "data suffix" since there could be multitude. So the simplest implementation would be "test if there is ANY other file without such (.json) extension". Then there would be another separate validation check if that other file is part of BIDS according to the schema.
  • Contribute more!
@Lestropie
Copy link
Collaborator

With #1003, sub-01_task-rest_acq-tricky_bold.nii.gz would inherit sub-01_task-rest_bold.json first, and sub-01_task-rest_acq-tricky_bold.json second. So the second could override contents of the first and/or add new fields.

Generally in a situation like this it would be recommended that both images should have the acq entity in order to disambiguate fully. But with #1003 I'm shooting for an objective implementation rather than guidelines. Combined, the solution would look something like:

├─ sub-01/
│  └─ func/
│     ├─ sub-01_task-rest_acq-nontricky_bold.nii.gz 
│     ├─ sub-01_task-rest_acq-nontricky_bold.json
│     ├─ sub-01_task-rest_acq-tricky_bold.nii.gz 
│     ├─ sub-01_task-rest_acq-tricky_bold.json
│     ├─ sub-01_task-rest_bold.json

So sub-01_task-rest_bold.json contains information that's common to resting-state acquisitions in this subject, and *_acq-*_bold.json contain acquisition-specific information. As you say, this is explicitly illegal according to the current IP; #1003 is designed to make it legal.

@yarikoptic
Copy link
Collaborator Author

Thank you @Lestropie , I agree that #1003 would address this issue and make it legit. Thinking further, I am not even sure if validator should even warn about such cases since I might want to use them on purpose in cases where e.g. _acq-tricky needs to change/add only few metadata fields on top of the .json without it. But we might want to add some generic logic to validation of inheritance principle like "WARNING: later in inheritance principle file overloads with different values significant (>5 && >20%) of metadata fields" to prevent cases where inheritance principle places largely unrelated metadata files into its "sequence".

@Lestropie
Copy link
Collaborator

Personally I certainly wouldn't want to be issuing a warning due to inheritance from multiple files in a single directory. I'm proposing it specifically because I need to use it, so a warning would be counter-productive.

A validator warning about overwrites in the process of inheritance would be reasonable I think. For me, I would be avoiding overwrites, I merely want to be able to store sidecar information at the appropriate point in the hierarchy. Overwriting is ill-advised from my own perspective, but is a possibility and therefore needs to be described robustly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants