Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Physio object generation from BIDS #4

Merged
merged 16 commits into from
Aug 21, 2024

Conversation

maestroque
Copy link
Contributor

@maestroque maestroque commented Jul 11, 2024

Closes #3

Proposed Changes

  • Add a function to load BIDS structures and automatically generate Physio objects
  • Add an MRI data object to store trigger parameters, universal to all generated Physio objects

Change Type

  • bugfix (+0.0.1)
  • minor (+0.1.0)
  • major (+1.0.0)
  • refactoring (no version update)
  • test (no version update)
  • infrastructure (no version update)
  • documentation (no version update)
  • other

Checklist before review

  • I added everything I wanted to add to this PR.
  • [Code or tests only] I wrote/updated the necessary docstrings.
  • [Code or tests only] I ran and passed tests locally.
  • [Documentation only] I built the docs locally.
  • My contribution is harmonious with the rest of the code: I'm not introducing repetitions.
  • My code respects the adopted style, especially linting conventions.
  • The title of this PR is explanatory on its own, enough to be understood as part of a changelog.
  • I added or indicated the right labels.
  • I added information regarding the timeline of completion for this PR.
  • Please, comment on my PR while it's a draft and give me feedback on the development!

@m-miedema
Copy link
Member

See my comment on the corresponding issue, I think we'll need to add more flexibility to column naming for this to be useful.

@maestroque
Copy link
Contributor Author

@m-miedema it's fixed for the time being. I have implemented string literals searching, to give the physio_type property that is used in some functions.

@m-miedema
Copy link
Member

This looks reasonable to me! Let me know when it's ready for review :)

@me-pic
Copy link
Contributor

me-pic commented Jul 22, 2024

Hi @maestroque ! For the addition of unit tests for those new functions, I've talked with a colleague and she recommended to just generate the structure on the fly when we run pytest (and generate random number for the timeseries if we even need to), if we are just concerned about the data structure ! If we are more interested about the content of those files and need actual real data you can take a look at that BIDS app

@maestroque
Copy link
Contributor Author

Thanks @me-pic ! Sounds reasonable to me

Copy link
Contributor

@me-pic me-pic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maestroque Nice work on that PR !

physutils/io.py Outdated

for col in columns:
col_physio_type = None
if any([x in col for x in ["cardiac", "ppg", "ecg", "card"]]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maestroque Would it be possible to change for col.lower() ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

from loguru import logger

from physutils import physio

EXPECTED = ["data", "fs", "history", "metadata"]


def load_from_bids(
Copy link
Contributor

@me-pic me-pic Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to add the parameter recording=None ? The reason is that according to the current version of the BIDS spec physio data with different sampling rate need to be saved under different file names using the recording entity to differentiate them apart.

Copy link
Contributor

@me-pic me-pic Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And recording should be added when you call layout.get

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, done!

physutils/io.py Outdated
if not op.exists(bids_path):
raise FileNotFoundError(f"Provided path {bids_path} does not exist")

layout = BIDSLayout(bids_path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if there is a reason not to do that but I think we should specify layout = BIDSLayout(bids_path, validate=False). I tested the current function with a BIDS dataset containing multiple physio files saved under names using different label values with the recording entity (ex. recording-pulse and recording-breathing ), and if validate=False is not specified, it just gives me an empty list after layout.get()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change + the addition of the recording parameter would allow people with BIDS dataset containing physio data with different sampling rate to still be able to use those functionalities

@maestroque maestroque changed the title [WIP] Add Physio object generation from BIDS Add Physio object generation from BIDS Jul 23, 2024
@maestroque
Copy link
Contributor Author

@me-pic the unit test is ready. I'll address your other comments shortly and it should be good

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we should also change the matplotlib >=3.9 line to non-constraining?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good catch !

Copy link
Contributor

@me-pic me-pic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @maestroque for integrating previous comments ! I've added some more 😅 Should be pretty straight forward to integrate, but let me know if you need help !

physutils/io.py Outdated

for col in columns:
col_physio_type = None
if any([x in col.lower() for x in ["cardiac", "ppg", "ecg", "card"]]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@m-miedema Do you think it would be reasonable to add "pulse" as a valid column name for "cardiac" ? And also "o2" and "co2" for "respiratory" ?

Copy link
Member

@m-miedema m-miedema Aug 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although O2 and CO2 reflect respiration, I don't think they should be labelled as respiratory! This is mainly because the respiratory response function should not be used for these types of data (see e.g. the fMRI literature on end-tidal response functions), and users may become confused with putting this data in the same category as a belt measurement.

For pulse, I've never personally seen it used as a naming convention, but why not!

run_id = "01"

bids_dir = pjoin(
data_dir, "bids-dir", f"sub-{subject_id}", f"ses-{session_id}", "func"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should make sure that the test covers the possibility to have the recording entity mentioned in the filename. Ex. sub-006_ses-03_task-test_run-01_recording-cardiac_physio.tsv.gz would be a valid filename according to the BIDS standard. And since we also included a recording parameter in the load_from_bids function

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @maestroque I think my comment was not clear. The tests should cover the possibility to have filenames with the recording entity as well as filenames that does not include that entity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good catch !

label = ref_physio.label if copy_label else None
physio_type = ref_physio.physio_type if copy_physio_type else None
computed_metrics = list(ref_physio.computed_metrics) if copy_computed_metrics else []
computed_metrics = (
dict(ref_physio.computed_metrics) if copy_computed_metrics else []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be dict(ref_physio.computed_metrics) if copy_computed_metrics else {} instead ?

@@ -542,3 +544,55 @@ def neurokit2phys(
metadata = dict(peaks=peaks)

return cls(data, fs=fs, metadata=metadata, **kwargs)


class MRIConfig:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I missed it, but is there any test to cover that class ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it is not used yet, because it would need detection of the trigger channel in order to initialize it. I can open an issue defining the next steps for it, and I'll implement it if there is time after the workflow

# The data saved are the ones after t_0 = 0s
assert phys_array[col].data.size == 70000
assert phys_array[col].fs == 10000.0
assert phys_array[col].history[0][0] == "physutils.io.load_from_bids"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to delete the bids dataset at the end of the function or just to add the bids-dir directory in the .gitignore ?

physutils/io.py Outdated

config_file = bids_file[0].get_metadata()
fs = config_file["SamplingFrequency"]
t_start = config_file["StartTime"] # noqa
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to define the start time here? It seems like it could be integrated into the time column checks, although its not necessary to do that in this PR. Perhaps open an issue to do a QA check and/or integrate the info into the specification of idx_0 in the event of a missing time column.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I integrated it into the idx_0 definition

Copy link
Member

@m-miedema m-miedema left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @maestroque ! This looks very nice and well-organized so far, I left a few comments to add to what @me-pic has included in her review. The main thing I want to understand better is the relationship between the MRIcfg and Physio objects, and whether it's relevant to extract any information from the trigger column itself (rather than the sidecar file). I don't think there's much in the MRI class that can be generated from the trigger column, other than the TR and number of scans, so while you can confirm this is correct from the sidecar, I don't think it's necessary to do so. I'm also wondering if you intend to e.g point the Physio object to its corresponding MRIcfg object in some way while you're setting them up - would this be helpful in metric calls later on?

@maestroque
Copy link
Contributor Author

@me-pic I integrated your comments, so it should be good from that side!

@m-miedema I also addressed your comments on StartTime and n_slices. Regarding the use of the MRIConfig class and Physio, what you say is true, that was the intended use of the new class as we discussed in the past. It is pending the extraction of data from the trigger channel and the integration into the Physio object. I will open an issue about this, but I will tackle it if time allows after the workflow

@maestroque
Copy link
Contributor Author

Other than that, it should be gtg

Copy link
Contributor

@me-pic me-pic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @maestroque for integrating the changes ! I've added to more comments, but once those are addressed I think we could merge 🚀

run_id = "01"

bids_dir = pjoin(
data_dir, "bids-dir", f"sub-{subject_id}", f"ses-{session_id}", "func"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @maestroque I think my comment was not clear. The tests should cover the possibility to have filenames with the recording entity as well as filenames that does not include that entity.

df.to_csv(tsv_file, sep="\t", index=False, header=False, float_format="%.8e")

# Compress tsv file into tsv.gz
tsv_gz_file = pjoin(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand why the file is first saved as a tsp, then loaded and saved again as a compressed file. I think you can just directly save df as a tsv.gz since that is the required format according to the BIDS spec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right, it was left as such from debugging

@maestroque
Copy link
Contributor Author

@me-pic I think we are now ready, let me know if you have anything else in mind

Copy link
Contributor

@me-pic me-pic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🎉

@m-miedema
Copy link
Member

Great work, thanks @maestroque !

@m-miedema m-miedema merged commit e0562d4 into physiopy:master Aug 21, 2024
1 check passed
@smoia
Copy link
Member

smoia commented Aug 25, 2024

🚀 PR was released in 0.2.1 🚀

@smoia smoia added the released label Aug 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Physio object generation from BIDS structures
4 participants