-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Fix epoch splits naming #11876
BUG: Fix epoch splits naming #11876
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let me know when it's good to go from your end. Don't forget to update latest.inc file in the doc. thx
While working on the PR, found another problem: if the ending is not BIDS, i.e. Also, should |
mne/tests/test_epochs.py
Outdated
with pytest.raises(FileExistsError, match="Destination file"): | ||
epochs.save(split_fname, split_naming=split_naming, verbose=True) | ||
os.remove(split_fname) | ||
# we don't test for reserved files as it's not implemented here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know what reserved files
means here. I noticed, Raw.save()
mentions something reserving the main file when producing splits. Is it something important I should be aware of? For now I just removed this comment.
mne/tests/test_epochs.py
Outdated
events = mne.make_fixed_length_events(raw, 1) | ||
epochs = mne.Epochs(raw, events) | ||
if split_size == "2MB" and (metadata or concat): | ||
n_files += 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what this was doing. Removed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a special case inside the test. Basically it was necessary because in the case where your split_size
is small and you have metadata (or concat, for whatever reason), you will end up with another file. It doesn't seem like your PR should change the file sizes produced but rather just the filenames, so it seems like a bug (either on this PR or in main
currently) that you were able to remove this and have things still pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... looks like it is a bug on main
because we never use split_size="2MB"
, so removing this seems okay
mne/tests/test_epochs.py
Outdated
@@ -1511,50 +1511,102 @@ def test_split_saving(tmp_path, split_size, n_epochs, n_files, size, metadata, c | |||
} | |||
) | |||
epochs.metadata = metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see, how the metadata are used in the tests. Do metadata affect epochs.drop_bad()
? I don't think they do. Same thing with concat
. Maybe they should be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They were probably added to make sure that we actually split the metadata properly (or copy it? not sure which we do currently) when splitting the epoch data across multiple files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've looked into commit history and it seems they are added to fix #7897. My current understanding is that there was a bug with saving splits when the size just marginally exceeded the splitting threshold, so metadata and concat manipulations help catch this corner case. Does this make sense?
Also there's #5102 related to problem with events when loading back the splits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there was a bug with saving splits when the size just marginally exceeded the splitting threshold
ah yes I remember that one. Nice git archeology!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thx:)
@sappelhoff WDYT the BIDS-sensible thing is to do in this case? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmalt thanks for working on this!
As it stands, this PR is a little bit tough to review because you're mixing adding functionality with refactoring non-trivial tests that cover weird corner cases. It is hard to see what has been changed to make tests cleaner/better while keeping all existing functionalityl/checks (this is difficult to see) vs what was changed to accommodate the new naming scheme. It makes me worry that we might silently have break/omit some test that we put in place previously to cover some corner case. So even though the tests look cleaner, in some sense they are a bit less safe now.
I would suggest reverting the test changes, then start by adding just a new parametrization like:
@pytest.mark.parametrize("split_naming", ("neuromag", "bids"))
on a test or a few tests. Then you'll need to make a few small naming updates in the tests. This would be digestible from a review standpoint.
Alternatively you could open a fresh PR that only does the test refactoring without adding anything new. This could include stuff like the removal of the if split_size == "2MB"
that probably doesn't need to be there. Then we get that green and merged, then rebase this PR on that... and then this PR once again hopefully just adds a parametrize
+ naming updates as above.
Does that make sense?
Yes, you're right, the PR spiralled out of control real quick:) Let me try from scratch. I feel like refactoring tests first would be a better option. The library code changes are easy but testing them without bloating an existing test even more -- not as much. |
IMHO mne-bids is the BIDS-aware software and the feature that is provided via MNE-Python here is more a convenience for mne-bids. So I don't have a strong opinion as we correctly use this in mne-bids. I wouldn't get into too much of a logic branching here and simply assume that users will either:
|
997074b
to
651517c
Compare
I was thinking adding a simple check |
As a matter of fact, this check is already implemented. It was just turned off. The filenames for splits are constructed via |
mne/epochs.py
Outdated
fname = f"{base}-{part_idx:d}{ext}" | ||
elif split_naming == "bids" and n_parts > 1: | ||
fname = _construct_bids_filename(base, ext, part_idx + 1) | ||
_check_fname(fname, overwrite=overwrite) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
call _check_fname()
for both 'neuromag' and 'bids' to fix overwriting bug when
epochs.save('test-epo.fif', overwrite=False)
would overwrite existing 'test-epo-1.fif'
mne/epochs.py
Outdated
if part_idx > 0 and split_naming == "neuromag": | ||
fname = f"{base}-{part_idx:d}{ext}" | ||
elif split_naming == "bids" and n_parts > 1: | ||
fname = _construct_bids_filename(base, ext, part_idx + 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove validate=False
to prohibit using '-epo.fif' with split_naming='bids'
This is ready from my side. |
0097cfb
to
91b4a04
Compare
Use already present validation mechanism. For now validation only works when splits will be created, i.e. test-epo.fif with split_naming="bids" is still allowed. Record this behaviour in tests.
[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much easier
a5fb3e2
to
5a9ddeb
Compare
- add xfail reasons - add match regex to pytest.raises - remove elif, add _check_option
for more information, see https://pre-commit.ci
@larsoner our "check rendered docs here" CI isn't working? takes me to the same link as the CircleCI Details link, and there is no artifact to view |
It also doesn't run |
(By "sign up" I just meant to "log in with GitHub" on the UI when you click "Details", so if you did more than that it would be good to know!) |
Co-authored-by: Daniel McCloy <[email protected]>
OK well I guess no harm in re-running them with the changelog commit then. I'll commit that and then mark for merge when green |
ha, @larsoner beats me to it again :) |
* upstream/main: BUG: Fix epoch splits naming (mne-tools#11876)
Honestly, no idea. Sorry :( All I did is logging in with GitHub. It's my first experience with Circle CI, so I might have clicked something by accident. |
Sounds like what you did should have worked, weird! |
And I still have this problem, right? For my next PR I mean. |
Yes probably... maybe there is some way you can disable CircleCI building on your fork, you can try messing with settings in https://app.circleci.com/pipelines/github/dmalt/mne-python |
Ok, I think the problem was that I clicked "Set Up Project" next to MNE-Python right after logging in. To be fair, the button was right in the middle of a screen, so it was calling for action:) I deleted the project and created a test PR #11897 and now it seems to be working fine. |
* upstream/main: [pre-commit.ci] pre-commit autoupdate (mne-tools#11911) [BUG, MRG] Remove check on `mne.viz.Brain.add_volume_labels` (mne-tools#11889) Small splits fix (mne-tools#11905) adds niseq package to "Related software" (mne-tools#11909) Minor fixes for ERDS maps example (mne-tools#11904) FIX: Fix pyvista rendering (mne-tools#11896) BUG: Fix epoch splits naming (mne-tools#11876) ENH: Use section-title for HTML anchors in Report (mne-tools#11890) CI: Deploy [circle deploy] MAINT: Clean up whats_new and doc versions (mne-tools#11888) Refactor test_epochs.py::test_split_saving (2 out of 2) (mne-tools#11884) Cross-figure event passing system (mne-tools#11685) MAINT: Post-release deprecations, updates [circle deploy] (mne-tools#11887) MAINT: Release 1.5.0 (mne-tools#11886) [pre-commit.ci] pre-commit autoupdate (mne-tools#11883) Refactor test_epochs.py::test_split_saving (1 out of 2) (mne-tools#11880) FIX: Missing Saccade information in Eyelink File (mne-tools#11877) Improve drawing of annotations with matplotlib (mne-tools#11855) MAINT: Work around NumPy deprecation (mne-tools#11878)
Co-authored-by: Eric Larson <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Daniel McCloy <[email protected]>
Fixes #11870
Fixes overwriting bug, when
epoch.save('test-epo.fif', overwrite=False, split_naming="neuromag")
would overwrite"test-epo-1.fif"
.Disallows filenames like
"test-epo.fif"
withsplit_naming="bids"
to avoid surprises with split names. File names like"a_b-epo.fif"
, i.e. having several bids clauses divided by "_" are still allowed. Also we don't do this check if no splits will be created.