-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug]: Cannot use parallel for checks #436
Comments
Quite strange - we have tests for parallel usage in the API (https://github.com/NeurodataWithoutBorders/nwbinspector/blob/dev/tests/test_inspector.py#L212); and technically those leverage the same configuration wrapper via Tried adding a new API test for this call in #437, and I suppose I should add a CLI invocation and report saving just for good measure Are you able by change to reproduce this in any way minimally? Would you be willing to share the NWB files in question or if on DANDI, point me towards the dandiset? Also can you share the results of the non-parallel report? (Wondering if it might be some specific check that in particular causes this, and that check is not in one of the base files we use for testing) |
Interesting; the CLI invocation reproduces it! https://github.com/NeurodataWithoutBorders/nwbinspector/actions/runs/7910884792/job/21594182388?pr=437 But the API usage doesn't... hmm Will dig into this |
Though even more curiously seems to only affect linux and mac 😅 |
I tried to reproduce with simple test nwb files. Apparently the import numpy as np
from datetime import datetime
from dateutil.tz import tzlocal
from pynwb import NWBFile, TimeSeries, NWBHDF5IO
num_files = 10
for i in range(num_files):
start_time = datetime(2024, 1, 2, 3, 4, 5, tzinfo=tzlocal())
name = f"test_{i:03d}"
nwbfile = NWBFile(
session_description=name,
identifier=name,
session_start_time=start_time,
)
ts = TimeSeries(
name=f"ts{i}",
data=np.random.randn(10, 200),
unit="-",
timestamps=np.arange(10),
)
nwbfile.add_acquisition(ts)
with NWBHDF5IO(name + '.nwb', "w") as io:
io.write(nwbfile) Running with this is fine: nwbinspector . --n-jobs 4 But this would throw the pretty much the same error: nwbinspector . --config dandi --n-jobs 4 Browsing the code, my guess is it has to do something with how the functions in |
Apparently, not using If I change this line: nwbinspector/src/nwbinspector/nwbinspector.py Line 160 in 3b91993
to just mapped_check = check things would work fine. |
But is the full testing suite OK with that change? |
BTW I'm having trouble reproducing this locally on my Windows and Linux systems, hence a bit of delay on getting to this |
I haven't run the tests yet. I'll try to run them later today or tomorrow on my local machine. |
I tried running the tests on your PR branch locally on my machine (Linux) without Testing on that file would have the
But then re-doing with only the failed tests seem to pass.
Do you know why? |
Yeah, that's the reason the copy function exists in the first place. It's kind of messy to go into detail Will try to think of broader workaround. For now I recommend running either without parallel and with config (slowest but most accurate), or without config and with parallel (recommended for very large number of files) Note that the DANDI config just elevates certain subject related items to a high enough level to prevent validation - so you could theoretically just cross-reference the output against this file |
Gotcha. Thanks for the suggestions! |
What happened?
I've been having issues with running the inspector in parallel.
Doing this would be fine
But using
--n-jobs
would tend to end in errorThe traceback is below:
Operating System
Linux (not sure the exact OS, running on university cluster).
Python Version
Python 3.11.4
Were you streaming with ROS3?
No
Package Versions
Code of Conduct
The text was updated successfully, but these errors were encountered: