[Bug] Successful DANDI upload with multiple jobs spawns hanging processes #540

Closed
CodyCBakerPhD opened this issue Dec 13, 2023 · 2 comments

Comments

@CodyCBakerPhD
Collaborator

Another odd but reproducible issue on certain systems: after a successful upload to the archive using the NWB GUIDE app, several rogue Python processes are left running in the background and take about 5 minutes to time out.

During those 5 minutes, any attempt to close and relaunch the application will stall. Once the 5 minutes are up, the app can be used again without issue.

The underlying cause remains a mystery (swapping away from joblib, as per previous discussions with Yarik, might magically fix the problem?), but in the meantime there are two ways to work around it:

(a) Just wait 5 minutes. The problem eventually fixes itself, so this is acceptable as long as this issue is easy to find if a user encounters the error.

(b) Force single-job usage (not the current default as of 12/13/2023). This also fixed the issue, since the bad processes are never spawned to begin with. @garrettmflynn suggested we patch this in as the new default just to be safe, which should also help avoid the segfault errors that users have sporadically observed.
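
For illustration, option (b) could look roughly like the sketch below. This is not the NWB GUIDE or DANDI code: `resolve_number_of_jobs`, `upload_all`, and `upload_one` are hypothetical names, and the real upload call is stood in for by a plain callable. The multi-job branch uses the standard library's `ThreadPoolExecutor` rather than joblib, purely as an assumption for the example. The point is that a default of one job runs everything in the calling process, so no workers are created that could linger afterwards.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")


def resolve_number_of_jobs(requested: Optional[int] = None) -> int:
    """Return the job count to use, defaulting to a single job (hypothetical helper)."""
    if requested is None:
        return 1
    if requested < 1:
        raise ValueError(f"number of jobs must be >= 1, got {requested}")
    return requested


def upload_all(
    paths: Iterable[str],
    upload_one: Callable[[str], T],
    number_of_jobs: Optional[int] = None,
) -> List[T]:
    """Apply `upload_one` to every path, serially unless more jobs are requested."""
    jobs = resolve_number_of_jobs(number_of_jobs)
    path_list = list(paths)
    if jobs == 1:
        # Serial path: no pool is created, so nothing is left running afterwards.
        return [upload_one(path) for path in path_list]
    # Multi-job path (assumed thread-based here): leaving the `with` block
    # waits for the workers to finish and then releases them.
    with ThreadPoolExecutor(max_workers=jobs) as executor:
        return list(executor.map(upload_one, path_list))
```

With this shape, callers who want parallel uploads have to opt in explicitly by passing `number_of_jobs`.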

@CodyCBakerPhD
Collaborator Author

The action item here is to swap the DANDI code so that it no longer uses joblib.

@CodyCBakerPhD
Collaborator Author

Will keep this in mind, but the current solution isn't too bad. Depending on the system and internet connection, a single job may move as quickly as multiple jobs if bandwidth is the bottleneck.
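
A back-of-the-envelope model makes the bandwidth point concrete. The sketch below assumes every job is capped at the same per-connection rate and that all jobs share one uplink; the function name and all numbers are illustrative assumptions, not measurements from any real upload.

```python
def estimated_upload_seconds(
    total_bytes: int,
    uplink_bytes_per_second: int,
    per_job_bytes_per_second: int,
    number_of_jobs: int = 1,
) -> float:
    """Estimate upload time when jobs share one uplink (simplified model)."""
    if number_of_jobs < 1:
        raise ValueError("number_of_jobs must be >= 1")
    # Aggregate throughput is what the jobs can push together,
    # but never more than the shared uplink allows.
    effective_rate = min(
        uplink_bytes_per_second,
        per_job_bytes_per_second * number_of_jobs,
    )
    return total_bytes / effective_rate


# Uplink is the bottleneck: extra jobs do not help.
slow_link_one_job = estimated_upload_seconds(10_000_000_000, 10_000_000, 50_000_000, 1)
slow_link_four_jobs = estimated_upload_seconds(10_000_000_000, 10_000_000, 50_000_000, 4)

# Per-job rate is the bottleneck: extra jobs do help.
fast_link_one_job = estimated_upload_seconds(10_000_000_000, 100_000_000, 10_000_000, 1)
fast_link_four_jobs = estimated_upload_seconds(10_000_000_000, 100_000_000, 10_000_000, 4)
```

Under these assumptions, the first pair of estimates is identical, while in the second pair four jobs finish in a quarter of the time.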
