Concurrent h5py #3368
Nice to have the generic retry stuff apart from the HDF5 utility! I hope that one day we won't need this any more, but in the (probably very long) meantime it makes it work. A few minor "cosmetic" suggestions (up to you):
Thanks for the review!
Exactly, we make it work like this and work on a better solution afterwards.
Ok, except
Ok.
OK
I added the prefix because you can override the timeout (and the other options) in each individual call, whereas argument names without the "retry_" prefix are rather generic and more likely to collide with the method's own named arguments. For example:

@retry(retry_timeout=10)
def method(*args, timeout=None):
    # "timeout" has nothing to do with retrying
    # I want to override 10 by 1 while the method already has
    # a timeout argument for some other purpose
    ...

method(retry_timeout=1, timeout=5)
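
To make the naming argument concrete, here is a minimal sketch of a decorator that consumes only "retry_"-prefixed keywords and passes everything else through. It is a hypothetical stand-in written for illustration, not the implementation in this PR; the names `retry_period` and `retry_on_error` and the choice of `OSError` as the retried exception are assumptions.

```python
import functools
import time


def retry(retry_timeout=None, retry_period=0.01, retry_on_error=(OSError,)):
    """Hypothetical retry decorator (illustration only, not silx's code).

    The wrapped function is called again whenever it raises one of
    ``retry_on_error``, until ``retry_timeout`` seconds have elapsed.
    """

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, retry_timeout=retry_timeout,
                    retry_period=retry_period, **kwargs):
            # Only the "retry_"-prefixed keywords are consumed here.
            # A plain "timeout" keyword stays in kwargs and reaches func.
            start = time.monotonic()
            while True:
                try:
                    return func(*args, **kwargs)
                except retry_on_error:
                    elapsed = time.monotonic() - start
                    if retry_timeout is not None and elapsed >= retry_timeout:
                        raise
                    time.sleep(retry_period)

        return wrapper

    return decorator


attempts = []


@retry(retry_timeout=10)
def method(value, timeout=None):
    # "timeout" belongs to the method itself, not to the retry logic.
    attempts.append(timeout)
    if len(attempts) < 3:
        raise OSError("transient failure")
    return value


# Override the decorator's 10 s retry timeout with 1 s for this call only.
result = method(42, retry_timeout=1, timeout=5)
```

Without the prefix, the per-call override and the method's own `timeout` would compete for the same keyword, and the wrapper could not tell which one the caller meant.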
OK. That's the
Pretty convenient to use (given the context :)), and a nice test suite.
I was wondering if it would be beneficial to have a redefinition of the silx.io.utils.get_data(url, timeout) function in silx.io.h5py_utils.
This can be done in another PR by someone else.
But from my point of view it could be convenient to have something simple like:

@retry()
def get_data(url):
    return silx_get_data(url)

or something more advanced like:

get_data(url, retry_timeout, retry_invalid...)

which calls open_item if this is an HDF5 file and otherwise calls the original get_data. But maybe this is overkill.
I'm not quite sure a method in
Let's keep that for another PR.
Closes #3354
Tested with h5py 2.8, 2.9, 2.10, 3.1
Stress tests: https://gitlab.esrf.fr/denolf/concurrent_h5py
The main API:
silx.io.h5py_utils.File: replace h5py.File, mainly to handle file locking
silx.io.h5py_utils.retry: retry the method whenever you get an HDF5 IO exception
silx.io.h5py_utils.retry_in_subprocess: retry in a subprocess to capture segfaults
silx.io.h5py_utils.retry_contextmanager: retry entering the context manager
The retry logic can also be used for other things than HDF5 reading:
silx.utils.retry.retry
silx.utils.retry.retry_in_subprocess
silx.utils.retry.retry_contextmanager
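
As a rough illustration of what "retry entering the context manager" can mean outside of HDF5, the sketch below retries only the setup part of a generator-based context manager and leaves errors raised inside the `with` body alone. It uses the standard library only and is an assumption-laden stand-in, not the code from this PR: the keyword names `retry_period` and `retry_on_error`, the use of `OSError`, and the `open_resource` example are all hypothetical.

```python
import time
from contextlib import ExitStack, contextmanager


def retry_contextmanager(retry_timeout=None, retry_period=0.01,
                         retry_on_error=(OSError,)):
    """Hypothetical sketch: retry *entering* a context manager."""

    def decorator(func):
        make_context = contextmanager(func)

        @contextmanager
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            with ExitStack() as stack:
                while True:
                    try:
                        # A fresh context manager is created per attempt.
                        # If entering fails, nothing is registered on the stack.
                        value = stack.enter_context(
                            make_context(*args, **kwargs))
                        break
                    except retry_on_error:
                        elapsed = time.monotonic() - start
                        if (retry_timeout is not None
                                and elapsed >= retry_timeout):
                            raise
                        time.sleep(retry_period)
                # Errors raised by the caller's "with" body surface here,
                # outside the try block, so they are never retried.
                yield value

        return wrapper

    return decorator


entries = []


@retry_contextmanager(retry_timeout=1)
def open_resource(name):
    # Stand-in for opening a file that is locked during the first attempts.
    entries.append(name)
    if len(entries) < 3:
        raise OSError("resource is locked")
    yield name.upper()


with open_resource("scan") as handle:
    opened = handle
```

Retrying only the entry keeps the caller's own processing from being silently re-executed when it fails for reasons unrelated to opening the resource.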
Common use case of silx.io.h5py_utils for processing a Bliss Nexus file: