-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ BUGFIX ] Fabrics connect timeouts #1711
Conversation
bors try |
tryBuild failed: |
182d2f7
to
e0f2b9a
Compare
Allows passing this via env NVMF_FABRICS_CONNECT_TIMEOUT. Also defaults it to 1s for now, rather than 500ms. Signed-off-by: Tiago Castro <[email protected]>
Otherwise a returned negative value translates into an unknown Errno. Signed-off-by: Tiago Castro <[email protected]>
Initially this block_on was added because the remove callback was running in blocking fashion, but this has since changed and unplug is actually called from async context. As such, we don't need the block_on and simply call the async code directly. Also, simplify complete notification, as we can simply close the sender. Signed-off-by: Tiago Castro <[email protected]>
e0f2b9a
to
8893313
Compare
bors try |
tryBuild succeeded: |
Do we know the reason why historically these paths were using blocking on futures to complete? |
Split creating from starting the subsystem. This way we can start the subsystem in master reactor, and then move to the next spdk subsystem. Signed-off-by: Tiago Castro <[email protected]>
the unplug code was originally not async, and suspect that's why block_on was used. The others I don't know, perhaps was just used as a "shotcut"... |
8893313
to
d5b5d44
Compare
Resolves #1710 |
1711: [ BUGFIX ] Fabrics connect timeouts r=tiagolobocastro a=tiagolobocastro Reactor block_on may prevent spdk thread messages from running and therefore this can lead to starvation of messages pulled from the thread ring, which are not polled during the block_on. There are still a few uses remaining, most during init setup, so mostly harmless, though the Nexus Bdev destruction which runs on blocking code, does still contain a block_on. --- fix(nvmf/target): remove usage of block_on Split creating from starting the subsystem. This way we can start the subsystem in master reactor, and then move to the next spdk subsystem. Signed-off-by: Tiago Castro <[email protected]> --- fix(nexus-child/unplug): remove usage of block_on Initially this block_on was added because the remove callback was running in blocking fashion, but this has since changed and unplug is actually called from async context. As such, we don't need the block_on and simply call the async code directly. Also, simplify complete notification, as we can simply close the sender. Signed-off-by: Tiago Castro <[email protected]> --- fix(nvmx/qpair): return errno with absolute value Otherwise a returned negative value translates into an unknown Errno. Signed-off-by: Tiago Castro <[email protected]> --- feat: allow custom fabrics connect timeout Allows passing this via env NVMF_FABRICS_CONNECT_TIMEOUT. Also defaults it to 3s for now, rather than 500ms. Signed-off-by: Tiago Castro <[email protected]> Co-authored-by: Tiago Castro <[email protected]>
oops typo |
Canceled. |
Should this become an unsafe function? Signed-off-by: Tiago Castro <[email protected]>
5c36253
to
d9e19b3
Compare
bors merge |
Build succeeded: |
Reactor block_on may prevent spdk thread messages from running and therefore this can lead to starvation of messages pulled from the thread ring, which are not polled during the block_on.
There are still a few uses remaining, most during init setup, so mostly harmless, though the Nexus Bdev destruction which runs on blocking code, does still contain a block_on.