-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Some commands can take longer in sync execution time than the MRP timeout #19132
Comments
Also, the newly added IM timeout of 2 seconds + MRP bits is probably too low for these too... @mrjerryjohns @erjiaqing |
The caller can still set the timeout for each command they send. The 2s timeout should cover 99% cases. Since only cryptographic bits and some async commands are using longer time. |
Yes, but it's not trivial. And in fact, the 2s change broke commissioning... Anyway, that's a separate thing from what we do about MRP for these commands. |
Although the storage fragmentation bug has been fixed, back then only an MRP of 5s+2s on both client and server could make it reliably past commissioning (changing it on one side had no effect). Also, RemoveFabric was just as slow as AddNOC/UpdateNOC because it has to do a lot of cleanup. |
This is going to somewhat change what's happening on the wire (not at the protocol level, but in terms of actual messages sent), when fixed... |
SDK Review: Given this will make the test harness more resilient to a variety of DUTs, we believe this is appropriate for the SVE branch. |
- UpdateNOC did not clear session of previous fabric like spec intended (project-chip#20379) - All OpCreds cluster slow commands did not try to early-ack to avoid MRP timeouts (project-chip#19132) Fixes project-chip#20379 Fixes project-chip#19132 This PR: - Fixes UpdateNOC expiring all sessions for the updated node - Adds `FlushAcksRightAwayOnSlowCommand` to CommandHandler to flush acks on slow commands - Adds Python tests for UpdateNOC behavior of session expiring - Adds `ExpireSessions` to Python for testing Testing done: - Unit tests all pass - Cert tests pass - With the session clearing, previous Python tests failed, until I fixed them with the new `ExpireSessions` API - Observed standalone acks immediately sent on opcreds cluster commands
* Fix UpdateNOC session invalidation and opcreds timing - UpdateNOC did not clear session of previous fabric like spec intended (#20379) - All OpCreds cluster slow commands did not try to early-ack to avoid MRP timeouts (#19132) Fixes #20379 Fixes #19132 This PR: - Fixes UpdateNOC expiring all sessions for the updated node - Adds `FlushAcksRightAwayOnSlowCommand` to CommandHandler to flush acks on slow commands - Adds Python tests for UpdateNOC behavior of session expiring - Adds `ExpireSessions` to Python for testing Testing done: - Unit tests all pass - Cert tests pass - With the session clearing, previous Python tests failed, until I fixed them with the new `ExpireSessions` API - Observed standalone acks immediately sent on opcreds cluster commands * Restyled * Removed leftover debug
* Fix UpdateNOC session invalidation and opcreds timing - UpdateNOC did not clear session of previous fabric like spec intended (#20379) - All OpCreds cluster slow commands did not try to early-ack to avoid MRP timeouts (#19132) Fixes #20379 Fixes #19132 This PR: - Fixes UpdateNOC expiring all sessions for the updated node - Adds `FlushAcksRightAwayOnSlowCommand` to CommandHandler to flush acks on slow commands - Adds Python tests for UpdateNOC behavior of session expiring - Adds `ExpireSessions` to Python for testing Testing done: - Unit tests all pass - Cert tests pass - With the session clearing, previous Python tests failed, until I fixed them with the new `ExpireSessions` API - Observed standalone acks immediately sent on opcreds cluster commands * Restyled * Removed leftover debug
* Fix UpdateNOC session invalidation and opcreds timing - UpdateNOC did not clear session of previous fabric like spec intended (#20379) - All OpCreds cluster slow commands did not try to early-ack to avoid MRP timeouts (#19132) Fixes #20379 Fixes #19132 This PR: - Fixes UpdateNOC expiring all sessions for the updated node - Adds `FlushAcksRightAwayOnSlowCommand` to CommandHandler to flush acks on slow commands - Adds Python tests for UpdateNOC behavior of session expiring - Adds `ExpireSessions` to Python for testing Testing done: - Unit tests all pass - Cert tests pass - With the session clearing, previous Python tests failed, until I fixed them with the new `ExpireSessions` API - Observed standalone acks immediately sent on opcreds cluster commands * Restyled * Removed leftover debug Co-authored-by: Tennessee Carmel-Veilleux <[email protected]>
Problem
Some commands take longer than the MRP timeout to execute. Their execution then blocks standalone acks from being sent, and the sender keeps resending.
We have run into this with AddNOC/UpdateNOC (due to the crypto bits they have to do).
I believe @bluebin14 has also run into it for some commands that touched storage when storage was being (due to a bug) very slow. But in general, we don't have a good idea of how long sync storage might take.
Proposed Solution
For commands that involve known slow crypto, we should probably just add manual
FlushAcks()
calls.For others, not clear.
The text was updated successfully, but these errors were encountered: