-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix cirque - update int/pointer casting and reset subscription counts in tests. #10771
Conversation
The returned device pointer is actually an int type, which meant that the function never actually waited for a callback and was racing on the device resolve when it attempted to send commands. Test: Saw regular cirque failures after device resolve. After this change, saw a resolve wait for the callback as expected.
The subscription itself causes a callback, which throws off the count between the number of commands and the check. The result is that the check ends, but the change thread continues on and causes weird races with the remaining tests. Resets the count right before the check and adds a thread join.
Fast track rationale: this changes test code, changes are small and explained/understood. |
while handler.subscriptionReceived < 5: | ||
# We should observe 10 attribute changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment seems to mismatch intent. Is there a bug here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably just a bug.
changeThread.start() | ||
with handler.cv: | ||
while handler.subscriptionReceived < 5: | ||
# We should observe 10 attribute changes | ||
handler.cv.wait() | ||
changeThread.join() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you comment on the join reason? Could this wait forever on failed test? Should we have a timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That thread won't fail to join unless the send message call hangs, in which case we're already sunk in this test. The real challenge here is the handler.subscriptionReceived never hitting 5, which will happen if the two threads get out of sync with how many subscriptions they're looking for. But we already have that condition in here, and there's a timer on the tests anyway. I added this because before the thread was continuing along and sending message while other parts of the test were going and it was causing some interesting behaviour. That's rather a different problem (ie, we should add tests for what happens when you fire just a whole lot of requests to the node), but that's orthogonal to this issue.
There was a request to add a timeout to the join in project-chip#10771. The thread itself is unlikely to hang - more likely situation here is the subscription failing to cause updates, causing the main thread to hang forever on the condition variable. Added time outs to both so we can have a bit more information about what actually happened when the test completes.
There was a request to add a timeout to the join in project-chip#10771. The thread itself is unlikely to hang - more likely situation here is the subscription failing to cause updates, causing the main thread to hang forever on the condition variable. Added time outs to both so we can have a bit more information about what actually happened when the test completes.
…1297) * Address request from #10771 There was a request to add a timeout to the join in #10771. The thread itself is unlikely to hang - more likely situation here is the subscription failing to cause updates, causing the main thread to hang forever on the condition variable. Added time outs to both so we can have a bit more information about what actually happened when the test completes. * Restyled by autopep8 Co-authored-by: Restyled.io <[email protected]>
… in test (project-chip#11297) * Address request from project-chip#10771 There was a request to add a timeout to the join in project-chip#10771. The thread itself is unlikely to hang - more likely situation here is the subscription failing to cause updates, causing the main thread to hang forever on the condition variable. Added time outs to both so we can have a bit more information about what actually happened when the test completes. * Restyled by autopep8 Co-authored-by: Restyled.io <[email protected]>
Problem
Two commits
Change overview
Testing