Handling of mTestIndex in chip-tool tests is racy #7493
Comments
We could end up sending a message and getting a response to it before we ever incremented mTestIndex (if our call into NextTest() was on a thread other than the message thread). If that happened, we would end up running some subtest twice, and then later, when we finally did increment mTestIndex, we would end up skipping some subtest. Fixes project-chip#7493
#7478 should fix it, since:
This mutual exclusion ensures safety.
It might be enough to protect
Yes, I agree. I absolutely think we should fix #7478 and then can probably make mTestIndex non-atomic. In the meantime we were definitely getting random failures from the race on mTestIndex specifically...
Yes.
Problem
https://github.com/cecille/connectedhomeip/runs/2785367140?check_suite_focus=true shows a test failure: We read the on/off attribute value after we should have sent an "off" command and get 1, not 0. The log shows this sequence of command executions:
Compare that to a passing run:
My best guess is that what happens is this:
1. RunNextTest happens on the chip-tool main thread. It calls TestSendClusterOnOffCommandReadAttribute_0 and then loses the timeslice before incrementing mTestIndex.
2. The response arrives on the message thread, which calls NextTest(). Since mTestIndex is still 0, this runs the initial read-attribute test again and increments mTestIndex to 1.
3. The next response arrives; the message thread runs TestSendClusterOnOffCommandOn_1 and increments mTestIndex to 2.
4. The next response arrives; the message thread runs TestSendClusterOnOffCommandReadAttribute_2 (the "Check on/off attribute value is true after on command" bit) and increments mTestIndex to 3.
5. The main thread gets its timeslice back and performs the increment of mTestIndex that never happened. mTestIndex is now 4.
6. The next response arrives; the message thread runs TestSendClusterOnOffCommandReadAttribute_4 (which is "Check on/off attribute value is false after off"), but we never did the part of the test that should send the Off command, so we fail.
Proposed Solution
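Before weighing fixes, it may help to replay the interleaving guessed at under Problem. The following is a minimal single-threaded sketch, not chip-tool code: RacyRunner, Dispatch(), Increment() and the shortened subtest names are hypothetical stand-ins, and "Off_3" is an assumed name for the subtest that sends the Off command. It splits the racy NextTest() into its two halves so the preemption point can be placed by hand.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the generated test class; not actual chip-tool code.
constexpr uint16_t kNumSubtests = 5;

struct RacyRunner {
    uint16_t mTestIndex = 0;
    std::vector<std::string> log; // subtests in the order they were sent

    // First half of the racy NextTest(): send the subtest at the current index.
    void Dispatch() {
        // Shortened, illustrative names; "Off_3" is an assumed name.
        static const char * const kNames[kNumSubtests] = {
            "ReadAttribute_0", "On_1", "ReadAttribute_2", "Off_3", "ReadAttribute_4"
        };
        assert(mTestIndex < kNumSubtests);
        log.push_back(kNames[mTestIndex]);
    }

    // Second half: the increment only happens after the send.
    void Increment() { mTestIndex++; }
};

// Replays the six steps of the guessed interleaving in a fixed order.
std::vector<std::string> ReplayRace() {
    RacyRunner r;
    r.Dispatch();                // 1. main thread sends ReadAttribute_0, then is preempted
    r.Dispatch(); r.Increment(); // 2. message thread: index still 0, reruns it; index -> 1
    r.Dispatch(); r.Increment(); // 3. message thread: On_1; index -> 2
    r.Dispatch(); r.Increment(); // 4. message thread: ReadAttribute_2; index -> 3
    r.Increment();               // 5. main thread resumes its pending increment; index -> 4
    r.Dispatch(); r.Increment(); // 6. message thread: ReadAttribute_4; Off_3 never ran
    return r.log;
}
```

In this replay the first subtest is sent twice and the Off subtest is never sent, which matches the failing read of 1 instead of 0.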
#7478 would presumably fix this.
If we want a shorter-term fix, we need to figure out how to ensure that mTestIndex gets incremented before we start sending packets for that test, not after. Probably just using std::atomic_uint16_t would be enough.
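A rough sketch of what that shorter-term fix could look like, assuming the index is claimed with an atomic post-increment before anything is sent. AtomicRunner, RunAllSubtests() and the shortened subtest names are hypothetical and only illustrate the ordering; this is not the actual chip-tool change.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the generated test class; not actual chip-tool code.
constexpr uint16_t kNumSubtests = 5;

struct AtomicRunner {
    std::atomic_uint16_t mTestIndex{ 0 };
    // Illustrative only: a log shared between real threads would need its own protection.
    std::vector<std::string> log;

    void NextTest() {
        // Post-increment on a std::atomic integer is one indivisible fetch-and-add,
        // so the index is already advanced before any packet for this subtest is sent.
        const uint16_t index = mTestIndex++;
        // Shortened, illustrative names; "Off_3" is an assumed name.
        static const char * const kNames[kNumSubtests] = {
            "ReadAttribute_0", "On_1", "ReadAttribute_2", "Off_3", "ReadAttribute_4"
        };
        assert(index < kNumSubtests);
        log.push_back(kNames[index]);
    }
};

// Drives the runner once per subtest, as successive responses would.
std::vector<std::string> RunAllSubtests() {
    AtomicRunner r;
    for (uint16_t i = 0; i < kNumSubtests; ++i) {
        r.NextTest();
    }
    return r.log;
}
```

Because the claim and the increment are a single operation, a thread that loses its timeslice right after sending has no pending increment left to apply later, so no subtest is repeated or skipped on account of mTestIndex. Any other state shared between the threads would still need the mutual exclusion discussed in #7478.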