-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Properly handle crashes/reboots during FabricTable commit #20010
Merged
tcarmelveilleux
merged 14 commits into
project-chip:master
from
tcarmelveilleux:remove-stale-fail-safe
Jun 29, 2022
Merged
Properly handle crashes/reboots during FabricTable commit #20010
tcarmelveilleux
merged 14 commits into
project-chip:master
from
tcarmelveilleux:remove-stale-fail-safe
Jun 29, 2022
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
- Since project-chip#19819, commits are very small and safer. There is less surface to fail during commit. The previous large-scale fail-safe behavior stored too much data, for too long and could cause larger reverts even if nothing was committed yet. FabricTable data no longer is ever persisted without commit. - The existing code deleted fabrics unwittingly when not required, such as when powering off a light during a fail-safe for an update when there was nothing committed yet, assuming we still committed immediately. This change: - Detects failed commits - Only deletes data on failed commits - Fixes Thread driver to detect stale data where a backup was done (since we cannot prevent internal commits from OpenThread) Testing done: - Added unit test to FabricTable to generate the condition - Did manual testing of all-clusters-app/chip-tool Linux that aborted on second commissioning, during commit. Found that cleanup occurred as expected on restart - Integration/cert testing (including Cirque that validates fail-safe) still pass
pullapprove
bot
requested review from
andy31415,
anush-apple,
arkq,
Byungjoo-Lee,
bzbarsky-apple,
carol-apple,
chrisdecenzo,
chshu,
chulspro,
Damian-Nordic,
dhrishi,
electrocucaracha,
erjiaqing,
franck-apple,
gjc13,
harimau-qirex,
harsha-rajendran,
hawk248,
isiu-apple,
jelderton,
jepenven-silabs,
jmartinez-silabs,
jtung-apple,
kghost and
lazarkov
June 27, 2022 19:52
chrisdecenzo
approved these changes
Jun 28, 2022
PR #20010: Size comparison from e80724e to 31ba8a5 Increases (8 builds for cc13x2_26x2, linux)
Decreases (37 builds for cc13x2_26x2, cyw30739, efr32, k32w, linux, mbed, p6, telink)
Full report (37 builds for cc13x2_26x2, cyw30739, efr32, k32w, linux, mbed, p6, telink)
|
src/platform/OpenThread/GenericNetworkCommissioningThreadDriver.cpp
Outdated
Show resolved
Hide resolved
src/platform/OpenThread/GenericNetworkCommissioningThreadDriver.cpp
Outdated
Show resolved
Hide resolved
tcarmelveilleux
force-pushed
the
remove-stale-fail-safe
branch
from
June 28, 2022 14:47
1f87aa6
to
5034a72
Compare
msandstedt
reviewed
Jun 28, 2022
Damian-Nordic
approved these changes
Jun 28, 2022
tehampson
reviewed
Jun 28, 2022
src/platform/OpenThread/GenericNetworkCommissioningThreadDriver.cpp
Outdated
Show resolved
Hide resolved
PR #20010: Size comparison from 6e6892a to 5994347 Increases (37 builds for cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, mbed, p6, telink)
Decreases (25 builds for cc13x2_26x2, cyw30739, esp32, linux, mbed, p6)
Full report (39 builds for cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, mbed, p6, telink)
|
bzbarsky-apple
approved these changes
Jun 28, 2022
PR #20010: Size comparison from 6e6892a to 3608fda Increases above 0.2%:
Increases (37 builds for cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, mbed, p6, telink)
Decreases (25 builds for cc13x2_26x2, cyw30739, esp32, linux, mbed, p6)
Full report (39 builds for cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, mbed, p6, telink)
|
tehampson
approved these changes
Jun 28, 2022
PR #20010: Size comparison from 6e6892a to 39caecc Increases above 0.2%:
Increases (37 builds for cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, mbed, p6, telink)
Decreases (25 builds for cc13x2_26x2, cyw30739, esp32, linux, mbed, p6)
Full report (39 builds for cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, mbed, p6, telink)
|
msandstedt
approved these changes
Jun 28, 2022
PR #20010: Size comparison from 6e6892a to a9eaa45 Increases (10 builds for cyw30739, k32w, linux, p6, telink)
Decreases (12 builds for cyw30739, k32w, linux, mbed, p6, telink)
Full report (14 builds for cyw30739, k32w, linux, mbed, p6, telink)
|
PR #20010: Size comparison from 6e6892a to 5c2b5be Increases above 0.2%:
Increases (45 builds for cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, nrfconnect, p6, telink)
Decreases (45 builds for cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, mbed, nrfconnect, p6, telink)
Full report (49 builds for cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, mbed, nrfconnect, p6, telink)
|
tcarmelveilleux
added a commit
to tcarmelveilleux/connectedhomeip
that referenced
this pull request
Jun 29, 2022
- TestFailSafeContext is now crashy, even though it was just only recently moved in project-chip#20010 and no functional changes were done. - This may be related to other flakiness seen recently on random tests. This PR disables tests that are mostly covered by other test cases and for trivial logic that we won't touch tomorrow. Issue project-chip#20089
tcarmelveilleux
added a commit
to tcarmelveilleux/connectedhomeip
that referenced
this pull request
Jun 29, 2022
- TestFailSafeContext is now crashy, even though it was just only recently moved in project-chip#20010 and no functional changes were done. - This may be related to other flakiness seen recently on random tests. This PR disables tests that are mostly covered by other test cases and for trivial logic that we won't touch tomorrow. Issue project-chip#20089
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Problem
to fail during commit. The previous large-scale fail-safe behavior
stored too much data, for too long and could cause larger reverts
even if nothing was committed yet. FabricTable data no longer is
ever persisted without commit.
such as when powering off a light during a fail-safe for an
update when there was nothing committed yet, assuming we still
committed immediately.
Issue #19935
Change overview
(since we cannot prevent internal commits from OpenThread)
Testing
aborted on second commissioning, during commit. Found that
cleanup occurred as expected on restart
fail-safe) still pass