Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update AddNOC to follow final error handling spec text #18633

Closed
tcarmelveilleux opened this issue May 19, 2022 · 3 comments · Fixed by #19819
Closed

Update AddNOC to follow final error handling spec text #18633

tcarmelveilleux opened this issue May 19, 2022 · 3 comments · Fixed by #19819
Assignees
Labels
cert blocker secure channel spec Mismatch between spec and implementation V1.0

Comments

@tcarmelveilleux
Copy link
Contributor

Problem

In https://github.com/CHIP-Specifications/connectedhomeip-spec/pull/5231, error handling got normalized to follow reality. The updates need to be taken in once landed in spec. This was a Must-Fix 1.0 spec issue

Proposed Solution

  • Rename MissingAdminNode enum value to InvalidAdminSubject
  • Rename CaseAdminNode command field to CaseAdminSubject
  • Remove MissingIpk enum value
@andy31415
Copy link
Contributor

@cjandhyala @tcarmelveilleux @bzbarsky-apple - is this actually a cert-blocker?

The way I read it, these are enum value renames ... do these enum values show up in cert tests? they sound internal to me, however I am not familiar with them.

@woody-apple
Copy link
Contributor

Cert Blocker Review: @cjandhyala is updating the test plan to fully cover the test, which will make this a cert blocker.

@bzbarsky-apple
Copy link
Contributor

@andy31415 These are spec-defined enums; they are not internal, but sent on the wire. And it's not just renames, it's also which is returned when.

tcarmelveilleux added a commit to tcarmelveilleux/connectedhomeip that referenced this issue May 30, 2022
- Add all missing IM-level checks that would lead to
  INVALID_COMMAND error on incorrectly provided inputs
- Normalize logging of errors some more
- Add missing interlocks on fail-safe
- Remove RemoveTrustedRootCertificate command
- Ensure error-handling in AddTrustedRootCertificate
  does not destroy in-progress state on innocuous errors
- Move some IM structural checks before fail-safe checks
  in AddNOC/UpdateNOC

Issue project-chip#18633
Fixes project-chip#15311
Fixes project-chip#17209
tcarmelveilleux added a commit that referenced this issue Jun 1, 2022
* Multiple spec compliance fixes to NOC cluster

- Add all missing IM-level checks that would lead to
  INVALID_COMMAND error on incorrectly provided inputs
- Normalize logging of errors some more
- Add missing interlocks on fail-safe
- Remove RemoveTrustedRootCertificate command
- Ensure error-handling in AddTrustedRootCertificate
  does not destroy in-progress state on innocuous errors
- Move some IM structural checks before fail-safe checks
  in AddNOC/UpdateNOC

Issue #18633
Fixes #15311
Fixes #17209

* ZAP changes

* Restyled by clang-format

* regen zap

* Restyled by clang-format

* Add missing root cert size validation

* Fix CI

* Restyled by clang-format

* Fix CI

* Apply review comments

* ZAP regen

* Restyled by whitespace

* Restyled by clang-format

* Fix a constant move to wrong context

* Add FAILSAFE_REQUIRED and TIMED_REQUEST_MISMATCH

* ZAP regen

* Restyled by whitespace

* Restyled by clang-format

Co-authored-by: Restyled.io <[email protected]>
tcarmelveilleux added a commit to tcarmelveilleux/connectedhomeip that referenced this issue Jun 7, 2022
- Fail-safe did not properly manage the roll-back of operational keys
- Operational key storage being centralized by value in FabricTable
  prevented ability to back keys by hardware/OS and allow the rollback
  of keys on failsafe expiry
- CASE code was using "raw" FabricInfo * which could go stale on UpdateNOC
  or after fail-safe expiry.

This PR:
- Adds an OperationalKeystore interface
- Make the FabricTable use the OperationalKeystore for when
  a commissionable node (with Opcreds cluster) is being commissioned
- Retain legacy controller behavior that allows injection of operational
  keys
- Simplifies the fail-safe handling lifecycle
- Add logging to fail-safe handling
- Add logging to general commissioning cluster
- Make CASE use ScopedNodeId everywhere
- Implement IsForUpdateNOC in fail-safe and opcreds cluster

Fixes project-chip#19072
Issue project-chip#18633
Fixes project-chip#16443
tcarmelveilleux added a commit that referenced this issue Jun 10, 2022
* Properly manage operational key lifecycle for fail-safe

- Fail-safe did not properly manage the roll-back of operational keys
- Operational key storage being centralized by value in FabricTable
  prevented ability to back keys by hardware/OS and allow the rollback
  of keys on failsafe expiry
- CASE code was using "raw" FabricInfo * which could go stale on UpdateNOC
  or after fail-safe expiry.

This PR:
- Adds an OperationalKeystore interface
- Make the FabricTable use the OperationalKeystore for when
  a commissionable node (with Opcreds cluster) is being commissioned
- Retain legacy controller behavior that allows injection of operational
  keys
- Simplifies the fail-safe handling lifecycle
- Add logging to fail-safe handling
- Add logging to general commissioning cluster
- Make CASE use ScopedNodeId everywhere
- Implement IsForUpdateNOC in fail-safe and opcreds cluster

Fixes #19072
Issue #18633
Fixes #16443

* Fix merge of upstream

* Restyled by whitespace

* Restyled by clang-format

* Revert unintended testing changes

* Add remove operation

* Fix CI and add tests to support further tests

* Fix more CI

* Restyled by clang-format

* Darwin changes to use the new setup

* Added unit test and HasOpKeypairForFabric()

* Restyled by clang-format

* Restyled by gn

* Apply review comments from @msandstedt

* Add plumbing for init of controllers

* Restyled by clang-format

* Fix darwin tests

* Fix CI and address review comments

* Fix comment typos

* Apply review comments from @bzbarsky-apple and @tehampson

* Restyled by clang-format

* Fix more comments

* Restyled by clang-format

* Fix CI

* Fix cirque

* Restyled by clang-format

* Update src/crypto/tests/TestPersistentStorageOpKeyStore.cpp

Co-authored-by: tehampson <[email protected]>

* Address review comments

* Fix CI

* More clang-tidy fixes

Co-authored-by: Restyled.io <[email protected]>
Co-authored-by: Boris Zbarsky <[email protected]>
Co-authored-by: Justin Wood <[email protected]>
Co-authored-by: tehampson <[email protected]>
tcarmelveilleux added a commit to tcarmelveilleux/connectedhomeip that referenced this issue Jun 15, 2022
- This PR is an intermediate fully unit-tested step towards replacing
  all NOC/ICAC/RCAC storage from Fabric-Table to allow spec-compliant
  fail-safe implementation
- Right now, we have hacky code in opcreds cluster server and in FabricTable
  to attempt to affect fail-safe properly. It doesn't actually work and
  prevents the TrustedRootCertificates, NOCs and Fabrics attributes from
  being implemented properly, and prevents UpdateNOC from being implemented
  to spec

Issue project-chip#18633
Issue project-chip#17208
Issue project-chip#15585

This PR implements an operational cert storage interface and provides
an implementation based on exact storage from existing FabricTable
for backwards compatibility. Its usage in FabricTable and via
opcreds cluster is a follow-up.

Testing done:
- Added large-scale new code unit test
- Unit tests pass
- Not hooked-up to device software yet, so just the unit tests need to pass
andy31415 pushed a commit that referenced this issue Jun 16, 2022
* Introduce fail-safe compliant Operational Cert storage

- This PR is an intermediate fully unit-tested step towards replacing
  all NOC/ICAC/RCAC storage from Fabric-Table to allow spec-compliant
  fail-safe implementation
- Right now, we have hacky code in opcreds cluster server and in FabricTable
  to attempt to affect fail-safe properly. It doesn't actually work and
  prevents the TrustedRootCertificates, NOCs and Fabrics attributes from
  being implemented properly, and prevents UpdateNOC from being implemented
  to spec

Issue #18633
Issue #17208
Issue #15585

This PR implements an operational cert storage interface and provides
an implementation based on exact storage from existing FabricTable
for backwards compatibility. Its usage in FabricTable and via
opcreds cluster is a follow-up.

Testing done:
- Added large-scale new code unit test
- Unit tests pass
- Not hooked-up to device software yet, so just the unit tests need to pass

* Restyled by clang-format

* Remove 2 unused variables in unit test to fix CI

Co-authored-by: Restyled.io <[email protected]>
tcarmelveilleux added a commit to tcarmelveilleux/connectedhomeip that referenced this issue Jun 21, 2022
- FabricTable management during commissioning did not properly
  handle the fact that committing needs to only be done on
  CommissioningComplete, which prevented the AddTrustedRootCertificate,
  UpdateNOC and AddNOC command semantics to be implemented properly
  and prevented proper state observation of operational credential
  clusters server

Fixes project-chip#7695
Issue project-chip#8905
Fixes project-chip#18633
Issue project-chip#17208
Fixes project-chip#15585

This PR:

- Removes direct access to FabricInfo from everywhere, which caused
  possibly stale FabricInfo references during commissioning.
- Remove immediate committing of fabric table on UpdateNOC.
- Make Fabrics, NOCs and TrustedRootCertificates attributes reflect
  proper partial state during fail-safe, by using the shadow data
  capabilities of OperationalCertificateStore and by updates to
  FabricInfo
- Make it possible to unit test fabric table by providing the necessary
  lifecycle public APIs to test every modality of the commissioning flow
- Make Server and DeviceController use OperationalCertificateStore to
  allow proper external lifecycle management of the operational cert
  chain.
- Update all examples/controller code to new API
- Remove dangerous internal APIs from FabricTable and replace with
  direct accessors where needed
- Add more of the necessary spec validations to the UpdateNOC and AddNOC
  flows

Testing done:
- Updated all unit tests, all pass
- Cert tests still pass as before
- Working on further integration tests and unit tests as a follow-up
  noting that current state has not regressed on existing test coverage,
  and that new usage of OperationalCertificateStore class in FabricTable
  gains a large amount of additional coverage transitively via some
  of the existing tests making use of FabricTable.
tcarmelveilleux added a commit that referenced this issue Jun 24, 2022
* Implement shadow fail-safe data in FabricTable

- FabricTable management during commissioning did not properly
  handle the fact that committing needs to only be done on
  CommissioningComplete, which prevented the AddTrustedRootCertificate,
  UpdateNOC and AddNOC command semantics to be implemented properly
  and prevented proper state observation of operational credential
  clusters server

Fixes #7695
Issue #8905
Fixes #18633
Issue #17208
Fixes #15585

This PR:

- Removes direct access to FabricInfo from everywhere, which caused
  possibly stale FabricInfo references during commissioning.
- Remove immediate committing of fabric table on UpdateNOC.
- Make Fabrics, NOCs and TrustedRootCertificates attributes reflect
  proper partial state during fail-safe, by using the shadow data
  capabilities of OperationalCertificateStore and by updates to
  FabricInfo
- Make it possible to unit test fabric table by providing the necessary
  lifecycle public APIs to test every modality of the commissioning flow
- Make Server and DeviceController use OperationalCertificateStore to
  allow proper external lifecycle management of the operational cert
  chain.
- Update all examples/controller code to new API
- Remove dangerous internal APIs from FabricTable and replace with
  direct accessors where needed
- Add more of the necessary spec validations to the UpdateNOC and AddNOC
  flows

Testing done:
- Updated all unit tests, all pass
- Cert tests still pass as before
- Working on further integration tests and unit tests as a follow-up
  noting that current state has not regressed on existing test coverage,
  and that new usage of OperationalCertificateStore class in FabricTable
  gains a large amount of additional coverage transitively via some
  of the existing tests making use of FabricTable.

* Restyled by clang-format

* Fixes after merging master

* Rename callback of FabricTable::Delegate

- Remove needless ones
- Make the callbacks do nothing by default to avoid more
  "Intentionally left blank"
- Renamed to actually reflect what is happening

* Rekick restyle

* restyle

* Align last known good time commit / revert to shadow fail-safe approach

Fabric commit / revert is being refactored to only persist fabric
data when we actually commit on  receipt of CommissioningComplete.
This reworks Last known Good Time to use the same strategy.

Now, instead of persisting both last known good time and a fail-safe
backup at the time of certificate installation, a single, updated
last known good time is stored in RAM at the time of certificate
installation.  Then, on commit, this is persisted, but never before.

* Fix CI

* Fix TV app

* Reduce flash usage on K32W0 by disabling detail (verbose) logs

* Disable progress logging in TI platform lock example to save flash

* Disable progress logging in TI platform shell example to save flash

* Reduce stack usage of TestEventLogging

* Apply review comments from @msandstedt

* Save stack for Nordic build

* Save stack for Nordic build, some more

* Added unit tests for all basic operations

* Fix merge conflict

* Restyle

* Change nrf connect stack limit up by 512 bytes for fake unit test comparisons

* Revert "Save stack for Nordic build"

This reverts commit 52699c4.

* Revert "Save stack for Nordic build, some more"

This reverts commit e62391e.

* Fix UpdateLabel

* Apply review comments, add ACL fabric removal

* Apply review comments

* per bzbarsky, clear in-memory Last Known Good Time if revert fails

* Apply fixes from @bzbarsky-apple

* Apply more review comments

* Restyled

* Fix tests

* Add unit test for iterator

* Iterator now works

* Fix semantic merge conflict

* Restyled

* Fix predicates for storage presence temporarily

issue #16958

* Fix Darwin build

* Add missing docs, move methods to private that shouldn't be public

* Change stack warning temporarily to pass on nRFConnect

* Make MatterControllerFactory use const FabricInfo

* Restyle

* Fix semantic conflict on SessionManager

* Fix an init ordering issue in TestSessionManager.cpp

* Fix SessionManager shutdown

* Restyled

Co-authored-by: Restyled.io <[email protected]>
Co-authored-by: Michael Sandstedt <[email protected]>
Co-authored-by: Boris Zbarsky <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cert blocker secure channel spec Mismatch between spec and implementation V1.0
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants