Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable CASE session establishment #7666

Merged
merged 12 commits into from
Jun 23, 2021

Conversation

pan-apple
Copy link
Contributor

@pan-apple pan-apple commented Jun 15, 2021

Problem

We are currently using secure sessions established via PASE for non commissioning cluster messages. This needs to be changed to CASE sessions.

Change overview

  1. Update CASESession code to use array of certs as operational certificates.
  2. Trigger CASE session establishment after the device has been discovered on operational network.
  3. Move session key ID assignment to a common class (as session ID space is shared by PASE and CASE sessions).
  4. Expire the secure session established via PASE while triggering the CASE sessions.
  5. Use secure sessions established via CASE for further messaging.

Testing

Tested flow using Python controller, chip-tool, and iOS CHIP Tool applications.

@todo
Copy link

todo bot commented Jun 15, 2021

- Match the generated cert against CSR and operational keypair

// TODO - Match the generated cert against CSR and operational keypair
// Make sure it chains back to the trusted root.
ChipLogProgress(Controller, "Generating operational certificate for the controller");
chipCertLen = 0;
ReturnErrorOnFailure(GenerateOperationalCertificates(ByteSpan(CSR.Get(), csrLength), mLocalDeviceId, chipCert.Get(),
chipCertAllocatedLen, chipCertLen));
ReturnErrorOnFailure(admin->SetOperationalCert(ByteSpan(chipCert.Get(), chipCertLen)));
}
ReturnErrorOnFailure(mAdmins.Store(admin->GetAdminId()));
}


This comment was generated by todo based on a TODO comment in 3b7cd86 in #7666. cc @pan-apple.

@todo
Copy link

todo bot commented Jun 15, 2021

- Update SessionID allocator to use freed session IDs

// TODO - Update SessionID allocator to use freed session IDs
mNextAvailable++;
return CHIP_NO_ERROR;
}
CHIP_ERROR SessionIDAllocator::Free(uint16_t id)
{
if (mNextAvailable == id && mNextAvailable > 0)
{
mNextAvailable--;


This comment was generated by todo based on a TODO comment in 3b7cd86 in #7666. cc @pan-apple.

@woody-apple
Copy link
Contributor

/rebase


for (uint32_t i = 0; i < kIterationCount && device->IsSessionSetupInProgress(); i++)
{
nanosleep(&sleep_time, nullptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we instead have some sort of notification from the device of 'session done'?

May not care much for test tools, however I imagine most other commisioners want to avoid sleeps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a TODO for this. This is big enough change to handle in a separate PR.

Copy link
Contributor

@mspang mspang Jun 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is missing synchronization here. We can show this with the following argument:

  1. Calling IsSessionSetupInProgress loads shared controller & protocol state
  2. The loop performs no stores that could change the state
  3. The loop synchronizes with no other thread that could change the state

Assuming those propositions, one of the following is true:

  1. Every iteration loads the same state (i.e., IsSessionSetupInProgress remains true)
  2. This code relies on a data race

And since data races are specified as undefined behavior in C++, all bets are off as to whether the resulting program does anything useful.

One correct implementation of waiting on a 2nd thread to do something is to synchronize access to the state with a mutex and use a condition variable for the "wait". Not to sleep; sleeping is not a useful primitive for waiting on a 2nd thread because it does not avoid the race that arises.

Fixing errors like this is not something that should be left as a TODO.

examples/chip-tool/commands/clusters/ModelCommand.cpp Outdated Show resolved Hide resolved
examples/chip-tool/commands/clusters/ModelCommand.cpp Outdated Show resolved Hide resolved
src/app/server/RendezvousServer.cpp Outdated Show resolved Hide resolved
src/app/server/RendezvousServer.h Show resolved Hide resolved
src/controller/CHIPDevice.cpp Show resolved Hide resolved
src/controller/CHIPDeviceController.cpp Outdated Show resolved Hide resolved
src/controller/CHIPDeviceController.cpp Outdated Show resolved Hide resolved
src/protocols/secure_channel/SessionIDAllocator.cpp Outdated Show resolved Hide resolved
src/protocols/secure_channel/CASEServer.cpp Show resolved Hide resolved
src/app/server/RendezvousServer.cpp Outdated Show resolved Hide resolved
src/controller/CHIPDevice.cpp Show resolved Hide resolved
@todo
Copy link

todo bot commented Jun 16, 2021

- Implement notification from device object when the secure session is available.

// TODO - Implement notification from device object when the secure session is available.
// Current code is polling the device object to check for secure session availability. This should
// be updated to a notification/callback mechanism.
err = WaitForSessionSetup(mDevice);
VerifyOrExit(err == CHIP_NO_ERROR,
ChipLogError(chipTool, "Timed out while waiting for session setup for device: %" PRIu64, localId));
err = SendCommand(mDevice, mEndPointId);
VerifyOrExit(err == CHIP_NO_ERROR, ChipLogError(chipTool, "Failed to send message: %s", ErrorStr(err)));
}


This comment was generated by todo based on a TODO comment in 95aba4d in #7666. cc @pan-apple.

@todo
Copy link

todo bot commented Jun 16, 2021

- Check if ID is already allocated in SessionIDAllocator::Reserve()

// TODO - Check if ID is already allocated in SessionIDAllocator::Reserve()
return CHIP_NO_ERROR;
}
uint16_t SessionIDAllocator::Peek()
{
return mNextAvailable;
}
} // namespace chip


This comment was generated by todo based on a TODO comment in 95aba4d in #7666. cc @pan-apple.

@pan-apple pan-apple requested a review from andy31415 June 16, 2021 21:24
@woody-apple
Copy link
Contributor

/rebase

@pan-apple pan-apple force-pushed the enable-case-session branch from 91a81b7 to 5373ba5 Compare June 22, 2021 16:21
@github-actions
Copy link

Size increase report for "gn_qpg6100-example-build" from 59e4a16

File Section File VM
chip-qpg6100-lighting-example.out .text 532 532
chip-qpg6100-lighting-example.out .bss 0 8
chip-qpg6100-lighting-example.out .heap 0 -8
Full report output
BLOAT REPORT

Files found only in the build output:
    report.csv

Comparing ./master_artifact/chip-qpg6100-lighting-example.out and ./pull_artifact/chip-qpg6100-lighting-example.out:

sections,vmsize,filesize
.debug_info,0,24911
.debug_line,0,2798
.debug_loc,0,1334
.debug_abbrev,0,987
.debug_str,0,954
.text,532,532
.debug_ranges,0,440
.strtab,0,396
.symtab,0,288
.debug_frame,0,280
.debug_aranges,0,96
.bss,8,0
.heap,-8,0
[Unmapped],0,-532

Comparing ./master_artifact/chip-qpg6100-lighting-example.out.map and ./pull_artifact/chip-qpg6100-lighting-example.out.map:

BLOAT EXECUTION FAILED WITH CODE 1:
bloaty: unknown file type for file './pull_artifact/chip-qpg6100-lighting-example.out.map'


@pan-apple pan-apple requested a review from mspang June 22, 2021 17:01
@github-actions
Copy link

Size increase report for "esp32-example-build" from 59e4a16

File Section File VM
chip-all-clusters-app.elf .flash.text 560 560
chip-all-clusters-app.elf .flash.rodata 64 64
chip-all-clusters-app.elf .dram0.bss 0 16
Full report output
BLOAT REPORT

Files found only in the build output:
    report.csv

Comparing ./master_artifact/chip-all-clusters-app.elf and ./pull_artifact/chip-all-clusters-app.elf:

sections,vmsize,filesize
.debug_info,0,29219
.debug_line,0,4083
.debug_abbrev,0,1263
.debug_str,0,953
.debug_loc,0,742
.flash.text,560,560
.shstrtab,0,462
.debug_ranges,0,352
.strtab,0,330
.debug_frame,0,232
.symtab,0,176
.xt.prop._ZN4chip9Transport15PeerConnectionsILj16ELNS_4Time6SourceE0EE23FindPeerConnectionStateEyPNS0_19PeerConnectionStateE,0,148
.debug_aranges,0,88
.xt.prop._ZN4chip9Transport19PeerConnectionStateC2EONS0_11PeerAddressE,0,76
.flash.rodata,64,64
.xt.lit._ZN4chip9Transport15PeerConnectionsILj16ELNS_4Time6SourceE0EE23FindPeerConnectionStateEyPNS0_19PeerConnectionStateE,0,48
.xt.lit._ZN4chip16SecureSessionMgr16IsControlMessageERNS_13PayloadHeaderE,0,40
.xt.prop._ZN4chip16SecureSessionMgr16IsControlMessageERNS_13PayloadHeaderE,0,40
.dram0.bss,16,0
[Unmapped],0,-64

Comparing ./master_artifact/chip-pigweed-app.elf and ./pull_artifact/chip-pigweed-app.elf:

sections,vmsize,filesize


@github-actions
Copy link

Size increase report for "nrfconnect-example-build" from 59e4a16

File Section File VM
chip-lock.elf text 456 456
chip-lock.elf rodata 64 60
chip-lock.elf [LOAD #3 [RW]] 0 30
chip-lock.elf device_handles 8 8
chip-lock.elf bss 0 2
chip-shell.elf text 4 4
chip-shell.elf device_handles -4 -4
Full report output
BLOAT REPORT

Files found only in the build output:
    report.csv

Comparing ./master_artifact/chip-lock.elf and ./pull_artifact/chip-lock.elf:

sections,vmsize,filesize
.debug_info,0,25267
.debug_line,0,2566
.debug_loc,0,1257
.debug_abbrev,0,1011
.debug_str,0,915
text,456,456
.strtab,0,396
.debug_ranges,0,360
.debug_frame,0,280
.symtab,0,240
.debug_aranges,0,96
rodata,60,64
[LOAD #3 [RW]],30,0
device_handles,8,8
bss,2,0

Comparing ./master_artifact/chip-shell.elf and ./pull_artifact/chip-shell.elf:

sections,vmsize,filesize
.debug_info,0,1930
.debug_loc,0,968
.debug_str,0,456
.debug_line,0,408
.debug_ranges,0,296
.debug_frame,0,140
.strtab,0,124
.symtab,0,64
.debug_aranges,0,32
.debug_abbrev,0,18
text,4,4
device_handles,-4,-4


@pan-apple
Copy link
Contributor Author

@mspang do you have any further comments?

Copy link
Contributor

@mspang mspang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mspang do you have any further comments?

I do but nothing major, they could be fixed in a followup.

{
ModelCommand * command = reinterpret_cast<ModelCommand *>(context);
VerifyOrReturn(command != nullptr,
ChipLogError(chipTool, "Device connected, but cannot send the command, as the context is null"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this happen (how)?

If not suggest VerifyOrDie(command != nullptr) or just leave it off since it's obvious anyway due to the call below.


CHIP_ERROR Device::EstablishConnectivity(Callback::Callback<OnDeviceConnected> * onConnection,
Callback::Callback<OnDeviceConnectionFailure> * onFailure)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we:

VerifyOrDie(!onConnection->IsRegistered());
VerifyOrDie(!onFailure->IsRegistered());

and document that the callback must not be registered?

(taking it as a Cancelable accomplishes similar but currently trades away the type safety which doesn't seem like a good tradeoff)

{
ModelCommand * command = reinterpret_cast<ModelCommand *>(context);
ChipLogError(chipTool, "Failed in connecting to the device %" PRIu64 ". Error %d", deviceId, error);
VerifyOrReturn(command != nullptr, ChipLogError(chipTool, "ModelCommand context is null"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VerifyOrDie(command != nullptr) here too.

@@ -475,35 +492,118 @@ bool Device::GetAddress(Inet::IPAddress & addr, uint16_t & port) const
return true;
}

CHIP_ERROR Device::EstablishCASESession()
void Device::OperationalCertProvisioned()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Suggest OnOperationalCertProvisioned as I think this reads a bit easier and we use that convention elsewhere.

* The function updates the state of device proxy object such that all subsequent messages
* will use secure session established via CASE handshake.
*/
CHIP_ERROR OperationalDiscoveryComplete(NodeId remoteDeviceId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Suggest OnOperationalDiscoveryComplete

void DeviceCommissioner::OnDeviceConnectedFn(void * context, Device * device)
{
DeviceCommissioner * commissioner = reinterpret_cast<DeviceCommissioner *>(context);
VerifyOrReturn(commissioner != nullptr, ChipLogProgress(Controller, "Device connected callback with null context. Ignoring"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can this happen? If it can't please use VerifyOrDie.

{
DeviceCommissioner * commissioner = reinterpret_cast<DeviceCommissioner *>(context);
ChipLogProgress(Controller, "Device connection failed. Error %s", ErrorStr(error));
VerifyOrReturn(commissioner != nullptr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can this happen? If it can't please use VerifyOrDie.

/**
* This function finds the device corresponding to deviceId, and establishes a secure connection with it.
* Once the connection is successfully establishes (or if it's already connected), it calls `onConnectedDevice`
* callback. If it fails to establish the connection, it calls `onError` callback.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.. These callbacks must not be already registered with a source (i.e., if they were used previously, they must be canceled).

Callback::Callback<OnDeviceConnectionFailure> * onFailure)
{
CHIP_ERROR err = CHIP_NO_ERROR;
Device * device = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VerifyOrDie(!onConnection->IsRegistered());
VerifyOrDie(!onFailure->IsRegistered());

namespace chip {

class SessionIDAllocator
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document.

@pan-apple
Copy link
Contributor Author

@mspang do you have any further comments?

I do but nothing major, they could be fixed in a followup.

Thanks @mspang, I'll address these in the follow up PR.

@pan-apple pan-apple deleted the enable-case-session branch June 23, 2021 14:56
nikita-s-wrk pushed a commit to nikita-s-wrk/connectedhomeip that referenced this pull request Sep 23, 2021
* Enable CASE session establishment

* persist counter if device is connected

* enable CASE sessions for chip-tool

* address review comments

* add tests for SessionIDAllocator

* address review comments

* address review comments

* add ReserveUpTo

* release resouces on cleanup, and reduce message timeout

* release resources on cleanup and init.

* Remove sleep from ModelCommand

* some cleanup
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants