CASESession needs to close its exchange when done #7070

bzbarsky-apple · 2021-05-24T22:35:56Z

Problem

Steps to reproduce:

Compile all-clusters-app for m5stack in BLE pairing mode
Edit PairingCommand::UpdateNetworkAddress to set mFabricId = 0; up front to work around the device advertising on the wrong fabric for now.
Comment out the ec->Close() call in ExchangeManager::OnConnectionExpired to work around #7012 ("ExchangeManager::OnConnectionExpired is incorrectly closing exchanges, can cause lost messages or crashes").
Apply the changes from #7041 ("[controller] Use short lived CommandSender when no command sender is provided").
Run ./gn_build.sh to compile the command-line chip-tool.
Run ./out/debug/standalone/chip-tool pairing SSID PASSWORD 112233 12345678 3840
Wait for that to complete.
Run ./out/debug/standalone/chip-tool onoff toggle 1

(Can probably reproduce this in bypass mode too, honestly, as long as that does CASE.)

chip-tool crashes. This happens because the CASESession holds on to its exchange until its destructor runs, so the exchange outlives the shutdown and destruction of various globals such as the SessionManager. The assert in ExchangeManager::Shutdown that would catch this never gets hit, because that shutdown call is commented out in DeviceController::Shutdown. When we finally go to Close the exchange, it tries to CancelResponseTimer, which does mExchangeMgr->GetSessionMgr()->SystemLayer(). The SessionManager is already dead at that point, so we get a garbage pointer and crash when we dereference it.
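The ordering problem described above can be modeled in a few lines. This is only a sketch: ToySystemLayer, ToySessionManager, ToyExchange, and CloseSucceeds are hypothetical stand-ins, not the real CHIP classes, and the dead SessionManager is modeled with a null pointer so that the late access is detectable instead of being undefined behavior.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not the real CHIP classes.
struct ToySystemLayer {
    int cancelledTimers = 0;
    void CancelTimer() { ++cancelledTimers; }
};

class ToySessionManager {
public:
    explicit ToySessionManager(ToySystemLayer & layer) : mLayer(&layer) {}

    // In the real crash the object is destroyed and the pointer is garbage.
    // Here we null it so the late access can be observed safely.
    void Shutdown() { mLayer = nullptr; }

    ToySystemLayer * SystemLayer() const { return mLayer; }

private:
    ToySystemLayer * mLayer;
};

class ToyExchange {
public:
    explicit ToyExchange(ToySessionManager & mgr) : mMgr(mgr) {}

    // Models Close() -> CancelResponseTimer() -> SystemLayer().
    // Returns false when the session manager is already shut down,
    // which corresponds to the crash case in the report.
    bool Close() {
        ToySystemLayer * layer = mMgr.SystemLayer();
        if (layer == nullptr) {
            return false;
        }
        layer->CancelTimer();
        return true;
    }

private:
    ToySessionManager & mMgr;
};

// Reports whether Close() can reach a live system layer, depending on
// whether the exchange is closed before or after the manager shuts down.
bool CloseSucceeds(bool closeBeforeShutdown) {
    ToySystemLayer layer;
    ToySessionManager mgr(layer);
    ToyExchange exchange(mgr);
    if (closeBeforeShutdown) {
        bool ok = exchange.Close();
        mgr.Shutdown();
        return ok;
    }
    mgr.Shutdown();
    return exchange.Close();
}
```

In this model, CloseSucceeds(true) corresponds to closing the exchange while the SessionManager is still alive, and CloseSucceeds(false) corresponds to the reported order, where Close runs only from a destructor after shutdown.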

Proposed Solution

Once #7054 lands, we should:

Close the exchange in CASESession once we are done with it.
Uncomment that Shutdown call, assuming that passes CI.
See whether we can add some sort of CI for this. For example, we could run the "linux" version of all-clusters-app like Darwin CI does it and then maybe we can just run command-line chip-tool and ensure it does not crash? @vivien-apple how hard is this part to set up?
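The first two items can be illustrated with a small sketch: a session that closes its exchange as soon as the handshake is done leaves nothing open for a shutdown-time check to complain about. ToyExchangeManager, ToySession, OnHandshakeDone, and ShutdownIsCleanAfterHandshake are hypothetical names for illustration, and the counter is only an assumption about how such a check might be modeled, not the actual ExchangeManager::Shutdown implementation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for illustration only; not the real CHIP classes.
class ToyExchangeManager {
public:
    void Opened() { ++mOpen; }
    void Closed() {
        if (mOpen > 0) {
            --mOpen;
        }
    }

    // Models a shutdown-time check: true only if no exchange is still open.
    bool CanShutdownCleanly() const { return mOpen == 0; }

private:
    std::size_t mOpen = 0;
};

class ToySession {
public:
    ToySession(ToyExchangeManager & mgr, bool closeWhenDone)
        : mMgr(mgr), mCloseWhenDone(closeWhenDone) {
        mMgr.Opened();
        mHasExchange = true;
    }

    // With closeWhenDone set, the exchange is released as soon as the
    // handshake finishes instead of lingering until the destructor.
    void OnHandshakeDone() {
        if (mCloseWhenDone) {
            CloseExchange();
        }
    }

    // Fallback cleanup; a no-op if the exchange was already closed.
    ~ToySession() { CloseExchange(); }

private:
    void CloseExchange() {
        if (mHasExchange) {
            mHasExchange = false;
            mMgr.Closed();
        }
    }

    ToyExchangeManager & mMgr;
    bool mCloseWhenDone;
    bool mHasExchange = false;
};

// Reports whether the manager could shut down cleanly while the session
// object is still alive but its handshake has completed.
bool ShutdownIsCleanAfterHandshake(bool closeWhenDone) {
    ToyExchangeManager mgr;
    ToySession session(mgr, closeWhenDone);
    session.OnHandshakeDone();
    return mgr.CanShutdownCleanly();
}
```

With closeWhenDone set to false, the check fails while the session is still alive, which is the situation an assert at shutdown would be expected to flag once the Shutdown call is re-enabled.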


bzbarsky-apple assigned pan-apple May 24, 2021

pan-apple mentioned this issue May 25, 2021

Cleanup CASE state machine for error handling and logging #7103

Merged

bzbarsky-apple mentioned this issue Jun 1, 2021

Move serialization of chip::Device earlier. #7218

Merged

bzbarsky-apple closed this as completed in #7103 Jun 2, 2021
