Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CASESession needs to close its exchange when done #7070

Closed
bzbarsky-apple opened this issue May 24, 2021 · 0 comments · Fixed by #7103
Closed

CASESession needs to close its exchange when done #7070

bzbarsky-apple opened this issue May 24, 2021 · 0 comments · Fixed by #7103
Assignees

Comments

@bzbarsky-apple
Copy link
Contributor

Problem

Steps to reproduce:

  1. Compile all-clusters-app for m5stack in BLE pairing mode
  2. Edit PairingCommand::UpdateNetworkAddress to set mFabricId = 0; up front to work around the device advertising on the wrong fabric for now.
  3. Comment out the ec->Close() call in ExchangeManager::OnConnectionExpired to work around ExchangeManager::OnConnectionExpired is incorrectly closing exchanges, can cause lost messages or crashes #7012
  4. Apply the changes from [controller] Use short lived CommandSender when no command sender is provided #7041
  5. ./gn_build.sh to compile command-line chip-tool.
  6. Run ./out/debug/standalone/chip-tool pairing SSID PASSWORD 112233 12345678 3840
  7. Wait for that to complete.
  8. Run ./out/ebug/standalone/chip-tool onoff toggle 1

(Can probably reproduce this in bypass mode too, honestly, as long as that does CASE.)

chip-tool crashes. This happens because the CASESession holds on to an exchange until its destructor, which causes it to outlive shutdown and destruction of various globals like the SessionManager. The assert about this in ExchangeManager::Shutdown never gets hit because that shutdown call is commented out in DeviceController::Shutdown. Then when we finally go to Close it, it tries to CancelResponseTimer, which does mExchangeMgr->GetSessionMgr()->SystemLayer() but the SessionManager is dead so we get a garbage pointer and crash when we dereference it.

Proposed Solution

Once #7054 lands, we should:

  1. Close the exchange in CASESession once we are done with it.
  2. Uncomment that Shutdown call, assuming that passes CI.
  3. See whether we can add some sort of CI for this. For example, we could run the "linux" version of all-clusters-app like Darwin CI does it and then maybe we can just run command-line chip-tool and ensure it does not crash? @vivien-apple how hard is this part to set up?
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants