Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ExchangeContext dispatch leak for non-application exchanges #9380

Closed
msandstedt opened this issue Aug 31, 2021 · 7 comments
Closed

ExchangeContext dispatch leak for non-application exchanges #9380

msandstedt opened this issue Aug 31, 2021 · 7 comments
Assignees
Labels
leak Memory leak bug V1.0

Comments

@msandstedt
Copy link
Contributor

Problem

ExchangeContext::ExchangeContext() -- mDispatch is derived from mDelegate with mDelegate being set via an input argument. When exchanges are destroyed, the ExchangeContext's mDelegate is set to nullptr, but mDispatch is untouched. This will result in a crash since in all likelihood, when mDelegate is set to nullptr, it can be assumed the derived mDispatch is also destroyed as a result, but not updated.

Proposed Solution

The following workaround is effective, but the real solution is probably some generally improved lifecycle management in and around the ExchangeContext object.

diff --git a/src/messaging/ExchangeContext.cpp b/src/messaging/ExchangeContext.cpp
index b5469d26e..76cbd57ef 100644
--- a/src/messaging/ExchangeContext.cpp
+++ b/src/messaging/ExchangeContext.cpp
@@ -247,11 +247,13 @@ ExchangeContext::ExchangeContext(ExchangeManager * em, uint16_t ExchangeId, Sess
         if (dispatch == nullptr)
         {
             dispatch = &em->mDefaultExchangeDispatch;
+            mDelegateDefaultedToApplicationExchangeDispatch = true;
         }
     }
     else
     {
         dispatch = &em->mDefaultExchangeDispatch;
+        mDelegateDefaultedToApplicationExchangeDispatch = true;
     }
     VerifyOrDie(dispatch != nullptr);
     mDispatch = dispatch->Retain();
@@ -269,6 +271,20 @@ ExchangeContext::ExchangeContext(ExchangeManager * em, uint16_t ExchangeId, Sess
 
 ExchangeContext::~ExchangeContext()
 {
+    if (mDelegate == nullptr && !mDelegateDefaultedToApplicationExchangeDispatch)
+    {
+        // in ExchangeContext::ExchangeContext() -- mDispatch is derived from mDelegate with mDelegate being set
+        // via an input argument. When exchanges are destroyed, the ExchangeContext's mDelegate is set to nullptr,
+        // but mDispatch is untouched, this will result in a crash since in all likelihood, when mDelegate is set
+        // to nullptr, it can be assumed the derived mDispatch is also destroyed as a result, but not updated.
+        ChipLogProgress(ExchangeManager, "~ExchangeContext: %u, mDelegate is null for non Application exchanges. Setting dispatch to null", mExchangeId);
+        mDispatch = nullptr;
+    }
     VerifyOrDie(mExchangeMgr != nullptr && GetReferenceCount() == 0);
     VerifyOrDie(!IsAckPending());
 
@@ -389,6 +405,20 @@ CHIP_ERROR ExchangeContext::HandleMessage(const PacketHeader & packetHeader, con
                                           const Transport::PeerAddress & peerAddress, MessageFlags msgFlags,
                                           PacketBufferHandle && msgBuf)
 {
+    if (mDelegate == nullptr && !mDelegateDefaultedToApplicationExchangeDispatch)
+    {
+        // in ExchangeContext::ExchangeContext() -- mDispatch is derived from mDelegate with mDelegate being set
+        // via an input argument. When exchanges are destroyed, the ExchangeContext's mDelegate is set to nullptr,
+        // but mDispatch is untouched, this will result in a crash since in all likelihood, when mDelegate is set
+        // to nullptr, it can be assumed the derived mDispatch is also destroyed as a result, but not updated.
+
+        ChipLogProgress(ExchangeManager, "HandleMessage: %u, mDelegate is null for non Application exchanges. Setting dispatch to null", mExchangeId);
+        mDispatch = nullptr;
+    }
     // We hold a reference to the ExchangeContext here to
     // guard against Close() calls(decrementing the reference
     // count) by the protocol before the CHIP Exchange
@@ -427,8 +457,11 @@ CHIP_ERROR ExchangeContext::HandleMessage(const PacketHeader & packetHeader, con
         MessageHandled();
     });
 
-    ReturnErrorOnFailure(mDispatch->OnMessageReceived(packetHeader.GetFlags(), payloadHeader, packetHeader.GetMessageId(),
-                                                      peerAddress, msgFlags, GetReliableMessageContext()));
+    if (mDispatch != nullptr)
+    {
+        ReturnErrorOnFailure(mDispatch->OnMessageReceived(packetHeader.GetFlags(), payloadHeader, packetHeader.GetMessageId(),
+                                                          peerAddress, msgFlags, GetReliableMessageContext()));
+    }
 
     if (IsAckPending() && !mDelegate)
     {
diff --git a/src/messaging/ExchangeContext.h b/src/messaging/ExchangeContext.h
index a70b2bfcb..71b7e3909 100644
--- a/src/messaging/ExchangeContext.h
+++ b/src/messaging/ExchangeContext.h
@@ -184,6 +184,7 @@ public:
 private:
     Timeout mResponseTimeout       = 0; // Maximum time to wait for response (in milliseconds); 0 disables response timeout.
     ExchangeDelegate * mDelegate   = nullptr;
+    bool mDelegateDefaultedToApplicationExchangeDispatch = false;
     ExchangeManager * mExchangeMgr = nullptr;
     ExchangeACL * mExchangeACL     = nullptr;

This was most recently tested when applied to this rev:

commit ab665728d96a946ef3d6e2715784b7dbc0f6a3eb
Author: Andrei Litvin <[email protected]>
Date:   Sat Aug 21 10:49:52 2021 -0400
@msandstedt
Copy link
Contributor Author

@bzbarsky-apple
Copy link
Contributor

it can be assumed the derived mDispatch is also destroyed as a result

I think the intent is that the dispatch objects have some sort of "CHIP stack" lifetime, pretty much. That said, the SessionEstablishmentExchangeDispatch instances we have don't do that...

We need to either move to a setup where dispatches are in fact singletons with "CHIP stack" lifetime or we need to figure out how to clean them up properly.

That said, the quoted changes above aren't quite right either. Nulling things out in ~ExchangeContext doesn't make sense, and nulling things out in ExchangeContext::HandleMessage means we won't send acks on a closed non-application exchange, which is not really right...

We should add a test for this. Something where an exchange that was used to send the last message of a CASE or PASE handshake is still around when the ack for that handshake comes in, but the CASESession/PASESession has been destroyed...

@pan-apple

@msandstedt
Copy link
Contributor Author

it can be assumed the derived mDispatch is also destroyed as a result

I think the intent is that the dispatch objects have some sort of "CHIP stack" lifetime, pretty much. That said, the SessionEstablishmentExchangeDispatch instances we have don't do that...

We need to either move to a setup where dispatches are in fact singletons with "CHIP stack" lifetime or we need to figure out how to clean them up properly.

That said, the quoted changes above aren't quite right either. Nulling things out in ~ExchangeContext doesn't make sense, and nulling things out in ExchangeContext::HandleMessage means we won't send acks on a closed non-application exchange, which is not really right...

We should add a test for this. Something where an exchange that was used to send the last message of a CASE or PASE handshake is still around when the ack for that handshake comes in, but the CASESession/PASESession has been destroyed...

@pan-apple

@bzbarsky-apple ,

You seemed to understand the problem I had laid out pretty clearly. I was actually relaying this report from somebody else.

Do you know if this is still a problem? A lot of the code in this area has changed significantly for the better.

@bzbarsky-apple
Copy link
Contributor

I believe at this point the expectation is that the dispatch objects have infinite lifetime, so we can rely on them not dying. And the ones in the Matter SDK itself do that.

@turon
Copy link
Contributor

turon commented Feb 1, 2022

Definitely can't have any known leaks in v1, so removing tag v1_triage_split_4.
Close when verified.

@msandstedt
Copy link
Contributor Author

OK, we will review and see if we can justify closing this.

Thanks @turon and @bzbarsky-apple for the input.

@msandstedt msandstedt self-assigned this Feb 1, 2022
@msandstedt
Copy link
Contributor Author

Confirmed this is fixed with #12794.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
leak Memory leak bug V1.0
Projects
None yet
Development

No branches or pull requests

5 participants