Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StandaloneAck handling: Untangle code path for UMH and StandaloneAck #3

Closed

Conversation

kghost
Copy link

@kghost kghost commented Jun 10, 2022

Problem

The original implementation is too tricky

Change overview

  • Untangle code path for UMH and StandaloneAck
    • Revert UMH path, but only for UMH != nullptr case
    • Add a new branch for generating StandaloneAck
  • Add StandaloneExchangeDispatch for helping generating StandaloneAck
  • Add kFlagStandaloneExchange flag for exchange, to help go through StandaloneAck processing

Although the code is a bit tedious, but the logic is more clear, and more robust.

Testing

keithmwheeler and others added 19 commits June 9, 2022 12:57
* Run media cluster tests in tv-app in CI.

all-clusters-app does not have the media clusters configured in a
spec-compliant way (e.g. is missing mandatory things), and definitely
does not have them implemented in a spec-compliant way.

* Address review comments.
* Added Manual Scripts

* Added Auto generated files

* Restyled by whitespace

* Restyled by clang-format

* Added Auto generated file

* Restyled by clang-format

Co-authored-by: Restyled.io <[email protected]>
…ect-chip#19229)

* code review comments. Fix mCredentials, fix NVM flash saving

* restyle

* cleanup

* remove unused array

* remove TotalCredentials key

* Check mMaxUsers against DOOR_LOCK_MAX_USERS
…19385)

* [Zephyr] Support heap watermark for non-default malloc

When non-default malloc implementation, based on Zephyr's
sys_heap is used, support CurrentHeapHighWatermark attribute
and ResetWatermarks command. This is because sys_heap
exposes max_allocated_bytes statistic.

Signed-off-by: Damian Krolik <[email protected]>

* Restyled by clang-format

Co-authored-by: Restyled.io <[email protected]>
#### Problem

The libevent-based implementation of `System::Layer` is unused, untested
in CI, and probably stale. (It was originally written as a sanity check on
System::Layer APIs, chosen as a representative substrate independent of
any active platform.)

#### Change overview

Remove `SystemLayerImplLibevent`.

#### Testing

CI to confirm build edits.
* PostEvents now called from ScheduleWork

* Fixed whitespace from restyle
* doc: nrf_connect: rework OT RCP dongle guide

Edited the nRF Connect OpenThread RCP dongle guide.
Made minor style edits to OT border router page.

Signed-off-by: Grzegorz Ferenc <[email protected]>

* Restyled by prettier-markdown

Co-authored-by: Restyled.io <[email protected]>
* invert D2 and D3 LED states;
* return correct error codes from KVS.

Signed-off-by: Doru Gucea <[email protected]>
…roject-chip#19373)

* [project-chip#19316] Don't require PIN for locking/unlocking the door when not required

* Update auto-generated files
…-chip#19153)

- A newly created CMAKE script allows generating factory data JSON and
.hex files during the building.
- Created .hex file containing factory data partition can be merged with the firmware.
- The whole process is controlled by kconfigs created for nrfconnect platform.
- The CHIPDevicePlatformConfig definitions connected with factory data were updated
according to created kconfigs.
- Using proper kconfigs it is possible to generate the SPAKE2 passcode verifier and
Rotating Device ID Unique ID during building.
- Removed Certificate Declaration from factory data.
- Changed manufacturing date fromat to ISO 8601
- There is also the possibility to use the default values for development purposes.
Enabling HDLC & RPC causes a significant increase in logs verbosity on
ESP32 which may cause bottlenecks (even causing commissioning to fail).

Respect the maximum log level to suppress these extra logs.

It would be more efficient to check this at the logging call sites, but
using a platform specific level limit is not easily supported with the
matter logging stack (instead, there are separate build time toggles
such as CHIP_DETAIL_LOGGING).
…ect-chip#19321)

Update TestListStructOctet to use member names that are generic
…D. (project-chip#18815)

* Groups cluster: EMBER_ZCL_STATUS_INVALID_VALUE replaced with EMBER_ZCL_STATUS_CONSTRAINT_ERROR.

* Groups cluster: Rebased. Code re-generated.

* Groups cluster: code generated.
… at init time (project-chip#19261)

* Initial commit.

* Review feedback

* Review feedback

* Minor integer promotion issue in a test

* Linux build fix
* Changes to support the bridge test plan.

Added single character commands to the bridge-app to support the
bridge test plan. The following commands are supported:

2 (TC-BR-2) Light 2 is added
4 (TC-BR-4) Light 1 is removed
5 (TC-BR-5) Light 1 is added back
b (TC-BR-3 step 1b) Present devices are renamed
c (TC-BR-3 step 2c) State  of Light 1 and Light 2 are changed.
Z

* Progress for aggregator changes.

* Changed the example/bridge app have the bridge on a non-zero endpoint ID.

The example/bridge-app was updated to use a non-zero endpoint-ID.
The bridge/aggregator is now on endpoint 1. All endpoint IDs show on both
endpoint ID 0, and the dynamic deivces added to the bridge show on the
bridge endpoint ID 1 as well.

* Removed fixed label from bridge-app example.

The funcion of the fixed label is replaced by the actions cluster,
so the fixed label was removed.

* Update examples/bridge-app/linux/bridged-actions-stub.cpp

Co-authored-by: Boris Zbarsky <[email protected]>

* Added Restyled changes.

* Resolved build failures for bridge-app

  - removed fixed label support from esp32 bridge-app
  - added include for identify-server.h
  - added callbacks for bridged-actions

* Fixed restyled whitespace.

Co-authored-by: Boris Zbarsky <[email protected]>
src/messaging/ExchangeMgr.cpp Show resolved Hide resolved
src/messaging/ExchangeContext.cpp Outdated Show resolved Hide resolved
ExchangeMessageDispatch & ExchangeContext::GetMessageDispatch(bool isStandaloneExchange, ExchangeDelegate * delegate)
{
if (isStandaloneExchange)
return StandaloneExchangeDispatch::Instance();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was trying to avoid the codesize hit of yet another message dispatch.... But it would be OK to do this, modulo the other problems. That said, what is the benefit? For an exchange with no delegate (which is what a "standalone exchange" is, the dispatch is never even consulted except for the test-only IsReliableTransmissionAllowed test.

bzbarsky-apple and others added 2 commits June 10, 2022 02:34
…-chip#19398)

There were two ways we could fail to send an ack to an incoming reliable message:

1) If we found no matching handler, and hence created an ephemeral exchange to
handle the message, but the message was unencrypted.  In this case our ephemeral
exchange would return true for IsEncryptionRequired(), because it would default
to an ApplicationExchangeDispatch, and we would never call into
ExchangeContext::HandleMessage.

2) If ExchangeMessageDispatch::MessagePermitted returned false for the message.
In particular, for an ApplicationExchangeDispatch, this would happen for all the
handshake messages except StatusReport.

The fix for issue 1 is to ensure we always call into HandleMEssage if we manage
to allocate an exchange.  If there is an encryption mismatch, which only matters
when the exchange is non-ephemeral, we close the exchange first to prevent event
delivery to the delegate.

The fix for issue 2 is to move the MRP processing out of ExchangeMessageDispatch
and into ExchangeContext, and to move the MessagePermitted check so the only
thing it prevents is delivery of the message to the delegate, not any other
processing by the exchange.

Fixes project-chip#10515
@bzbarsky-apple
Copy link
Owner

This has now been spun out as project-chip#19443

bzbarsky-apple pushed a commit that referenced this pull request Jul 11, 2022
It's not safe to access line editing state from the IO thread while
inside readline() on the main thread.

Remove the code that attempts to redraw readline after printing logs.
This avoids segfaults during logging at the cost of those logs
overwriting the prompt (this is not trivial to fix as readline
is a blocking API).

==================
WARNING: ThreadSanitizer: data race (pid=63005)
  Write of size 1 at 0x55f81c7745ff by main thread:
    #0 InteractiveStartCommand::ParseCommand(char*) ../../examples/chip-tool/commands/interactive/InteractiveCommands.cpp:127 (chip-tool+0x874911)
    #1 InteractiveStartCommand::RunCommand() ../../examples/chip-tool/commands/interactive/InteractiveCommands.cpp:85 (chip-tool+0x874594)
    #2 CHIPCommand::StartWaiting(std::chrono::duration<unsigned int, std::ratio<1l, 1000l> >) ../../examples/chip-tool/commands/common/CHIPCommand.cpp:408 (chip-tool+0x83e478)
    #3 CHIPCommand::Run() ../../examples/chip-tool/commands/common/CHIPCommand.cpp:187 (chip-tool+0x83c839)
    #4 Commands::RunCommand(int, char**, bool) ../../examples/chip-tool/commands/common/Commands.cpp:147 (chip-tool+0x85d4f7)
    #5 Commands::Run(int, char**) ../../examples/chip-tool/commands/common/Commands.cpp:51 (chip-tool+0x85c288)
    #6 main <null> (chip-tool+0x569c0a)

  Previous read of size 1 at 0x55f81c7745ff by thread T5 (mutexes: write M185):
    #0 LoggingCallback ../../examples/chip-tool/commands/interactive/InteractiveCommands.cpp:46 (chip-tool+0x874479)
    #1 chip::Logging::LogV(unsigned char, unsigned char, char const*, __va_list_tag*) ../../src/lib/support/logging/CHIPLogging.cpp:221 (chip-tool+0x8ee4dc)
    #2 chip::Logging::Log(unsigned char, unsigned char, char const*, ...) ../../src/lib/support/logging/CHIPLogging.cpp:172 (chip-tool+0x8ee30a)
    #3 chip::app::ReadClient::RefreshLivenessCheckTimer() <null> (chip-tool+0x8b1746)
    #4 chip::app::ReadClient::ProcessSubscribeResponse(chip::System::PacketBufferHandle&&) ../../src/app/ReadClient.cpp:845 (chip-tool+0x8b20ec)
    #5 chip::app::ReadClient::OnMessageReceived(chip::Messaging::ExchangeContext*, chip::PayloadHeader const&, chip::System::PacketBufferHandle&&) ../../src/app/ReadClient.cpp:409 (chip-tool+0x8ae2a4)
    #6 chip::Messaging::ExchangeContext::HandleMessage(unsigned int, chip::PayloadHeader const&, chip::BitFlags<chip::Messaging::MessageFlagValues, unsigned int>, chip::System::PacketBufferHandle&&) <null> (chip-tool+0xa0517a)
    #7 operator()<chip::Messaging::ExchangeContext> ../../src/messaging/ExchangeMgr.cpp:219 (chip-tool+0xa08c73)
    #8 Call ../../src/lib/support/Pool.h:126 (chip-tool+0xa0912d)
    #9 chip::internal::HeapObjectList::ForEachNode(void*, chip::Loop (*)(void*, void*)) ../../src/lib/support/Pool.cpp:127 (chip-tool+0x8ee05a)
    #10 ForEachActiveObject<chip::Messaging::ExchangeManager::OnMessageReceived(const chip::PacketHeader&, const chip::PayloadHeader&, const chip::SessionHandle&, chip::SessionMessageDelegate::DuplicateMessage, chip::System::PacketBufferHandle&&)::<lambda(auto:2*)> > ../../src/lib/support/Pool.h:396 (chip-tool+0xa08d10)
    #11 chip::Messaging::ExchangeManager::OnMessageReceived(chip::PacketHeader const&, chip::PayloadHeader const&, chip::SessionHandle const&, chip::SessionMessageDelegate::DuplicateMessage, chip::System::PacketBufferHandle&&) ../../src/messaging/ExchangeMgr.cpp:212 (chip-tool+0xa07e91)
    #12 chip::SessionManager::SecureUnicastMessageDispatch(chip::PacketHeader const&, chip::Transport::PeerAddress const&, chip::System::PacketBufferHandle&&) ../../src/transport/SessionManager.cpp:616 (chip-tool+0xa1548b)
    #13 chip::SessionManager::OnMessageReceived(chip::Transport::PeerAddress const&, chip::System::PacketBufferHandle&&) ../../src/transport/SessionManager.cpp:443 (chip-tool+0xa14426)
    #14 chip::TransportMgrBase::HandleMessageReceived(chip::Transport::PeerAddress const&, chip::System::PacketBufferHandle&&) ../../src/transport/TransportMgrBase.cpp:76 (chip-tool+0xa17dfa)
    #15 chip::Transport::Base::HandleMessageReceived(chip::Transport::PeerAddress const&, chip::System::PacketBufferHandle&&) ../../src/transport/raw/Base.h:102 (chip-tool+0xb19728)
    #16 chip::Transport::UDP::OnUdpReceive(chip::Inet::UDPEndPoint*, chip::System::PacketBufferHandle&&, chip::Inet::IPPacketInfo const*) ../../src/transport/raw/UDP.cpp:122 (chip-tool+0xb1a48b)
    #17 chip::Inet::UDPEndPointImplSockets::HandlePendingIO(chip::BitFlags<chip::System::SocketEventFlags, unsigned char>) ../../src/inet/UDPEndPointImplSockets.cpp:688 (chip-tool+0xb00aa0)
    #18 chip::Inet::UDPEndPointImplSockets::HandlePendingIO(chip::BitFlags<chip::System::SocketEventFlags, unsigned char>, long) ../../src/inet/UDPEndPointImplSockets.cpp:569 (chip-tool+0xafff89)
    #19 chip::System::LayerImplSelect::HandleEvents() ../../src/system/SystemLayerImplSelect.cpp:406 (chip-tool+0xb07563)
    #20 chip::DeviceLayer::Internal::GenericPlatformManagerImpl_POSIX<chip::DeviceLayer::PlatformManagerImpl>::_RunEventLoop() ../../src/include/platform/internal/GenericPlatformManagerImpl_POSIX.ipp:181 (chip-tool+0x98a227)
    #21 chip::DeviceLayer::PlatformManager::RunEventLoop() ../../src/include/platform/PlatformManager.h:362 (chip-tool+0x988f75)
    #22 chip::DeviceLayer::Internal::GenericPlatformManagerImpl_POSIX<chip::DeviceLayer::PlatformManagerImpl>::EventLoopTaskMain(void*) ../../src/include/platform/internal/GenericPlatformManagerImpl_POSIX.ipp:205 (chip-tool+0x98a87c)

  Location is global '(anonymous namespace)::gIsCommandRunning' of size 1 at 0x55f81c7745ff (chip-tool+0x000000c485ff)

  Mutex M185 (0x55f81c776180) created at:
    #0 pthread_mutex_lock ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4240 (libtsan.so.0+0x4f30a)
    #1 chip::DeviceLayer::Internal::GenericPlatformManagerImpl_POSIX<chip::DeviceLayer::PlatformManagerImpl>::_LockChipStack() ../../src/include/platform/internal/GenericPlatformManagerImpl_POSIX.ipp:78 (chip-tool+0x989e90)
    #2 chip::DeviceLayer::PlatformManager::LockChipStack() ../../src/include/platform/PlatformManager.h:410 (chip-tool+0x988fa5)
    #3 chip::DeviceLayer::Internal::GenericPlatformManagerImpl_POSIX<chip::DeviceLayer::PlatformManagerImpl>::_RunEventLoop() ../../src/include/platform/internal/GenericPlatformManagerImpl_POSIX.ipp:170 (chip-tool+0x98a147)
    #4 chip::DeviceLayer::PlatformManager::RunEventLoop() ../../src/include/platform/PlatformManager.h:362 (chip-tool+0x988f75)
    #5 chip::DeviceLayer::Internal::GenericPlatformManagerImpl_POSIX<chip::DeviceLayer::PlatformManagerImpl>::EventLoopTaskMain(void*) ../../src/include/platform/internal/GenericPlatformManagerImpl_POSIX.ipp:205 (chip-tool+0x98a87c)

  Thread T5 (tid=63013, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:969 (libtsan.so.0+0x5ad75)
    #1 chip::DeviceLayer::Internal::GenericPlatformManagerImpl_POSIX<chip::DeviceLayer::PlatformManagerImpl>::_StartEventLoopTask() ../../src/include/platform/internal/GenericPlatformManagerImpl_POSIX.ipp:231 (chip-tool+0x98a40a)
    #2 chip::DeviceLayer::PlatformManager::StartEventLoopTask() ../../src/include/platform/PlatformManager.h:375 (chip-tool+0xaacca2)
    #3 chip::Controller::DeviceControllerFactory::ServiceEvents() ../../src/controller/CHIPDeviceControllerFactory.cpp:331 (chip-tool+0xab0417)
    #4 CHIPCommand::StartWaiting(std::chrono::duration<unsigned int, std::ratio<1l, 1000l> >) ../../examples/chip-tool/commands/common/CHIPCommand.cpp:403 (chip-tool+0x83e353)
    #5 CHIPCommand::Run() ../../examples/chip-tool/commands/common/CHIPCommand.cpp:187 (chip-tool+0x83c839)
    #6 Commands::RunCommand(int, char**, bool) ../../examples/chip-tool/commands/common/Commands.cpp:147 (chip-tool+0x85d4f7)
    #7 Commands::Run(int, char**) ../../examples/chip-tool/commands/common/Commands.cpp:51 (chip-tool+0x85c288)
    #8 main <null> (chip-tool+0x569c0a)

SUMMARY: ThreadSanitizer: data race ../../examples/chip-tool/commands/interactive/InteractiveCommands.cpp:127 in InteractiveStartCommand::ParseCommand(char*)
==================
bzbarsky-apple pushed a commit that referenced this pull request Jan 24, 2023
* [matter_yamltests] Make definitions.py more strict about casing

* Fix various typos in src/app/tests/suites/ yaml tests (#3 attempt)
bzbarsky-apple added a commit that referenced this pull request May 25, 2023
The failure looks like this:

  WARNING: ThreadSanitizer: race on NSMutableArray (pid=11619)
    Read-only access of NSMutableArray at 0x7b0c0005f5b0 by thread T3:
      #0 -[__NSArrayM countByEnumeratingWithState:objects:count:] <null>:2 (CoreFoundation:x86_64+0x4a338)
      #1 -[MTRDeviceControllerFactory(InternalMethods) operationalInstanceAdded:] MTRDeviceControllerFactory.mm:855 (Matter:x86_64+0x1fd2a)
      #2 MTROperationalBrowser::OnBrowse(_DNSServiceRef_t*, unsigned int, unsigned int, int, char const*, char const*, char const*, void*) MTROperationalBrowser.mm:100 (Matter:x86_64+0x20ee63c)
      #3 handle_browse_response <null>:2 (libsystem_dnssd.dylib:x86_64+0x3733)
      #4 _dispatch_client_callout <null>:2 (libdispatch.dylib:x86_64+0x3316)

    Previous modifying access of NSMutableArray at 0x7b0c0005f5b0 by main thread:
      #0 -[__NSArrayM addObject:] <null>:2 (CoreFoundation:x86_64+0x2457a)
      #1 -[MTRDeviceControllerFactory createController] MTRDeviceControllerFactory.mm:719 (Matter:x86_64+0x1cee3)
      #2 -[MTRDeviceControllerFactory createControllerOnExistingFabric:error:] MTRDeviceControllerFactory.mm:534 (Matter:x86_64+0x19792)

The basic problem is that we are in the middle of adding an object to
_controllers on the API consumer thread when on the Matter thread we get our
browse notification.

The changes here don't aim to lock around all access to _controllers, but just
to make sure that our mutations of it can't race with the access on the Matter
thread.  More coarse locking would need to be done very carefully, given the
amount of dispath_sync to the Matter thread we have going on.
bzbarsky-apple added a commit that referenced this pull request May 25, 2023
The failure looks like this:

  WARNING: ThreadSanitizer: race on NSMutableArray (pid=11619)
    Read-only access of NSMutableArray at 0x7b0c0005f5b0 by thread T3:
      #0 -[__NSArrayM countByEnumeratingWithState:objects:count:] <null>:2 (CoreFoundation:x86_64+0x4a338)
      #1 -[MTRDeviceControllerFactory(InternalMethods) operationalInstanceAdded:] MTRDeviceControllerFactory.mm:855 (Matter:x86_64+0x1fd2a)
      #2 MTROperationalBrowser::OnBrowse(_DNSServiceRef_t*, unsigned int, unsigned int, int, char const*, char const*, char const*, void*) MTROperationalBrowser.mm:100 (Matter:x86_64+0x20ee63c)
      #3 handle_browse_response <null>:2 (libsystem_dnssd.dylib:x86_64+0x3733)
      #4 _dispatch_client_callout <null>:2 (libdispatch.dylib:x86_64+0x3316)

    Previous modifying access of NSMutableArray at 0x7b0c0005f5b0 by main thread:
      #0 -[__NSArrayM addObject:] <null>:2 (CoreFoundation:x86_64+0x2457a)
      #1 -[MTRDeviceControllerFactory createController] MTRDeviceControllerFactory.mm:719 (Matter:x86_64+0x1cee3)
      #2 -[MTRDeviceControllerFactory createControllerOnExistingFabric:error:] MTRDeviceControllerFactory.mm:534 (Matter:x86_64+0x19792)

The basic problem is that we are in the middle of adding an object to
_controllers on the API consumer thread when on the Matter thread we get our
browse notification.

The changes here don't aim to lock around all access to _controllers, but just
to make sure that our mutations of it can't race with the access on the Matter
thread.  More coarse locking would need to be done very carefully, given the
amount of dispath_sync to the Matter thread we have going on.
bzbarsky-apple added a commit that referenced this pull request May 25, 2023
The failure looks like this:

  WARNING: ThreadSanitizer: race on NSMutableArray (pid=11619)
    Read-only access of NSMutableArray at 0x7b0c0005f5b0 by thread T3:
      #0 -[__NSArrayM countByEnumeratingWithState:objects:count:] <null>:2 (CoreFoundation:x86_64+0x4a338)
      #1 -[MTRDeviceControllerFactory(InternalMethods) operationalInstanceAdded:] MTRDeviceControllerFactory.mm:855 (Matter:x86_64+0x1fd2a)
      #2 MTROperationalBrowser::OnBrowse(_DNSServiceRef_t*, unsigned int, unsigned int, int, char const*, char const*, char const*, void*) MTROperationalBrowser.mm:100 (Matter:x86_64+0x20ee63c)
      #3 handle_browse_response <null>:2 (libsystem_dnssd.dylib:x86_64+0x3733)
      #4 _dispatch_client_callout <null>:2 (libdispatch.dylib:x86_64+0x3316)

    Previous modifying access of NSMutableArray at 0x7b0c0005f5b0 by main thread:
      #0 -[__NSArrayM addObject:] <null>:2 (CoreFoundation:x86_64+0x2457a)
      #1 -[MTRDeviceControllerFactory createController] MTRDeviceControllerFactory.mm:719 (Matter:x86_64+0x1cee3)
      #2 -[MTRDeviceControllerFactory createControllerOnExistingFabric:error:] MTRDeviceControllerFactory.mm:534 (Matter:x86_64+0x19792)

The basic problem is that we are in the middle of adding an object to
_controllers on the API consumer thread when on the Matter thread we get our
browse notification.

The changes here don't aim to lock around all access to _controllers, but just
to make sure that our mutations of it can't race with the access on the Matter
thread.  More coarse locking would need to be done very carefully, given the
amount of dispath_sync to the Matter thread we have going on.
bzbarsky-apple added a commit that referenced this pull request May 26, 2023
* Fix ThreadSanitizer failure in controller factory.

The failure looks like this:

  WARNING: ThreadSanitizer: race on NSMutableArray (pid=11619)
    Read-only access of NSMutableArray at 0x7b0c0005f5b0 by thread T3:
      #0 -[__NSArrayM countByEnumeratingWithState:objects:count:] <null>:2 (CoreFoundation:x86_64+0x4a338)
      #1 -[MTRDeviceControllerFactory(InternalMethods) operationalInstanceAdded:] MTRDeviceControllerFactory.mm:855 (Matter:x86_64+0x1fd2a)
      #2 MTROperationalBrowser::OnBrowse(_DNSServiceRef_t*, unsigned int, unsigned int, int, char const*, char const*, char const*, void*) MTROperationalBrowser.mm:100 (Matter:x86_64+0x20ee63c)
      #3 handle_browse_response <null>:2 (libsystem_dnssd.dylib:x86_64+0x3733)
      #4 _dispatch_client_callout <null>:2 (libdispatch.dylib:x86_64+0x3316)

    Previous modifying access of NSMutableArray at 0x7b0c0005f5b0 by main thread:
      #0 -[__NSArrayM addObject:] <null>:2 (CoreFoundation:x86_64+0x2457a)
      #1 -[MTRDeviceControllerFactory createController] MTRDeviceControllerFactory.mm:719 (Matter:x86_64+0x1cee3)
      #2 -[MTRDeviceControllerFactory createControllerOnExistingFabric:error:] MTRDeviceControllerFactory.mm:534 (Matter:x86_64+0x19792)

The basic problem is that we are in the middle of adding an object to
_controllers on the API consumer thread when on the Matter thread we get our
browse notification.

The changes here don't aim to lock around all access to _controllers, but just
to make sure that our mutations of it can't race with the access on the Matter
thread.  More coarse locking would need to be done very carefully, given the
amount of dispath_sync to the Matter thread we have going on.

* Address review comments.
bzbarsky-apple added a commit that referenced this pull request Oct 9, 2023
…ist".

The typical failure there looks like this:

==29620==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 88 byte(s) in 1 object(s) allocated from:
    #0 0x106396e12 in calloc+0xa2 (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x51e12)
    #1 0x7ff800dc9789 in map_images_nolock+0x24b (libobjc.A.dylib:x86_64h+0x1789)
    #2 0x7ff800dc94db in map_images+0x42 (libobjc.A.dylib:x86_64h+0x14db)
    #3 0x113d721fa in invocation function for block in dyld4::RuntimeState::setObjCNotifiers(void (*)(unsigned int, char const* const*, mach_header const* const*), void (*)(char const*, mach_header const*), void (*)(char const*, mach_header const*))+0x112 (dyld:x86_64+0xf1fa)
    #4 0x113d6d6c8 in dyld4::RuntimeState::withLoadersReadLock(void () block_pointer)+0x28 (dyld:x86_64+0xa6c8)
    #5 0x113d720e1 in dyld4::RuntimeState::setObjCNotifiers(void (*)(unsigned int, char const* const*, mach_header const* const*), void (*)(char const*, mach_header const*), void (*)(char const*, mach_header const*))+0x51 (dyld:x86_64+0xf0e1)
    #6 0x113d85d44 in dyld4::APIs::_dyld_objc_notify_register(void (*)(unsigned int, char const* const*, mach_header const* const*), void (*)(char const*, mach_header const*), void (*)(char const*, mach_header const*))+0x4e (dyld:x86_64+0x22d44)
    #7 0x7ff800dc9343 in _objc_init+0x4fe (libobjc.A.dylib:x86_64h+0x1343)
    #8 0x7ff800d83992 in _os_object_init+0xc (libdispatch.dylib:x86_64+0x2992)
    #9 0x7ff800d911b7 in libdispatch_init+0x136 (libdispatch.dylib:x86_64+0x101b7)
    #10 0x7ff80bd34894 in libSystem_initializer+0xed (libSystem.B.dylib:x86_64+0x1894)
    #11 0x113d77e4e in invocation function for block in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const+0xb5 (dyld:x86_64+0x14e4e)
    #12 0x113d9eaac in invocation function for block in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const+0xf1 (dyld:x86_64+0x3baac)
    #13 0x113d95e25 in invocation function for block in dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const+0x22c (dyld:x86_64+0x32e25)
    #14 0x113d64db2 in dyld3::MachOFile::forEachLoadCommand(Diagnostics&, void (load_command const*, bool&) block_pointer) const+0x80 (dyld:x86_64+0x1db2)
    #15 0x113d95bb6 in dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const+0xb2 (dyld:x86_64+0x32bb6)
    #16 0x113d9e603 in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const+0x1d1 (dyld:x86_64+0x3b603)
    #17 0x113d77d81 in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const+0x8f (dyld:x86_64+0x14d81)
    #18 0x113d7e659 in dyld4::PrebuiltLoader::runInitializers(dyld4::RuntimeState&) const+0x1d (dyld:x86_64+0x1b659)
    #19 0x113d8b76d in dyld4::APIs::runAllInitializersForMain()+0x25 (dyld:x86_64+0x2876d)
    #20 0x113d6938c in dyld4::prepare(dyld4::APIs&, dyld3::MachOAnalyzer const*)+0xd72 (dyld:x86_64+0x638c)
    #21 0x113d684e3 in start+0x183 (dyld:x86_64+0x54e3)
bzbarsky-apple added a commit that referenced this pull request Oct 9, 2023
…ist".

The typical failure there looks like this:

==29620==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 88 byte(s) in 1 object(s) allocated from:
    #0 0x106396e12 in calloc+0xa2 (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x51e12)
    #1 0x7ff800dc9789 in map_images_nolock+0x24b (libobjc.A.dylib:x86_64h+0x1789)
    #2 0x7ff800dc94db in map_images+0x42 (libobjc.A.dylib:x86_64h+0x14db)
    #3 0x113d721fa in invocation function for block in dyld4::RuntimeState::setObjCNotifiers(void (*)(unsigned int, char const* const*, mach_header const* const*), void (*)(char const*, mach_header const*), void (*)(char const*, mach_header const*))+0x112 (dyld:x86_64+0xf1fa)
    #4 0x113d6d6c8 in dyld4::RuntimeState::withLoadersReadLock(void () block_pointer)+0x28 (dyld:x86_64+0xa6c8)
    #5 0x113d720e1 in dyld4::RuntimeState::setObjCNotifiers(void (*)(unsigned int, char const* const*, mach_header const* const*), void (*)(char const*, mach_header const*), void (*)(char const*, mach_header const*))+0x51 (dyld:x86_64+0xf0e1)
    #6 0x113d85d44 in dyld4::APIs::_dyld_objc_notify_register(void (*)(unsigned int, char const* const*, mach_header const* const*), void (*)(char const*, mach_header const*), void (*)(char const*, mach_header const*))+0x4e (dyld:x86_64+0x22d44)
    #7 0x7ff800dc9343 in _objc_init+0x4fe (libobjc.A.dylib:x86_64h+0x1343)
    #8 0x7ff800d83992 in _os_object_init+0xc (libdispatch.dylib:x86_64+0x2992)
    #9 0x7ff800d911b7 in libdispatch_init+0x136 (libdispatch.dylib:x86_64+0x101b7)
    #10 0x7ff80bd34894 in libSystem_initializer+0xed (libSystem.B.dylib:x86_64+0x1894)
    #11 0x113d77e4e in invocation function for block in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const+0xb5 (dyld:x86_64+0x14e4e)
    #12 0x113d9eaac in invocation function for block in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const+0xf1 (dyld:x86_64+0x3baac)
    #13 0x113d95e25 in invocation function for block in dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const+0x22c (dyld:x86_64+0x32e25)
    #14 0x113d64db2 in dyld3::MachOFile::forEachLoadCommand(Diagnostics&, void (load_command const*, bool&) block_pointer) const+0x80 (dyld:x86_64+0x1db2)
    #15 0x113d95bb6 in dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const+0xb2 (dyld:x86_64+0x32bb6)
    #16 0x113d9e603 in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const+0x1d1 (dyld:x86_64+0x3b603)
    #17 0x113d77d81 in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const+0x8f (dyld:x86_64+0x14d81)
    #18 0x113d7e659 in dyld4::PrebuiltLoader::runInitializers(dyld4::RuntimeState&) const+0x1d (dyld:x86_64+0x1b659)
    #19 0x113d8b76d in dyld4::APIs::runAllInitializersForMain()+0x25 (dyld:x86_64+0x2876d)
    #20 0x113d6938c in dyld4::prepare(dyld4::APIs&, dyld3::MachOAnalyzer const*)+0xd72 (dyld:x86_64+0x638c)
    #21 0x113d684e3 in start+0x183 (dyld:x86_64+0x54e3)
bzbarsky-apple added a commit that referenced this pull request Oct 10, 2023
…ist". (project-chip#29666)

The typical failure there looks like this:

==29620==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 88 byte(s) in 1 object(s) allocated from:
    #0 0x106396e12 in calloc+0xa2 (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x51e12)
    #1 0x7ff800dc9789 in map_images_nolock+0x24b (libobjc.A.dylib:x86_64h+0x1789)
    #2 0x7ff800dc94db in map_images+0x42 (libobjc.A.dylib:x86_64h+0x14db)
    #3 0x113d721fa in invocation function for block in dyld4::RuntimeState::setObjCNotifiers(void (*)(unsigned int, char const* const*, mach_header const* const*), void (*)(char const*, mach_header const*), void (*)(char const*, mach_header const*))+0x112 (dyld:x86_64+0xf1fa)
    #4 0x113d6d6c8 in dyld4::RuntimeState::withLoadersReadLock(void () block_pointer)+0x28 (dyld:x86_64+0xa6c8)
    #5 0x113d720e1 in dyld4::RuntimeState::setObjCNotifiers(void (*)(unsigned int, char const* const*, mach_header const* const*), void (*)(char const*, mach_header const*), void (*)(char const*, mach_header const*))+0x51 (dyld:x86_64+0xf0e1)
    #6 0x113d85d44 in dyld4::APIs::_dyld_objc_notify_register(void (*)(unsigned int, char const* const*, mach_header const* const*), void (*)(char const*, mach_header const*), void (*)(char const*, mach_header const*))+0x4e (dyld:x86_64+0x22d44)
    #7 0x7ff800dc9343 in _objc_init+0x4fe (libobjc.A.dylib:x86_64h+0x1343)
    #8 0x7ff800d83992 in _os_object_init+0xc (libdispatch.dylib:x86_64+0x2992)
    #9 0x7ff800d911b7 in libdispatch_init+0x136 (libdispatch.dylib:x86_64+0x101b7)
    #10 0x7ff80bd34894 in libSystem_initializer+0xed (libSystem.B.dylib:x86_64+0x1894)
    #11 0x113d77e4e in invocation function for block in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const+0xb5 (dyld:x86_64+0x14e4e)
    #12 0x113d9eaac in invocation function for block in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const+0xf1 (dyld:x86_64+0x3baac)
    #13 0x113d95e25 in invocation function for block in dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const+0x22c (dyld:x86_64+0x32e25)
    #14 0x113d64db2 in dyld3::MachOFile::forEachLoadCommand(Diagnostics&, void (load_command const*, bool&) block_pointer) const+0x80 (dyld:x86_64+0x1db2)
    #15 0x113d95bb6 in dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const+0xb2 (dyld:x86_64+0x32bb6)
    #16 0x113d9e603 in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const+0x1d1 (dyld:x86_64+0x3b603)
    #17 0x113d77d81 in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const+0x8f (dyld:x86_64+0x14d81)
    #18 0x113d7e659 in dyld4::PrebuiltLoader::runInitializers(dyld4::RuntimeState&) const+0x1d (dyld:x86_64+0x1b659)
    #19 0x113d8b76d in dyld4::APIs::runAllInitializersForMain()+0x25 (dyld:x86_64+0x2876d)
    #20 0x113d6938c in dyld4::prepare(dyld4::APIs&, dyld3::MachOAnalyzer const*)+0xd72 (dyld:x86_64+0x638c)
    #21 0x113d684e3 in start+0x183 (dyld:x86_64+0x54e3)
bzbarsky-apple pushed a commit that referenced this pull request Feb 29, 2024
* Rerouted tracing macros to Darwin signposts

* Initial framework for logging scalar data event

* Handled the new metrics event changes in collector

* Modified VerifyOrExit macro to accept an optional metric key as third argument

* Removed direct use of chrono in metrics_event.h
Switched MTRMetrics to vend dictionary for metric keys

* Modified SuccessOrExit to optionally accept metric key

* Moved metric keys to separate header
Reworked metric_event names for more clarity

* Switched MATTER_TRACE_METRIC usage to MATTER_LOG_METRIC

* Restyle fixes

* Fixed unit tests

* Fixed build failure due to MetricEvent hidden inside tracing enabled

* Fixing one source of build error

* Fixing darwin build failure

* Code Review: Rename LogMetric to LogMetricEvent

* Code Review Suggestions:

1. Metric Macros take full string constants and no longer use
   preprocessor to prefix. Allows free flowing strings
2. Reworked MetricEvent class and documented
3. Handled LogEventMetric for Darwin, ESP32, Perfetto, JSON to account
   for all types
4. Removed timePoint from MetricEvent class. Timestamps and duration
   calculation is now responsibility for the handlers of the event
5. Reverted BUILD.gn in system to not break out SystemClock.h

* Code Review Feedback #2:

1. Added SuccessOrExitWithMetric and VerifyOrExitWithMetric
2. Cleaned up support .gn to remove dependedency on metrics

* Code Review Feedback #3:

1. Added ScopedMetricEvent to use RAII to track begin and end within a
   scope

* Code Review #4:

Reverted an accidental removal

* Added MTRMetricData to description as per review comment

* Restyler fixes

* Sample code of how Begin and End log metrics can be used

* Fixed compilation error when tracing is disabled

* Fixes for build failures when tracing is disabled

* Picked up code review suggestion accidently dropped

* Code Review Feedback:

1. Begin metric does not take value
2. Allow undefined value for metric
3. Misc other feedback

* Handle undefined value and error value

* Revert a comment change

* Review Feedback: Changed ScopedMetricEvent to capture error by reference

* Fixed another build failure

* Reverting usage of LOG_METRICS

* Review feedback: Fix incorrect documentation

* Code Review Feedback: Remove access to Value in MetricEvent to avoid incorrect access

* Restyler fixes

* Unregistering backend in Darwin shutdown

* Resytler fixes...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.