Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python based device scanning interfacing with native C++ code on linux. #4960

Merged
merged 17 commits into from
Feb 25, 2021

Conversation

andy31415
Copy link
Contributor

@andy31415 andy31415 commented Feb 22, 2021

Problem

Native C++ device scanning is not implemented nor accessible from python, which does not allow us to verify/script the parsing of discovered devices.

Current discovery is done natively by Android and iOS platforms and a pure python version. However connectivity itself is done by C++ native code which would only be exercised on connect and never on discovery.

Summary of Changes

This is a merge of several branches developed while the bluez code split was under review. It contains:

  • Moved BlueZ main loop to a separate class (rather than inline method) so that it is accessible by other implementations
  • Implemented a 'CHIPDeviceScanner' that can be used both externally and is used internally by the 'connect via BLE' implementation in linux BLEManager
  • Added native handle classes for python for managing native interface calls and doing 'init' for platform-specific stack bits (e.g. memory and device layer initialization required by BLE)
  • Added bindings to the device scanner from python
  • Added more type safety to BlueZ code (less void *, replaced with actual types and avoided casting)

Example usage:

In [1]: import chip.ble

In [2]: chip.ble.DiscoverAsync(2000, lambda *x: print('Discovered: %r' % (x,)), lambda: print('Done'))
CHIP:DL: Platform main loop started.
CHIP:DL: TRACE: Bluez mainloop starting Thread
CHIP:BLE: BLE scanning through known devices.
CHIP:BLE: BLE initiating scan.

In [3]: CHIP:BLE: Device 38:CB:9F:C4:D5:7A does not look like a CHIP device.
CHIP:BLE: Device 79:1C:AB:8D:B4:1E does not look like a CHIP device.
CHIP:BLE: Device 6E:91:AB:C1:8B:75 does not look like a CHIP device.
CHIP:BLE: Device 53:DA:56:14:DF:2E does not look like a CHIP device.
Discovered: (b'98:F4:AB:6B:1A:FE', 3840, 65279, 9050)
Done
In [3]:

Moved Bluez mainloop into a separate class
Moved blezl iterators into separate files
Added ble generic initialization and handle management
Switched bluez to use the newly developed scanner main loop.
@todo

This comment has been minimized.

@todo
Copy link

todo bot commented Feb 22, 2021

Factor common code in the two function.

// TODO: Factor common code in the two function.
BluezConnection * conn;
VerifyOrExit(bluez_device1_get_connected(device), ChipLogError(DeviceLayer, "FAIL: device is not connected"));
conn = static_cast<BluezConnection *>(
g_hash_table_lookup(apEndpoint->mpConnMap, g_dbus_proxy_get_object_path(G_DBUS_PROXY(device))));
VerifyOrExit(conn == nullptr,
ChipLogError(DeviceLayer, "FAIL: connection already tracked: conn: %x new device: %s", conn,
g_dbus_proxy_get_object_path(G_DBUS_PROXY(device))));
conn = g_new0(BluezConnection, 1);


This comment was generated by todo based on a TODO comment in 67b1525 in #4960. cc @andy31415.

@project-chip project-chip deleted a comment from todo bot Feb 22, 2021
g_free(mBluezMainLoop);
mBluezMainLoop = nullptr;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cleanup logic looks suspicious. err will always equal CHIP_NO_ERROR, so this block can never execute. But if it was to execute, shouldn't the pthread also be cleaned up?

Copy link
Contributor Author

@andy31415 andy31415 Feb 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - I did not notice no error was ever set; thank you for catching.


private:
ChipDeviceScanner::Ptr mScanner;
PyObject * mContext;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyObject * const mContext ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Made me not be able to set it to nullptr in destructor (I was trying too set everything to null to catch any odd reference errors while debugging what turned out to be python void* cast issues).


~ScannerDelegateImpl()
{
mContext = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of assigning in the dtor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debugging invalid memory references. Turned out to be python c_void_p behavind oddly (it casts to 32-bit if it can, but when passing back it stays 32-bit int and does a sign extension or invalid high order bits when sent as a function argument ... very frustrating)

src/controller/python/chip/ble/types.py Outdated Show resolved Hide resolved
src/controller/python/chip/ble/scan_devices.py Outdated Show resolved Hide resolved
src/controller/python/chip/native/StackInit.cpp Outdated Show resolved Hide resolved
{
if (manager == nullptr)
{
ChipLogError(DeviceLayer, "Manager is NULL in %s", __func__);
Copy link
Contributor

@mspang mspang Feb 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be an assert (manager != nullptr) ?

In general, I don't think "internal pointer is null" is ever an appropriate log message (except perhaps when debugging locally).

We should instead log something that describes the cause of the pointer being null, or not get into here in the first place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just moved this code (copied and pasted as is). Agree it would require some more passes, but I am afraid of going to deep if trying to update everything. The PR is already huge.

namespace Internal {
namespace {

struct GObjectUnref
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use this in other cases where we're manually calling g_object_unref?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually I assume yes. I think it should be pulled out more globally as we change this code more.

I wonder though how much we will still touch it: I wanted to have scanning and this PR has it. The only additonal thing we may want is BLE connection by address instead of discriminator.

src/platform/Linux/bluez/ChipDeviceScanner.h Outdated Show resolved Hide resolved
src/platform/Linux/bluez/MainLoop.cpp Outdated Show resolved Hide resolved
BluezObjectsCleanup(endpoint);
if (!MainLoop::Instance().SetCleanupFunction(BluezObjectsCleanup, endpoint))
{
ChipLogError(DeviceLayer, "Failed to schedule cleanup function");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this an assert?

Printing a message when things go badly off the rails isn't going to catch issues quickly and reliably (e.g., cause our tests to fail).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may require additional passes: as it stands, we do not seem to ever stop the main loop, so I kept the cleanup function because cleanup existed in previous code, however this really does not get called in the current code structure.

… library handle pointers to not use c_void_p directly. This looks stable on my linux machine (ran >100 scans)
}

private:
std::unique_ptr<ChipDeviceScanner> mScanner;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: const

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cannot do this because scanner is moved post object creation and I have a SetScanner method.

This is a side-effect of creation order: delegate needs to access the scanner, but scanner needs a delegate.

src/controller/python/chip/ble/LinuxImpl.cpp Outdated Show resolved Hide resolved
src/controller/python/chip/ble/LinuxImpl.cpp Outdated Show resolved Hide resolved
src/controller/python/chip/ble/library_handle.py Outdated Show resolved Hide resolved
src/platform/Linux/bluez/MainLoop.cpp Outdated Show resolved Hide resolved
@andy31415 andy31415 merged commit 5ca7308 into project-chip:master Feb 25, 2021
@andy31415 andy31415 deleted the python_ble_discovery_large branch October 28, 2021 14:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants