-
Notifications
You must be signed in to change notification settings - Fork 183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Can't dial error locks up device #66
Comments
Can you do some experiments and see how many concurrent connections the device can support. And what happens when it exceeds the maximum? Does it handle it gracefully by returning HCI error, or the HCI channel just stuck or act abnormally? |
The BLE device on the gateway is based on an NRF52832 running a S132 soft device. Supposedly, according to Nordic, it supports up to 20 concurrent connections. There's just lot's of layers that could have a potential bug. (UART-BLE firmware, S132 soft device, this library) In my case, the HCI channel usually sends an error from the previous connect attempt, and then locks up on the next attempt. This morning though I got it to lock up first thing without any previous errors with only 3 devices connecting. Trying to reproduce the error has been tricky. I have 6 devices up and running and haven't had it quit yet. Thanks for your insights Roy |
Sorry, I wasn't clear in the previous comment. I meant the device (a usb dongle) that runs the Go BLE library. For example, I had a USB dongle which only support one connection at a time. If I try to make a second concurrent connection, it returns HCI error, and the user(app) can retry later or disconnect the other connection first. I had another USB dongle which can support 2 concurrent connections. But if I try to make a 3rd concurrent connection, the device would just hang. The user(app) has no way to recover from it until reset. In this case, we need to add some logic in the app to make sure the concurrent connections never exceeds 2. An potentially easier way to test is to have a couple of usb dongles plugged to your laptop. hci0 makes connections to hci1, hci2, ... and find out how many connections hci0 can support. |
Got it. I am trying that a bit right now. One thing I'm noticing is that this lock-up happens after a scan during the dial phase. It's as if the context is pulling the rug out from underneath the dial phase and its leaving it in a weird state. My guess is if I make the context a bit longer (right now it's 15 seconds) I may run into this problem much less. |
Looking into it a bit more, I am subscribing to a handful of characteristics per device. That's actually done before connecting to subsequent devices. Because of the async nature, I think they may be causing an issue with the following connect or re-connect attempts. Disabling all the subscriptions and then re-enabling after all devices have re-connected seems to be resolving at least part of the issue. |
Background:
I have written some code which connects to multiple BLE devices and subscribes to a handful characteristics. Should the connection break or fail at any time, the code periodically checks to see if that client has been disconnected and attempts to reconnect.
The Problem:
When connecting using the snippet below, the device occasionally locks up after the ble.Connect command. The context does not return no matter what. My delay is a 15 second duration.
What's important to mention here is that the lock only happens when I receive a "can't dial" error. That's usually the sign of death and it never returns after its next attempt.
It also seems to happen with more devices. With 3 devices it happens sparingly and with 6 devices it happens much more often.
Expected Behavior:
I expect the device to cleanly connect or disconnect and proceed to the next chunk of code. (or return and retry depending on what happens)
Observed Behavior:
Here's a log right before the lockup. Pay attention to where it says
Connecting...
It does not return after the last line. Other go routines though continue to run in the background in the same program.
Note:
can't discover profile: can't discover services: ATT request failed: input channel closed: io: read/write on closed pipe
Doesn't always happen every time. Sometimes, I'll go to connect, get thecan't dial
error and then it fails right there.The text was updated successfully, but these errors were encountered: