Move hub Bluetooth can disconnect randomly while running a program #320

Closed
dlech opened this issue May 1, 2021 · 19 comments
Labels
bug Something isn't working hub: movehub Issues related to the LEGO BOOST Move hub software: pybricks-micropython Issues with Pybricks MicroPython firmware (or EV3 runtime) topic: bluetooth Issues involving bluetooth

Comments

@dlech
Member

dlech commented May 1, 2021

As reported in #304 (comment), the Move hub Bluetooth seems to lock up occasionally somewhat similar to #306.

Symptoms: Bluetooth disconnects while a program is running. Pressing the hub button still stops the program, and the light then blinks (Bluetooth advertising) because the hub is now disconnected. Reconnection may be affected by #325, but that is a separate issue.

@dlech dlech added bug Something isn't working hub: movehub Issues related to the LEGO BOOST Move hub software: pybricks-micropython Issues with Pybricks MicroPython firmware (or EV3 runtime) labels May 1, 2021
@BertLindeman

I assume this item to be a better one than #304 for this post

@dlech would you share the program you run (on the movehub?) for this issue?
So we can run identical programs.

On my movehub I tried the build 1020 and 1022.
I did not show the watchdog info though.
Both "freeze" like this:

  • printing stops
  • the sensor stops blinking
  • no change in hub light animation; like "program running" (blue fade-out/fade-in)
  • Bluetooth drops on the code website after timeout? About 4 minutes?
  • sensor starts blinking again

The last time it froze at iteration 356

My current program on the movehub:

#
# source Z:\home\bert\py\pybricks\issue\issue_#304_movehub_randint.py
#
from pybricks import version
print(version)

from pybricks.pupdevices import ColorSensor
from pybricks.parameters import Port
# from urandom import randint
from pybricks.hubs import MoveHub

hub = MoveHub()

_rand = hub.battery.voltage() + hub.battery.current()  # seed
print("Battery:", str(hub.battery.voltage()) + "mV", str(hub.battery.current()) + "mA at start")


# Return a random integer N such that a <= N <= b.
def randint(a, b):
    global _rand
    _rand = 75 * _rand % 65537  # Lehmer
    return _rand * (b - a + 1) // 65537 + a

lights = ColorSensor(Port.D).lights  # MoveHub has a fixed motor on Port.B

count = 0
print("") # start at a new line
while True:
    count += 1
    print("Start iteration", count, end="")
    for i in range(50):
        lights.on([randint(0, 100) for j in range(3)])
        print(".", end="")
    print("; ended")
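
For readers who want to check the `randint` helper above without a hub, here is a minimal sketch that runs the same Lehmer step on desktop Python. The seed value `12345` is an assumption made for illustration only; the program above seeds from the battery readings instead.

```python
# Desktop check of the Lehmer-based randint from the program above.
# The seed is a hypothetical fixed value; on the hub it comes from
# hub.battery.voltage() + hub.battery.current().
_rand = 12345


def randint(a, b):
    """Return a pseudo-random integer N such that a <= N <= b."""
    global _rand
    _rand = 75 * _rand % 65537  # Lehmer step, same constants as above
    return _rand * (b - a + 1) // 65537 + a


# Draw more values than the generator has states, so every state is visited.
values = [randint(0, 100) for _ in range(100000)]
print(min(values), max(values))
```

Because the state stays in the range 1 to 65536 as long as the seed is not a multiple of 65537, the scaled result cannot leave the requested interval.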

@dlech
Member Author

dlech commented May 1, 2021

would you share the program you run (on the movehub?) for this issue?

I would, but it is essentially already the same as the one you have posted. 😉

  • Bluetooth drops on the code website after timeout? About 4 minutes?

Interesting, I never waited this long. So it seems like Bluetooth isn't completely stuck then.

@BertLindeman

Interesting, I never waited this long. So it seems like Bluetooth isn't completely stuck then.

A retired systems programmer nowadays has the opportunity to wait some time 😄 .

Do you know if there is a 4-minute or maybe 250-second timeout in the Bluetooth stack?

@dlech
Member Author

dlech commented May 1, 2021

Do you know if there is a 4-minute or maybe 250-second timeout in the Bluetooth stack?

Probably. We don't have any timeouts like that in our code that I recall.

@dlech
Member Author

dlech commented May 1, 2021

I've had this running for hours today and haven't had the Bluetooth quit yet. Not sure what could be different between yesterday and today. The fact that Bert was able to reproduce the problem shows there is still some problem here. But I'm stuck until we can find a more consistent way to reproduce the problem.

@BertLindeman

I run most of the tests on Win10 with Microsoft Edge, Version 90.0.818.49 (Official build) (64-bit).
Tomorrow I will try the same hub / color sensor on Linux Mint 20.1.

@BertLindeman

Intermediate report. The test on Linux Mint is now past iteration 18,000.
I will let it run overnight and report further tomorrow.

@BertLindeman

The first attempt on Linux Mint with Google Chrome ended at iteration 134,915,
probably due to low battery power.

So a second attempt with freshly recharged batteries:

  • printing stops at iteration 18,235
  • color-sensor lights: no more animation, but steady light
  • Bluetooth keeps the connection
  • hub light: blue fade-in / fade-out, so running a program

All of this was seen at 9:37.
I waited until 11:25, but no automatic Bluetooth disconnect occurred.
I clicked the Bluetooth disconnect button and the connection did drop.
After that, the animation of the color sensor lights continued.

I had enough time to wait for a Bluetooth disconnect, as I am building the Landrover next to the Linux PC.

No idea how to qualify this behavior.

Side remark: on rare occasions, a printed line is missing a dot.
This happens on both Windows and Linux.

@BertLindeman

BertLindeman commented May 3, 2021

@dlech would a hang in pybricks-micropython/tests/basic/count_forever.py
on the move hub be correctly reported here? And maybe help?
The printing stops rather soon: count: 148

Environment:
Win10
Microsoft Edge Version 90.0.818.51
('movehub', '3.0.0b6', '7b3d1b42 on 2021-05-01')

As seen before, but really clear here, there is data missing from the print.
Showing the complete print - see missing data at/near 32, 33, 69, 70

count: 0
count: 1
count: 2
count: 3
count: 4
count: 5
count: 6
count: 7
count: 8
count: 9
count: 10
count: 11
count: 12
count: 13
count: 14
count: 15
count: 16
count: 17
count: 18
count: 19
count: 20
count: 21
count: 22
count: 23
count: 24
count: 25
count: 26
count: 27
count: 28
count: 29
count: 30
count: 31
count: 3 34
count: 35
count: 36
count: 37
count: 38
count: 39
count: 40
count: 41
count: 42
count: 43
count: 44
count: 45
count: 46
count: 47
count: 48
count: 49
count: 50
count: 51
count: 52
count: 53
count: 54
count: 55
count: 56
count: 57
count: 58
count: 59
count: 60
count: 61
count: 62
count: 63
count: 64
count: 65
count: 66
count: 67
count: 68
c
 count: 71
count: 72
count: 73
count: 74
count: 75
count: 76
count: 77
count: 78
count: 79
count: 80
count: 81
count: 82
count: 83
count: 84
count: 85
count: 86
count: 87
count: 88
count: 89
count: 90
count: 91
count: 92
count: 93
count: 94
count: 95
count: 96
count: 97
count: 98
count: 99
count: 100
count: 101
count: 102
count: 103
count: 104

count 104 is the last printed data this time.

[EDIT] On Linux Mint printing stops at about 4500 and Bluetooth connection dropped at that moment.
(Same program, same hub)
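
A log like the one above can be checked for gaps automatically. The following is a minimal sketch, assuming the output was saved as plain text with one `count: N` entry per line; the function name `find_gaps` is hypothetical and not part of Pybricks.

```python
import re

# A well-formed line looks exactly like "count: 123".
COUNT_LINE = re.compile(r"count: (\d+)")


def find_gaps(log_text):
    """Return (missing_numbers, garbled_lines) for a count_forever-style log."""
    seen = []
    garbled = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = COUNT_LINE.fullmatch(line)
        if match:
            seen.append(int(match.group(1)))
        else:
            # Partial lines such as "count: 3 34" or "c" end up here.
            garbled.append(line)
    if not seen:
        return [], garbled
    present = set(seen)
    missing = [n for n in range(min(seen), max(seen) + 1) if n not in present]
    return missing, garbled


sample = "count: 30\ncount: 31\ncount: 3 34\ncount: 35\n"
missing, garbled = find_gaps(sample)
print("missing:", missing)
print("garbled:", garbled)
```

Note that a number is reported as missing whenever no clean line for it exists, so a partly received entry shows up in both lists.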

@dlech
Member Author

dlech commented May 3, 2021

If it just causes the Bluetooth to hang and the hub does not reboot itself, then yes, this is the right place.

The missing data is most likely caused by losses beyond our firmware. We are using GATT "notifications" which make no guarantee that the data is actually received on the other end (related: #274). We would need a Bluetooth sniffer to tell for sure though.
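
To illustrate why unacknowledged notifications can leave holes in printed output, here is a small simulation in plain Python. It is only a sketch under stated assumptions: the 20-byte payload size (the default ATT MTU of 23 bytes minus a 3-byte header) and the dropped packet index are chosen for illustration, and nothing here reflects the actual firmware code.

```python
# Simulation of a byte stream sent as fixed-size notifications where one
# packet never arrives. The sender is not told, so nothing is re-sent.
PAYLOAD_SIZE = 20  # assumed payload size per notification


def split_into_packets(data, size=PAYLOAD_SIZE):
    """Split a byte string into consecutive packets of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def receive(packets, dropped):
    """Join all packets except those whose index is in `dropped`."""
    return b"".join(p for i, p in enumerate(packets) if i not in dropped)


text = "".join(f"count: {n}\n" for n in range(30, 37)).encode()
packets = split_into_packets(text)
received = receive(packets, dropped={1})  # hypothetical: packet 1 is lost

print(received.decode())
```

In this run the lost packet happens to line up with whole lines, so entries simply vanish. When packet boundaries fall inside a line, the receiver would see fragments joined together instead.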

@dlech
Member Author

dlech commented May 3, 2021

Is the Linux test running in a virtual machine on the same hardware as Windows?

@BertLindeman

We would need a Bluetooth sniffer to tell for sure though.

Do you mean like in

Could you do a Bluetooth capture when this happens?
bleak capture

The run on Windows is relatively short.

To be sure I will reboot windows and re-try first.

@dlech
Member Author

dlech commented May 3, 2021

Do you mean like in

Could you do a Bluetooth capture when this happens?
bleak capture

In this case, no. It could be that the hub is sending the data over the air, but Windows is not able to receive it due to background noise or something like that.

@dlech
Member Author

dlech commented May 3, 2021

It would be good to start a new issue for the data loss. I am curious to know how often it affects other people. I haven't really seen it myself.

@BertLindeman

Is the Linux test running in a virtual machine on the same hardware as Windows?

No. Linux Mint runs on a separate PC (my LEGO PC, used to show building instructions and such).

It would be good to start a new issue for the data loss. I am curious to know how often it affects other people. I haven't really seen it myself.

Will do.

@dlech
Member Author

dlech commented May 3, 2021

This could be my imagination or a coincidence, but it seems that more RF interference helps to reproduce this problem. I was having trouble reproducing it, but enabling advertising on 3 other hubs while the Move hub is connected seems to significantly increase the likelihood of Bluetooth stopping on the Move hub.

Also, as Bert has already observed, another distinguishing feature of this bug compared to #306 is that Bluetooth disconnects rather than just "locking up".

It is possible to press the button on the hub to stop the program and go back to advertising. Pybricks Code is able to reconnect and read the device info characteristics, but the UART characteristics no longer seem usable until the Move hub is rebooted (Pybricks Code says "timeout waiting for checksum" in the console log).

@dlech dlech changed the title Move hub Bluetooth can stop working after some time Move hub Bluetooth can disconnect randomly while running a program May 3, 2021
@dlech
Member Author

dlech commented May 3, 2021

I also noticed that after this disconnection and reconnection, the Bluetooth address is sometimes set to FF:FF:FF:FF:FF:FF, so it seems that the Bluetooth chip is not being fully reset as it should be, maybe?

After reconnection, the status notifications from the Pybricks GATT characteristic are received but the UART characteristics don't seem to work.

After reconnection, pressing the hub button to start the in-flash-memory program works once. But after stopping it, it won't start again.

@dlech dlech added the topic: bluetooth Issues involving bluetooth label May 3, 2021
@dlech
Member Author

dlech commented May 3, 2021

The reconnection issue seems to be independent of the random disconnection issue, so I have opened #325 to track the reconnection issue.

@dlech
Member Author

dlech commented Mar 23, 2023

Since pybricks/pybricks-micropython@efda5d0 I am no longer able to reproduce the issue even with 4 other hubs advertising, so hopefully that means this issue has been resolved.

@dlech dlech closed this as completed Mar 23, 2023

2 participants