Bluetooth manager saves and restores stale adverts in bluetooth.remote_scanners
#130432
Comments
Hey there @bdraco, mind taking a look at this issue as it has been labeled with an integration (bluetooth)? (message by CodeOwnersMention)
Since monotonic time is only guaranteed to be the same per run, we have to convert to epoch when saving. It looks like we don't have good guards for the host system experiencing time travel (time suddenly moving forwards or suddenly moving backwards); however, it's unclear if that's the issue until I see the data. You can share the data to my Dropbox [email protected] or my gdrive [email protected]
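To illustrate the conversion described above, here is a minimal sketch, not the actual Home Assistant or bluetooth-adapters code. The function names and parameters are hypothetical; the only assumption is that an advert's age is preserved across a save/restore by anchoring it to the current monotonic and wall-clock readings.

```python
import time
from typing import Optional


def monotonic_to_epoch(
    mono_ts: float,
    now_mono: Optional[float] = None,
    now_epoch: Optional[float] = None,
) -> float:
    """Convert a monotonic timestamp to epoch seconds (hypothetical helper).

    The advert's age is measured on the monotonic clock and then
    subtracted from the current wall-clock time.
    """
    now_mono = time.monotonic() if now_mono is None else now_mono
    now_epoch = time.time() if now_epoch is None else now_epoch
    age = now_mono - mono_ts
    return now_epoch - age


def epoch_to_monotonic(
    epoch_ts: float,
    now_mono: Optional[float] = None,
    now_epoch: Optional[float] = None,
) -> float:
    """Convert a saved epoch timestamp back to this run's monotonic clock.

    If the wall clock has jumped backwards since the save, the computed
    age is negative and the result lies in the "future" of the monotonic
    clock, which is the time-travel case that needs a guard.
    """
    now_mono = time.monotonic() if now_mono is None else now_mono
    now_epoch = time.time() if now_epoch is None else now_epoch
    age = now_epoch - epoch_ts
    return now_mono - age
```

With the clock readings passed in explicitly, a wall clock that has moved back by 1000 seconds makes a restored advert appear 1000 seconds newer than "now" on the monotonic clock, so an age-based expiry check would not remove it until that much time has passed.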
I'm also bdraco on discord if that works better but I may be slow to respond there as it's a busy day for me |
Great, I have shared it to your dropbox (my account there is [email protected]). I also mentioned, in a comment on the file, the integration I think might be responsible.
I'll try to look at it soon, unfortunately today is also turning out to be another very busy day |
Was hoping to be able to investigate this week but sadly I'm going to run out of time since I'm working on the aiohttp 3.11 release this week. Hopefully I'll be able to give this a look next week. |
The rogue scanner is storing the timestamps in milliseconds instead of seconds, so they never expire.
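A small sketch of why a millisecond timestamp never expires under an age-based check. The expiry function, the 900-second maximum age, and the plausibility threshold are all assumptions made for illustration; they are not taken from the Home Assistant code.

```python
# Epoch seconds stay below 1e11 until roughly the year 5138, while
# present-day epoch milliseconds are around 1.7e12, so this threshold
# (an assumption for this sketch) separates the two units.
MAX_PLAUSIBLE_EPOCH_SECONDS = 1e11


def is_expired(advert_ts: float, now: float, max_age: float) -> bool:
    """Age-based expiry: the advert is stale once it is older than max_age."""
    return now - advert_ts > max_age


def looks_like_milliseconds(ts: float) -> bool:
    """Heuristic guard: flag timestamps too large to be epoch seconds."""
    return ts > MAX_PLAUSIBLE_EPOCH_SECONDS


now = 1_731_400_000.0                    # an epoch time in November 2024
hour_old_seconds = now - 3600            # correct unit
hour_old_millis = (now - 3600) * 1000    # same instant, wrong unit

# The correctly scaled advert is an hour old and expires.
expired_ok = is_expired(hour_old_seconds, now, max_age=900.0)    # True
# The millisecond value looks far in the future, so its "age" is
# hugely negative and the check never fires.
expired_bad = is_expired(hour_old_millis, now, max_age=900.0)    # False
```

A guard like `looks_like_milliseconds` could reject or rescale such values on ingest, though whether that belongs in the manager or in the offending integration is a separate design question.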
@agittins Please name it here so it can get fixed. |
Bluetooth-Devices/bluetooth-adapters#177 will prevent restoring them, so at least they will stop building up. It won't handle the case where the integration is injecting incorrect timestamps at run time, though: those will never expire and will pollute the whole stack.
Fantastic, thanks!
The custom component is https://github.com/kvj/hass_Bluetooth_Proxy which receives adverts from an Android companion app, https://github.com/kvj/hass_Bluetooth_Proxy_Companion . I've raised an issue there at kvj/hass_Bluetooth_Proxy#3.
I'm tempted to include a repair option in Bermuda for systems that have an unexpectedly large number of adverts, or adverts with impossible timestamps. I'm just not sure if it's a good thing to do or not. I guess my options are:
It feels a bit dirty to be messing with HA's internal storage like that though, but it also feels dirty to leave a system inoperable (either in part or in full) when a fix is possible. Obviously this isn't something that HA could "bless" in any way, but I'd be interested in your personal opinion on the options and whether this would be an egregious over-step or not. In some ways it feels safer to rename the file directly via the integration than it does to talk a user through doing it via ssh, so 🤷🏼♀️ |
Until the integration is fixed, it's going to keep polluting things at runtime anyway, and leaking memory, so I'd probably just wait for it to be fixed. Additionally, every restart will clean it up now, so it's probably not worth worrying about, as the above is probably worse.
Thanks for your thoughts. I think that "user installs my integration, system locks up" is actually worse than all the above! 😅 But yeah, I think I'll just let it "go away" when your fix gets to release. |
The problem
I maintain Bermuda and I've had a few cases where folks have had enormous `/config/.storage/bluetooth.remote_scanners` files (9MB or so), containing several months' worth of advertisement data from a non-core integration, even after that integration has been uninstalled.

I haven't yet been able to confirm which integration is causing the issue, but I do have a lead I haven't yet tested, so I don't feel comfortable publicly naming it in case it's not the one at fault.
The issue for me is that Bermuda tries to process all known device advertisements, but when it hits this (in-memory) cache of over 10,000 adverts, systems bog down and become non-responsive. To be clear - this is my own fault because I am accessing private members of the bluetooth.manager class rather than a proper API, so naturally I get to keep the smouldering consequences of my own questionable choices. I'll be changing Bermuda to harden itself against this issue in either case.
The issue for HA is that some systems are carrying this extra in-memory cache of very stale adverts, presumably indefinitely. I don't know if it only applies to adverts gathered between a certain range of HA versions, or if it's specific to a particular custom integration's behaviour.
Manually solving the "issue" involves shutting down HA, renaming/removing the `bluetooth.remote_scanners` file and starting HA again - which is not ideal for folks not familiar with ssh etc.

I think the expire adverts functions are not working in these cases because the adverts belong to a no-longer-present scanner, so they are sitting in the manager's `_all_histories` but not assigned to any extant scanner. I think this is also why the data persists: HA might be modifying the saved data only for present scanners and leaving the not-present scanner's cached data intact, but I haven't dug too deeply into bluetooth_adapters/storage.py to be sure.

I have an example file of about 9MB. As it comes from a user I'd rather not share it publicly, but I have their permission to email it or provide a link via DM to my nextcloud. I'm on the discord so can be contacted there, or my email is [email protected] (I gave up on my own privacy decades ago!)
Some redacted excerpts of the `bluetooth.remote_scanners` file:

Timestamps:
Adverts:
Running the file through `jq '.data."52e3fd636f144605f7665a0ae49aca33".discovered_device_advertisement_datas.[].device.address'` gives 12585 lines.

What version of Home Assistant Core has the issue?
core-2024.11.1
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant Container
Integration causing the issue
bluetooth
Link to integration documentation on our website
https://www.home-assistant.io/integrations/bluetooth/
Diagnostics information
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
Additional information
Note the bluetooth integration taking 2.76 seconds to load. This is on my dev machine with SSD, 32GB RAM, running 2024.11.0 in a devcontainer in vscode.
My apologies for not being able to provide complete diagnostics due to it being client data (and bluetooth diags not being redacted for MACs), but happy to provide them privately.
I have experimented with a "repair" locally. It does clean the in-memory cache by identifying any unclaimed scanner's adverts, but it doesn't affect the on-disk cache after a restart. It is also probably not suitable for general use, since a user might reboot multiple times while a scanner is not present but still want the history restored later. The timestamps from the sample data appear to be unix epoch timestamps (vs monotonic).
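For anyone wanting to inspect their own file, the jq count quoted earlier can be reproduced for every scanner at once with a short Python sketch. It assumes only the layout implied by that jq path (a top-level `data` mapping keyed by scanner id, each with a `discovered_device_advertisement_datas` collection); the function names and the notion of a set of "active" scanner ids are hypothetical, and this is not the repair code itself.

```python
import json
from pathlib import Path
from typing import Any, Dict, Set


def load_storage(path: Path) -> Dict[str, Any]:
    """Read a copy of the bluetooth.remote_scanners JSON file."""
    return json.loads(path.read_text(encoding="utf-8"))


def count_adverts_per_scanner(storage: Dict[str, Any]) -> Dict[str, int]:
    """Count stored adverts under each scanner id.

    len() is used so the count is the same whether the collection is a
    JSON object or an array, matching what jq's `.[]` would iterate.
    """
    counts: Dict[str, int] = {}
    for scanner_id, scanner in storage.get("data", {}).items():
        adverts = scanner.get("discovered_device_advertisement_datas", {})
        counts[scanner_id] = len(adverts)
    return counts


def unclaimed_scanners(
    counts: Dict[str, int], active_ids: Set[str]
) -> Dict[str, int]:
    """Return advert counts for scanner ids not in the caller's active set."""
    return {sid: n for sid, n in counts.items() if sid not in active_ids}
```

Running `count_adverts_per_scanner(load_storage(Path("bluetooth.remote_scanners")))` against a copy of the file (not the live one) should show which scanner id is holding the bulk of the adverts.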