HASS Add-on broken on HASSOS 9.0 #1942

Closed
sephiroth1395 opened this issue Sep 15, 2022 · 51 comments
Labels
🐛 bug-report: Something isn't working
platform/ha-addon: Home Assistant Add-on platform
👍 important: This is an important issue/ticket with high priority

Comments

@sephiroth1395

Describe the issue you are experiencing

Since the upgrade to HASSOS 9.0, the add-on cannot start. I suspect this is due to the docker migration to cgroups2.

Error message from the Supervisor:

22-09-15 13:34:43 ERROR (MainThread) [supervisor.docker.addon] Can't set cgroup permission on the host for addon_de838cd8_raspberrymatic
22-09-15 13:34:43 ERROR (MainThread) [asyncio] Task exception was never retrieved
future: <Task finished name='Task-14879' coro=<DockerAddon._hardware_events() done, defined at /usr/src/supervisor/supervisor/jobs/decorator.py:85> exception=DockerError("Can't set cgroup permission on the host for addon_de838cd8_raspberrymatic")>
Traceback (most recent call last):
  File "/usr/src/supervisor/supervisor/docker/addon.py", line 720, in _hardware_events
    await self.sys_dbus.agent.cgroup.add_devices_allowed(
  File "/usr/src/supervisor/supervisor/dbus/agent/cgroup.py", line 19, in add_devices_allowed
    await self.dbus.CGroup.AddDevicesAllowed(container_id, permission)
  File "/usr/src/supervisor/supervisor/utils/dbus.py", line 174, in call_dbus
    raise DBusFatalError(reply.body[0])
supervisor.exceptions.DBusFatalError: Error calling runc for '87542225ef424bd38c4d49ff3db5a8c46b277fdee1c3efd1a6aee5edd57d9d80': exit status 1, output time="2022-09-15T11:34:43Z" level=warning msg="could not find device group for '99/204' in /proc/devices -- temporarily ignoring rule: {99 204 -1 rwm true}"
time="2022-09-15T11:34:43Z" level=info msg="found more than one filter (2) attached to a cgroup -- removing extra filters!"
time="2022-09-15T11:34:43Z" level=info msg="removing old filter 0 from cgroup" id=139 name= run_count=0 runtime=0s tag=f231b56d360e591c type=CGroupDevice
time="2022-09-15T11:34:43Z" level=info msg="removing old filter 1 from cgroup" id=140 name= run_count=0 runtime=0s tag=be0b8d7ca6afd4d8 type=CGroupDevice
time="2022-09-15T11:34:43Z" level=warning msg="could not find device group for '99/204' in /proc/devices -- temporarily ignoring rule: {99 204 -1 rwm true}"
time="2022-09-15T11:34:43Z" level=info msg="found more than one filter (2) attached to a cgroup -- removing extra filters!"
time="2022-09-15T11:34:43Z" level=info msg="removing old filter 0 from cgroup" id=144 name= run_count=0 runtime=0s tag=839fa1b30f0f67c3 type=CGroupDevice
time="2022-09-15T11:34:43Z" level=info msg="removing old filter 1 from cgroup" id=147 name= run_count=0 runtime=0s tag=be0b8d7ca6afd4d8 type=CGroupDevice
time="2022-09-15T11:34:43Z" level=error msg="failed to call BPF_PROG_DETACH (BPF_CGROUP_DEVICE) on old filter program: can't detach program: no such file or directory"
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/src/supervisor/supervisor/jobs/decorator.py", line 141, in wrapper
    raise err
  File "/usr/src/supervisor/supervisor/jobs/decorator.py", line 139, in wrapper
    return await self._method(*args, **kwargs)
  File "/usr/src/supervisor/supervisor/docker/addon.py", line 727, in _hardware_events
    raise DockerError(
supervisor.exceptions.DockerError: Can't set cgroup permission on the host for addon_de838cd8_raspberrymatic
22-09-15 13:34:43 ERROR (MainThread) [supervisor.docker.addon] Can't set cgroup permission on the host for addon_de838cd8_raspberrymatic

Describe the behavior you expected

The add-on successfully starts.

Steps to reproduce the issue

  1. Upgrade to HASSOS 9.0
  2. Try to start the add-on

What is the version this bug report is based on?

3.65.8.20220831

Which base platform are you running?

ha-addon (HomeAssistant Add-on)

Which HomeMatic/homematicIP radio module are you using?

HmIP-RFUSB

Anything in the logs that might be useful for us?

No.

Additional information

I'm using the ELV USB stick.

sephiroth1395 added the 🐛 bug-report label Sep 15, 2022
jens-maus added the platform/ha-addon label Sep 15, 2022
@jens-maus
Owner

@agners @pvizeli Does anything ring a bell when you read through this? Since cgroup v2 support was added in HomeAssistantOS 9.0, users are reporting that the "RaspberryMatic CCU" add-on no longer works... Is there something the add-on has to do differently now compared to v8?

@sephiroth1395
Author

One thing I forgot to add: the situation prevents rfd from starting.

@H3xF2x

H3xF2x commented Sep 15, 2022

I accidentally hit the upgrade button - never install an x.0 version on a production system :-(

Here the update prevents the hmip from starting:

Setting LAN Gateway keys: OK
Starting hs485d: OK
Starting multimacd: .OK
Starting rfd: .OK
Starting HMIPServer: .......................................................................................................................................................ERROR
Starting ReGaHss: .OK

@ozwo71

ozwo71 commented Sep 15, 2022

...same for me but with RPI-RF-MOD:

Starting multimacd: .OK
Starting rfd: ....................ERROR
Starting HMIPServer: ..............OK

@jens-maus
Owner

Obviously (as the error message states), the cgroup settings/permissions cannot be set up correctly with HomeAssistantOS 9.0 at the moment. Thus, anyone using the "RaspberryMatic CCU" Add-on in production should stay with HomeAssistantOS v8 for the time being.

@H3xF2x

H3xF2x commented Sep 15, 2022

Downgrade to OS 8.5 solved the bug.
FYI: ssh to port 22222 and execute
# ha os update --version 8.5

@agners

agners commented Sep 15, 2022

Hm, this seems to be the relevant error:

supervisor.exceptions.DBusFatalError: Error calling runc for '87542225ef424bd38c4d49ff3db5a8c46b277fdee1c3efd1a6aee5edd57d9d80': exit status 1, output time="2022-09-15T11:34:43Z" level=warning msg="could not find device group for '99/204' in /proc/devices -- temporarily ignoring rule: {99 204 -1 rwm true}"

This is definitely caused by the move to CGroupsV2 (home-assistant/operating-system#1329).

It seems that device permissions set via runc has a problem with that particular device: could not find device group for '99/204' in /proc/devices -- temporarily ignoring rule: {99 204 -1 rwm true}
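To make the rejected rule easier to read, here is a minimal Python sketch that decodes the `{99 204 -1 rwm true}` part of the runc warning. It assumes the five fields are, in order, device type (as a character code), major, minor, permissions and allow flag, and that `-1` means "any"; that reading is an assumption based on the printed output, and the names `DeviceRule` and `parse_rule` are made up for this illustration.

```python
import re
from typing import NamedTuple, Optional


class DeviceRule(NamedTuple):
    """Hypothetical container for one device rule printed by runc."""
    dev_type: str          # e.g. 'c' for character device (assumed meaning)
    major: Optional[int]   # None stands for the assumed wildcard value -1
    minor: Optional[int]
    permissions: str       # subset of "rwm"
    allow: bool


# Matches the "{99 204 -1 rwm true}" fragment from the warning line.
RULE_RE = re.compile(r"\{(\d+) (-?\d+) (-?\d+) ([rwm]+) (true|false)\}")


def _number_or_wildcard(value: str) -> Optional[int]:
    number = int(value)
    return None if number == -1 else number


def parse_rule(message: str) -> DeviceRule:
    """Extract and decode the first device rule found in a log message."""
    match = RULE_RE.search(message)
    if match is None:
        raise ValueError("no device rule found in message")
    type_code, major, minor, perms, allow = match.groups()
    return DeviceRule(
        dev_type=chr(int(type_code)),  # 99 is the character code of 'c'
        major=_number_or_wildcard(major),
        minor=_number_or_wildcard(minor),
        permissions=perms,
        allow=(allow == "true"),
    )


warning = ("could not find device group for '99/204' in /proc/devices "
           "-- temporarily ignoring rule: {99 204 -1 rwm true}")
print(parse_rule(warning))
```

Under those assumptions the rule reads as "allow rwm on every character device with major 204", which is the entry runc reports it could not find in /proc/devices.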

I'm using the ELV USB stick.

Is that the HM-CFG-USB-2? I do have such a stick somewhere, but can't find it right now 😢

I accidentally hit the upgrade button - never install an x.0 version on a production system :-(

Ideally, run a staging system and install beta/rc's there... Then .0 are painless 😉

@jens-maus
Owner

Hm, this seems to be the relevant error:

supervisor.exceptions.DBusFatalError: Error calling runc for '87542225ef424bd38c4d49ff3db5a8c46b277fdee1c3efd1a6aee5edd57d9d80': exit status 1, output time="2022-09-15T11:34:43Z" level=warning msg="could not find device group for '99/204' in /proc/devices -- temporarily ignoring rule: {99 204 -1 rwm true}"

This is definitely caused by the move to CGroupsV2 (home-assistant/operating-system#1329).

It seems that device permissions set via runc has a problem with that particular device: could not find device group for '99/204' in /proc/devices -- temporarily ignoring rule: {99 204 -1 rwm true}

I'm using the ELV USB stick.

Is that the HM-CFG-USB-2? I do have such a stick somewhere, but can't find it right now 😢

Which info do you need from these Homematic USB devices to be used by RaspberryMatic? And please note, it seems that even users with a RPI-RF-MOD seem to have issues since 9.0

@ozwo71

ozwo71 commented Sep 15, 2022

And please note, it seems that even users with a RPI-RF-MOD seem to have issues since 9.0

I can confirm this. After the downgrade to 8.5 everything is fine again.

@agners

agners commented Sep 15, 2022

And please note, it seems that even users with a RPI-RF-MOD seem to have issues since 9.0

Hm, are those using UART I guess? So enabling access to UART fails in this case?

I am not 100% sure what I need, I need to understand why my test case (deCONZ and a couple of USB sticks) works while this one does not. It seems that runc rejects that device on this line: https://github.com/opencontainers/runc/blob/v1.1.4/libcontainer/cgroups/systemd/common.go#L279

@H3xF2x

H3xF2x commented Sep 15, 2022

And please note, it seems that even users with a RPI-RF-MOD seem to have issues since 9.0

Well, I'm using a HM-MOD-RPI-PCB, seems every radio is affected.

Ideally, run a staging system and install beta/rc's there... Then .0 are painless

Oh yes - it wasn't my intention to debug on the production system ...

@jens-maus
Owner

And please note, it seems that even users with a RPI-RF-MOD seem to have issues since 9.0

Well, I'm using a HM-MOD-RPI-PCB, seems every radio is affected.

How do you guys have your HM-MOD-RPI-PCB or RPI-RF-MOD connected? Via GPIO or via a HB-RF-USB adapter?

@H3xF2x

H3xF2x commented Sep 15, 2022

Via GPIO (CM4 via mainboard)

@jens-maus
Owner

And please note, it seems that even users with a RPI-RF-MOD seem to have issues since 9.0

Hm, are those using UART I guess? So enabling access to UART fails in this case?

I am not 100% sure what I need, I need to understand why my test case (deCONZ and a couple of USB sticks) works while this one does not. It seems that runc rejects that device on this line: https://github.com/opencontainers/runc/blob/v1.1.4/libcontainer/cgroups/systemd/common.go#L279

Well, these could easily be two independent issues. One affects USB-connected RF modules such as the HmIP-RFUSB or HM-CFG-USB-2, as used by the author of this ticket. The other could be, e.g., the eq3_char_loop kernel module being unable to dynamically create the /dev/mmd_bidcos and /dev/mmd_hmip device node pair at startup because of more restrictive permissions under cgroups v2.

@ozwo71

ozwo71 commented Sep 15, 2022

I'm using the RPI-RF-MOD via GPIO

@jens-maus
Owner

Via GPIO (CM4 via mainboard)

Do you see any similar permission or cgroupv2 related issues in the HomeAssistant logfiles? Because you only showed logs of the RaspberryMatic Addon startup, but not showing if HomeAssistantOS was logging some issues as well.

@H3xF2x

H3xF2x commented Sep 15, 2022

Do you see any similar permission or cgroupv2 ...

It seems the logs are gone after the downgrade and restore, sorry.

@ozwo71

ozwo71 commented Sep 15, 2022

...I will do the upgrade again...

@hpsteff

hpsteff commented Sep 15, 2022

Same effect on a Raspberry Pi 4 running HA OS 9 and the RaspberryMatic add-on. I have the RPI-RF-MOD directly on GPIO. With OS 9 it does not work.
-> Downgraded to HA OS 8.5 and it works again.

@ozwo71

ozwo71 commented Sep 15, 2022

home-assistant.log only shows:

home-assistant.log.1:2022-09-15 17:31:12.344 WARNING (MainThread) [hahomematic.central_unit] create_clients failed: Unable to create client for central [('Unable to authenticate (<ProtocolError for de838cd8-raspberrymatic:2001/RPC2: 401 Unauthorized>,).',)]. Check logs.

home-assistant.log.1:2022-09-15 17:31:13.581 WARNING (MainThread) [hahomematic.central_unit] create_clients failed: Unable to create client for central [('Unable to authenticate (<ProtocolError for de838cd8-raspberrymatic:2010/RPC2: 401 Unauthorized>,).',)]. 

home-assistant.log.1:2022-09-15 17:31:43.627 WARNING (MainThread) [hahomematic.central_unit] check_connection failed: No clients exist. Trying to create clients for server RaspberryMatic

Is there any special debug setting or another log that will help?

@ozwo71

ozwo71 commented Sep 15, 2022

...Interesting: I had the problem after upgrading to 9.0 and found this issue when neither restarting the add-on nor rebooting the raspbi helped.

Then downgraded again to 8.5 using
ha os update --version 8.5

To reproduce the error I just did
ha os update

Again I am on 9.0 and guess what: RaspberryMatic is running after rebooting...

@pvizeli

pvizeli commented Sep 15, 2022

Does a restart of the add-on fix the issue? (Just for debugging)

@agners

agners commented Sep 15, 2022

@sephiroth1395 do you happen to have access to the OS shell? If so, can you get the logs of journalctl -u haos-agent.service and a capture of /proc/devices when the device is plugged in?
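Once a /proc/devices capture is available, a small script can check whether the major number from the runc warning (204) is registered as a character device. The following is only a sketch: `parse_proc_devices` and `has_char_major` are hypothetical helper names, and `SAMPLE` is an invented excerpt used when /proc/devices is not readable, not real output from an affected system.

```python
from pathlib import Path


def parse_proc_devices(text: str) -> dict:
    """Split /proc/devices content into {"char": {major: [names]}, "block": {...}}."""
    sections = {"char": {}, "block": {}}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "Character devices:":
            current = "char"
        elif stripped == "Block devices:":
            current = "block"
        elif stripped and current is not None:
            # Each entry looks like "<major> <driver name>".
            major, _, name = stripped.partition(" ")
            sections[current].setdefault(int(major), []).append(name.strip())
    return sections


def has_char_major(text: str, major: int) -> bool:
    """Return True if the given major is listed under "Character devices:"."""
    return major in parse_proc_devices(text)["char"]


# Invented sample for illustration only.
SAMPLE = """Character devices:
  1 mem
  4 tty
188 ttyUSB

Block devices:
  8 sd
"""

if __name__ == "__main__":
    path = Path("/proc/devices")
    text = path.read_text() if path.exists() else SAMPLE
    print("char major 204 registered:", has_char_major(text, 204))
```

If the script reports that major 204 is not registered while the add-on requests access to it, that would match the "could not find device group" warning quoted earlier in the thread.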

@pvizeli

pvizeli commented Sep 15, 2022

Via GPIO (CM4 via mainboard)

Do you see any similar permission or cgroupv2 related issues in the HomeAssistant logfiles? Because you only showed logs of the RaspberryMatic Addon startup, but not showing if HomeAssistantOS was logging some issues as well.

image

The issue is that the driver does not seem to work correctly and somehow gets closed after it is created. I guess that has to be fixed in an upcoming release.

@andreas-bulling

same problem here. Downgrade to OS 8.5 helped

@pvizeli

pvizeli commented Sep 15, 2022

@jens-maus
image

@Nielsvo

Nielsvo commented Sep 15, 2022

I am using the HmIP-RFUSB USB stick on my Synology NAS (OVA). Had the same issue when upgrading to 9.0 and downgraded to 8.5 and then it worked again. But I can't seem to upgrade again to 9.0 now? No update available?

@JoMass

JoMass commented Sep 17, 2022

Since I have this problem, I searched hard how to go back to version 8.5, but could not find a suitable method.
Could someone give me a tip on how to reset HA OS to the previous version, please, thanks

@H3xF2x

H3xF2x commented Sep 17, 2022

@JoMass
Enable ssh on Hassos
ssh to port 22222 and execute
# ha os update --version 8.5

@JoMass

JoMass commented Sep 17, 2022

Great, worked perfectly; Thank you @H3xF2x

@jens-maus
Owner

jens-maus commented Sep 18, 2022

Please note that with the next nightly snapshot this issue should hopefully be fixed: an additional sleep 5 call has been introduced after multimacd startup. Also note that a newer Home Assistant Supervisor version (Supervisor 2022.09.1 or newer) seems to be required as well.
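For illustration only, a fixed delay after multimacd startup could also be expressed as a bounded wait for the device nodes to appear. The sketch below is not what the add-on does (the add-on uses a plain sleep 5 in its startup sequence); `wait_for_paths` is a hypothetical helper, and the device node names are taken from earlier comments in this thread.

```python
import os
import time


def wait_for_paths(paths, timeout=5.0, interval=0.1):
    """Poll until every path exists or the timeout expires.

    Returns the list of paths that are still missing (empty on success).
    """
    deadline = time.monotonic() + timeout
    missing = [p for p in paths if not os.path.exists(p)]
    while missing and time.monotonic() < deadline:
        time.sleep(interval)
        missing = [p for p in missing if not os.path.exists(p)]
    return missing


# Quick demonstration with a path that certainly exists.
print(wait_for_paths([os.getcwd()], timeout=0.5))
```

A caller would use something like `wait_for_paths(["/dev/mmd_bidcos", "/dev/mmd_hmip"], timeout=5.0)` and treat a non-empty result as a failed start. Compared with a fixed sleep, this returns as soon as the nodes exist and still gives up after the same upper bound.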

@opu1000

opu1000 commented Sep 28, 2022

Hello all, hello Jens, is there a date planned to publish the new version with the fix?
Regards OPU

@jens-maus
Owner

@JoMass

JoMass commented Oct 6, 2022

I'm sorry to report that the latest release still does not work with HA OS 9.0.
Downgrade to 8.5 and raspberrymatic runs as it should

Logfile
Mounting /data as /usr/local (Home Assistant Add-On): OK
Starting watchdog...
Identifying host system: oci, OK
Initializing RTC Clock: onboard, OK
Running sysctl: OK
Checking for Factory Reset: not required
Checking for Backup Restore: not required
Initializing System: OK
Setup ca-certificates: OK
Starting logging: OK
Init onboard LEDs: init, OK
Starting irqbalance: OK
Starting iptables: OK
Starting network: eth0: link up, fixed, firewall, inet up, 172.30.33.2, OK
Identifying Homematic RF-Hardware: ....HmRF: HMIP-RFUSB/eQ-3 [email protected], HmIP: HMIP-RFUSB/eQ-3 [email protected], OK
Updating Homematic RF-Hardware: HMIP-RFUSB: 4.4.18, not necessary, OK
Starting hs485dLoader: disabled
Starting xinetd: OK
Starting eq3configd: OK
Starting lighttpd: OK
Starting ser2net: disabled
Starting ssdpd: OK
Starting sshd: OK
Starting ha-proxy: OK
Starting NUT services: disabled
Initializing Third-Party Addons: OK
Starting LGWFirmwareUpdate: ...OK
Setting LAN Gateway keys: OK
Starting hs485d: disabled
Starting multimacd: ..............ERROR
Starting rfd: ERROR: /dev/mmd_bidcos missing, no BidCos-RF hardware found
Starting HMIPServer: ERROR: /dev/mmd_hmip missing
Starting ReGaHss: .OK
Starting CloudMatic: OK
Starting Third-Party Addons: OK
Starting crond: OK
Setup onboard LEDs: booted, OK
Finished Boot: 3.65.11.20221005 (raspmatic_oci_arm64)
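When comparing boot logs like the one above across OS versions, it can help to pull out only the failing steps. Here is a minimal sketch, assuming each step is logged as `<step>: <status>` on a single line and that a failure contains the word ERROR; `failed_steps` is a hypothetical helper, and `SAMPLE_LOG` is an excerpt of the log above.

```python
import re

# "<step>: <rest>", where the step name itself contains no colon.
STEP_RE = re.compile(r"^(?P<step>[^:]+): (?P<rest>.*)$")


def failed_steps(log_text: str) -> list:
    """Return the names of all boot steps whose status contains ERROR."""
    failures = []
    for line in log_text.splitlines():
        match = STEP_RE.match(line.strip())
        if match and "ERROR" in match.group("rest"):
            failures.append(match.group("step"))
    return failures


SAMPLE_LOG = """\
Starting hs485d: disabled
Starting multimacd: ..............ERROR
Starting rfd: ERROR: /dev/mmd_bidcos missing, no BidCos-RF hardware found
Starting HMIPServer: ERROR: /dev/mmd_hmip missing
Starting ReGaHss: .OK
"""

print(failed_steps(SAMPLE_LOG))
```

For this excerpt the failing steps are multimacd, rfd and HMIPServer, which fits the earlier suspicion that the /dev/mmd_bidcos and /dev/mmd_hmip nodes are not created.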

@jens-maus
Owner

I'm sorry to report that the latest release still does not work with HA OS 9.0.

What's the base hardware on which you have installed HomeAssistantOS?

@JoMass

JoMass commented Oct 7, 2022

Odroid N2+ 4GB 64GB-SSD

@jens-maus
Owner

jens-maus commented Oct 7, 2022

Odroid N2+ 4GB 64GB-SSD

OK, I already guessed that you have an ODROID. Then please look here: #1969

@JoMass

JoMass commented Oct 7, 2022

Thank you, Jens, for the quick response. I'll wait for the next HomeAssistantOS 9.x release.

@andreas-bulling

Does it work with HomeAssistant OS v9.2?

@ChristophCaina

@andreas-bulling yes, for me it is working, even though I still have the message in my notifications after the reboot.
But the add-on itself starts as expected.
image

I guess, with the new release of the Addon (recently) there is a delay in the startup sequence, and this is causing the log message.

@FlapFlup

Hello, in my system (Odroid with HmIP-RFUSB) the error no longer occurs under HomeAssistant OS v9.2.

@andreas-bulling

I can confirm that it also works for me with HomeAssistant OS 9.2

@jens-maus
Owner

Ok, I think that we have definitely enough evidence that the issue raised here is solved. So no reason to misuse this ticket system as a discussion or messaging platform.

Repository owner locked and limited conversation to collaborators Oct 12, 2022