-
Notifications
You must be signed in to change notification settings - Fork 83
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
featured: use run() to run cli commands in place of check_call() (#177)
Fixes: sonic-net/sonic-buildimage#20662 During some reboots, it was observed that some times featured.service script command fails to start the services like pmon, snmp, lldp etc. From logs, it was observed that 'sudo systemctl enable ' command failed with errorcode 13 (SIGPIPE. 2024 Oct 29 01:31:26.191236 aaa14-rp INFO featured: Running cmd: '['sudo', 'systemctl', 'unmask', 'pmon.service']' 2024 Oct 29 01:31:26.211167 aaa14-rp INFO systemd[1]: Reloading. 2024 Oct 29 01:31:27.212381 aaa14-rp INFO featured: Running cmd: '['sudo', 'systemctl', 'enable', 'pmon.service']' 2024 Oct 29 01:31:27.232428 aaa14-rp INFO systemd[1]: Reloading. 2024 Oct 29 01:31:28.135667 aaa14-rp ERR featured: ['sudo', 'systemctl', 'enable', 'pmon.service'] - failed: return code - -13, output:#012None 2024 Oct 29 01:31:28.135746 aaa14-rp ERR featured: Feature 'pmon.service' failed to be enabled and started 2024 Oct 29 01:34:08.661711 aaa14-rp INFO featured: Running cmd: '['sudo', 'systemctl', 'enable', 'gnmi.service']' 2024 Oct 29 01:34:08.677242 aaa14-rp INFO systemd[1]: Reloading. 2024 Oct 29 01:34:09.316554 aaa14-rp ERR featured: ['sudo', 'systemctl', 'enable', 'gnmi.service'] - failed: return code - -13, output:#012None 2024 Oct 29 01:34:09.316791 aaa14-rp ERR featured: Feature 'gnmi.service' failed to be enabled and started The issue does not recover and the pmon and other services never starts. On supervisor this also leads to swss, syncd and other related docker to stay down. In general systemctl enable does not work for some services like pmon, snmp, lldp etc as there is no WantBy directive set for these services in unit file. The command returns stderr : "The unit files have no installation config (WantedBy=, RequiredBy=, Also=, Alias= settings in the [Install] section, and DefaultInstance= for template units). This means they are not meant to be enabled using systemctl. Possible reasons for having this kind of units are: • A unit may be statically enabled by being symlinked from another unit's .wants/ or .requires/ directory. • A unit's purpose may be to act as a helper for some other unit which has a requirement dependency on it. • A unit may be started when needed via activation (socket, path, timer, D-Bus, udev, scripted systemctl call, ...). • In case of template units, the unit is meant to be enabled with some instance name specified. ” featured python script uses subprocess.check_call() function to invoke the command which looks like is not very reliable at handling the stderr and may cause SIGPIPE with big buffer data. Modifying the function to use subprocess.run() resolves this issue. run() is more reliable at handing the return data. Validated the change with multiple reboots.
- Loading branch information
Showing
3 changed files
with
192 additions
and
187 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.