Osquerybeat runs a couple of child processes, so the whole chain looks like this:
agent->osquerybeat->osqueryd->osquery-extension
On Windows, it looks like when osquerybeat is deleted/uninstalled, the osquerybeat process can be killed by the agent, leaving osqueryd.exe running as an orphan. The install directory then cannot be deleted, because on Windows a file that is in use cannot be removed.
The next time the agent tries to install osquerybeat, it skips the install step because the osquerybeat install directory already exists. The osquerybeat install ends up corrupted, and osquerybeat.exe can't be started because it doesn't exist on disk.
The osquerybeat implementation on Windows already kills the whole process tree when needed.
Maybe the agent should do something similar, which would help in cases where the agent kills only the intermediate child?
It seems there are a couple of things that could be done to improve the situation:
Better tracking of child processes and cleaner process tree kill.
Maybe some install state metadata on disk that would allow the product to be properly reinstalled even when the install directory was not fully cleaned up.
@aleksmaus This is closed by the Windows Job work, correct?
The child process kills on Windows should be handled now that #30254 is merged.
There is one more improvement we could make, as mentioned above: having the beats install code detect corrupt installs.
I added that to my TODO list: learn how it's done and see what we can do to improve it.
Before doing that, though, I'm looking at the slow agent shutdown issue, which could potentially help in cases where the system just kills the service process after a timeout.
So we can close this one and open another tracker for the install improvement, or keep it open and I will start looking at the install code as soon as I can.
We are seeing some cases in the field with osquerybeat where the install is corrupted on Windows.
https://discuss.elastic.co/t/osquery-manger-integration-wont-work-on-windows/295529/3
Code reference for the osquerybeat process-tree kill on Windows: beats/x-pack/osquerybeat/internal/osqd/osqueryd_windows.go, line 45 in d00c2fe.