-
Notifications
You must be signed in to change notification settings - Fork 153
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
elastic-agent installed as Fleet server remains stuck in uninstalling with retry errors for over 4-5 minutes. #5752
Comments
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane) |
@muskangulati-qasource Please review. |
Secondary review is Done for this ticket! |
@michel-laterman this is probably from the uninstall notification, at minimum there should be better error messages here if Fleet Server was legitimately unavailable. |
Hi Team, Observations:
Screen Recording: Amol.Self.WindowsBC.-.ec2-35-174-4-132.compute-1.amazonaws.com.-.Remote.Desktop.Connection.2024-10-24.11-22-29.mp4Amol.Self.WindowsBC.-.ec2-35-174-4-132.compute-1.amazonaws.com.-.Remote.Desktop.Connection.2024-10-24.12-00-38.mp4Build details: Hence, we are reopening this issue. Thanks!! |
For the failures you posted: Did you destroy the ES deployment before uninstalling the agent instance? If so the 1st instances failure messages are expected; it should have taken a shorter amount of time then 4m as originally reported. do you recall how long it took? Is the panic that occured in the 2nd scenario common? Do you have steps on how to recreate? I would like to recreate this with a binary where the symbols are still present because the failure when trying to marshall json seems strange to me. |
Thank you for looking into this issue.
Please find below steps for this:
The issue is reproducible for 8.17.0-SNAPSHOT agent. Screen Recording: Amol.Windows.2025.-.ec2-18-212-146-218.compute-1.amazonaws.com.-.Remote.Desktop.Connection.2024-11-19.15-32-00.mp4
This issue is inconsistently observed when VM is shut down for a long time and then agent is uninstalled. Build details: Please let us know if anything else is required from our end. |
panics are being tracked in their own issue: #5952 |
This issue occurs if fleet-server is running locally. The ways we can fix it are either:
|
I would go with this except I would change the logic of performing it before only in the case that this Elastic Agent is running the Fleet Server. That means the rare case that it notifies before uninstall actually happens only occurs on hosts that are running on Fleet Server. This reduces the likely hood of a failure after notify to only occur on hosts that are running a Fleet Server which is few. |
Hi Team, We have revalidated this issue on latest 8.18.0 SNAPSHOT and found it fixed now. Observations:
Build details: Hence, we are marking this issue as QA:Validated. Thanks!! |
Kibana Build details:
Artifact: https://snapshots.elastic.co/8.16.0-106cdbc2/downloads/beats/elastic-agent/elastic-agent-8.16.0-SNAPSHOT-windows-x86_64.zip
Host: Windows Server 2022- Test Signing ON
Preconditions:
Steps to reproduce:
.\Elastic\Agent\elastic-agent.exe uninstall
.Expected Result:
elastic-agent installed as Fleet server should not remain stuck in uninstalling with retry errors for over 4-5 minutes.
Screenshots:
The text was updated successfully, but these errors were encountered: