-
Notifications
You must be signed in to change notification settings - Fork 146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Elastic-Agents unexpectedly unenrolled after update to 8.16.x #6213
Comments
You could collect a diagnostics bundle for the specific Agent. If you extract the bundle, you'll see a folder in it for logs. You could try to attach those files here or post the logs in a gist and link it from here. |
Here the agent-log from the day of the upgrade:
|
Hmmmm, not seeing anything about (un)enrollment in the upgrade day logs. It looks like you have debug level logging turned on for Agent. Do you see a "handlerUnenroll: action '... UNENROLL ...' received" message anywhere in your logs from November 25 through December 2nd? |
@ycombinator @michel-laterman can we make this one of our P0 please? I'd like to make sure that this is not a bug on our end or if it is get it fixed. |
Pinging @elastic/elastic-agent (Team:Elastic-Agent) |
@ycombinator : there's no occurence of any On day of host-reboot (elastic-agent service shutdown) - Nov. 30th - we see this
Dec. 2nd - first start of agent after host reboot:
` |
In case it's relevant: some windows-hosts with agents in that strange state (active and unenrolled) seem to be unable to start the agent-service at boot time anymore. I cannot find any log entry anywhere matching this error (maybe in kibana.log - I didn't check that one ...) |
I think that upgrading from 8.16.0 -> 8.16.1 calls uninstall on 8.16.0, which makes an API call to Can you provide a diagnostics bundle of a failing agent? |
Doing a quick search of the codebase, i don't see a call to Uninstall anywhere within the upgrade handler, so my theory may be incorrect |
Version:
8.16.0 and 8.16.1
Operating System:
Windows Server 2019 (64-bit), Microsoft Windows Server 2022 (64-bit)
Discuss Forum URL:
https://discuss.elastic.co/t/elastic-agent-unexpectedly-gets-unenrolled-version-8-16-x/371463
related Links:
[Fleet] Agents erroneously showing "Agent not upgradeable: agent has been unenrolled" kibana#202440
[Fleet]: Unable to force unenroll multiple offline agents from bulk actions. kibana#197180
Steps to Reproduce:
Agent-Logs (.fleet-agent) shows timestamp of 2) ->
unenrolled_at
and timestamp of 3) ->upgraded_at
Example-Logs and Screenshots:
Upgraded agent from 8.16.0 to 8.16.1 on Nov. 25th
Rebooted Host on Nov. 30th
Started Agent-Service on Dec. 2nd
Agent-Log from day of upgrade (I think, the error happens here):
too long to fit here -> could someone provide a hint how to export the relevant log entries from a specific agent?
Agent-Log from day of reboot & day of service-start (copy/paste from Kibana->Fleet->Agent->specific agent->Logs):
Nov 30, 2024
20:09:04.513
elastic_agent
[elastic_agent][info] signal "terminated" received
20:09:04.513
elastic_agent
[elastic_agent][info] Shutting down Elastic Agent and sending last events...
20:09:04.520
elastic_agent
[elastic_agent][warn] Possible transient error during checkin with fleet-server, retrying
20:09:04.552
elastic_agent
[elastic_agent][error] failed accept conn info connection: use of closed network connection
20:09:04.552
elastic_agent
[elastic_agent][info] stopping endpoint service runtime
20:09:04.720
elastic_agent
[elastic_agent][info] Shutting down completed.
20:09:04.728
elastic_agent
[elastic_agent][info] Stopping monitoring server
20:09:04.728
elastic_agent
[elastic_agent][info] Stats endpoint (127.0.0.1:6791) finished: accept tcp 127.0.0.1:6791: use of closed network connection
Dec 2, 2024
09:25:50.042
elastic_agent
[elastic_agent][info] Elastic Agent started
09:25:50.331
elastic_agent
[elastic_agent][info] Starting upgrade watcher
09:25:50.365
elastic_agent
[elastic_agent][info] Upgrade Watcher invoked
09:25:50.692
elastic_agent
[elastic_agent][info] Upgrade Watcher started
09:25:50.708
elastic_agent
[elastic_agent][info] Loaded update marker &{Version:8.16.1 Hash:b6da7f VersionedHome:data\elastic-agent-8.16.1-b6da7f UpdatedOn:2024-11-25 12:20:00.3690588 +0100 CET PrevVersion:8.16.0 PrevHash:3f07f2 PrevVersionedHome:data\elastic-agent-8.16.0-3f07f2 Acked:false Action:id: f0d5d0c4-b283-419e-b826-a8e830f755cc, type: UPGRADE Details:}
09:25:50.714
elastic_agent
[elastic_agent][info] not within grace [updatedOn 2024-11-25 12:20:00.3690588 +0100 CET] 165h5m50.3458541s
09:25:50.714
elastic_agent
[elastic_agent][info] Cleaning up upgrade
09:25:50.828
elastic_agent
[elastic_agent][info] APM instrumentation disabled
09:25:50.838
elastic_agent
[elastic_agent][info] Gathered system information
09:25:50.870
elastic_agent
[elastic_agent][info] Detected available inputs and outputs
09:25:50.870
elastic_agent
[elastic_agent][info] Capabilities file not found in C:\Program Files\Elastic\Agent\capabilities.yml
09:25:50.870
elastic_agent
[elastic_agent][info] Determined allowed capabilities
09:25:50.870
elastic_agent
[elastic_agent][info] Loading baseline config from C:\Program Files\Elastic\Agent\elastic-agent.yml
09:25:51.312
elastic_agent
[elastic_agent][info] GRPC comms socket listening at localhost:6789
09:25:51.439
elastic_agent
[elastic_agent][info] Parsed configuration and determined agent is managed by Fleet
09:25:51.439
elastic_agent
[elastic_agent][warn] SSL/TLS verifications disabled.
09:25:51.652
elastic_agent
[elastic_agent][info] GRPC control socket listening at npipe:///elastic-agent-system
09:25:51.656
elastic_agent
[elastic_agent][info] updated upgrade details
09:25:51.660
elastic_agent
[elastic_agent][info] Starting grpc control protocol listener on port 6789 with max_message_size 104857600
09:25:51.660
elastic_agent
[elastic_agent][info] Docker provider skipped, unable to connect: protocol not available
09:25:51.879
elastic_agent
[elastic_agent][warn] SSL/TLS verifications disabled.
09:25:52.986
elastic_agent
[elastic_agent][info] restoring current policy from disk
09:25:53.030
elastic_agent
[elastic_agent][info] Setting fallback log level from policy
09:25:53.067
elastic_agent
[elastic_agent][info] Fleet gateway started
09:25:53.080
elastic_agent
[elastic_agent][info] Source URI changed from "https://artifacts.elastic.co/downloads/" to "https://artifacts.elastic.co/downloads/"
09:25:53.080
elastic_agent
[elastic_agent][info] Starting monitoring server with cfg &config.MonitoringConfig{Enabled:true, MonitorLogs:true, MonitorMetrics:false, MetricsPeriod:"", LogMetrics:true, HTTP:(*config.MonitoringHTTPConfig)(0xc00067d5f0), Namespace:"default", Pprof:(*config.PprofConfig)(nil), MonitorTraces:false, APM:config.APMConfig{Environment:"", APIKey:"", SecretToken:"", Hosts:[]string(nil), GlobalLabels:map[string]string(nil), TLS:config.APMTLS{SkipVerify:false, ServerCertificate:"", ServerCA:""}, SamplingRate:(*float32)(nil)}, Diagnostics:config.Diagnostics{Uploader:config.Uploader{MaxRetries:10, InitDur:1000000000, MaxDur:600000000000}, Limit:config.Limit{Interval:60000000000, Burst:1}}}
09:25:53.083
elastic_agent
[elastic_agent][info] creating monitoring API with cfg api.Config{Enabled:true, Host:"http://localhost:6791", Port:6791, User:"", SecurityDescriptor:"", Timeout:5000000000}
09:25:53.084
elastic_agent
[elastic_agent][info] Starting stats endpoint
09:25:53.105
elastic_agent
[elastic_agent][info] Metrics endpoint listening on: 127.0.0.1:6791 (configured: http://localhost:6791)
09:25:53.107
elastic_agent
[elastic_agent][info] Updating running component model
09:25:54.355
elastic_agent
[elastic_agent][info] Creating connection info server for endpoint service, address: npipe:///.eaci.sock
09:25:54.356
elastic_agent
[elastic_agent][info] check if endpoint service is installed
09:25:54.552
elastic_agent
endpoint-default
[elastic_agent][info] Spawned new component endpoint-default: Starting: endpoint service runtime
09:25:54.552
elastic_agent
endpoint-default
[elastic_agent][info] Spawned new unit endpoint-default-85821b10-0064-11ee-b676-af36e033a9ae: Starting: endpoint service runtime
09:25:54.553
elastic_agent
endpoint-default
[elastic_agent][info] Spawned new unit endpoint-default: Starting: endpoint service runtime
09:25:56.570
elastic_agent
[elastic_agent][error] 2024-12-02 08:25:56: info: Main.cpp:569 Verifying existing installation
09:25:56.574
elastic_agent
[elastic_agent][error] 2024-12-02 08:25:56: info: InstallLib.cpp:611 Running [C:\Program Files\Elastic\Endpoint\elastic-endpoint.exe] [version --log stdout]
09:25:56.574
elastic_agent
[elastic_agent][error] 2024-12-02 08:25:56: debug: Service.cpp:804 PPL is supported. This process is unprotected. (TrustLevelSid: absent)
09:25:57.282
elastic_agent
[elastic_agent][error] 2024-12-02 08:25:57: info: InstallLib.cpp:650 Installed endpoint is expected version (version: 8.16.1, compiled: Tue Nov 19 12:00:00 2024, branch: HEAD, commit: 7d50b182b0f0ddc7170095904dc1e341224bb1f4)
09:25:57.282
elastic_agent
[elastic_agent][error] 2024-12-02 08:25:57: info: Util.cpp:2146 Endpoint Service is running.
09:25:57.286
elastic_agent
[elastic_agent][info] after check if endpoint service is installed, err:
09:26:01.240
elastic_agent
winlog-default
[elastic_agent][info] Spawned new component winlog-default: Starting: spawned pid '20108'
09:26:01.241
elastic_agent
winlog-default
[elastic_agent][info] Spawned new unit winlog-default-winlog-system-85821b11-0064-11ee-b676-af36e033a9ae: Starting: spawned pid '20108'
09:26:01.241
elastic_agent
winlog-default
[elastic_agent][info] Spawned new unit winlog-default-winlog-windows-85821b12-0064-11ee-b676-af36e033a9ae: Starting: spawned pid '20108'
09:26:01.241
elastic_agent
winlog-default
[elastic_agent][info] Spawned new unit winlog-default: Starting: spawned pid '20108'
09:26:03.099
elastic_agent
[elastic_agent][info] control checkin v2 protocol has chunking enabled
09:26:03.100
elastic_agent
winlog-default
[elastic_agent][info] Component state changed winlog-default (STARTING->HEALTHY): Healthy: communicating with pid '20108'
09:26:03.175
elastic_agent
[elastic_agent][info] control checkin v2 protocol has chunking enabled
09:26:04.111
elastic_agent
winlog-default
[elastic_agent][info] Unit state changed winlog-default (STARTING->HEALTHY): Healthy
09:26:04.114
elastic_agent
winlog-default
[elastic_agent][info] Unit state changed winlog-default-winlog-system-85821b11-0064-11ee-b676-af36e033a9ae (STARTING->HEALTHY): Healthy
09:26:04.114
elastic_agent
winlog-default
[elastic_agent][info] Unit state changed winlog-default-winlog-windows-85821b12-0064-11ee-b676-af36e033a9ae (STARTING->HEALTHY): Healthy
09:26:10.278
elastic_agent
endpoint-default
[elastic_agent][info] Component state changed endpoint-default (STARTING->HEALTHY): Healthy: communicating with endpoint service
09:26:10.818
elastic_agent
[elastic_agent][info] Removing marker file
09:26:10.822
elastic_agent
[elastic_agent][info] Removing previous symlink path
09:26:10.822
elastic_agent
[elastic_agent][error] clean up of prior watcher run failedextracting elastic-agent path relative to data directory from C:\Program Files\Elastic\Agent\data\elastic-agent-8.16.1-b6da7f: Rel: can't make C:\Program Files\Elastic\Agent\data\elastic-agent-8.16.1-b6da7f relative to data
09:26:15.763
elastic_agent
endpoint-default
[elastic_agent][info] Unit state changed endpoint-default-85821b10-0064-11ee-b676-af36e033a9ae (STARTING->HEALTHY): Applied policy {85821b10-0064-11ee-b676-af36e033a9ae}
09:26:15.764
elastic_agent
endpoint-default
[elastic_agent][info] Unit state changed endpoint-default (STARTING->HEALTHY): Applied policy {85821b10-0064-11ee-b676-af36e033a9ae}
09:26:21.526
elastic_agent
[elastic_agent][info] component model updated
09:26:21.527
elastic_agent
[elastic_agent][info] Updating running component model
11:16:50.539
elastic_agent
endpoint-default
[elastic_agent][info] Unit state changed endpoint-default (HEALTHY->CONFIGURING): Applied policy {85821b10-0064-11ee-b676-af36e033a9ae}
11:16:50.539
elastic_agent
endpoint-default
[elastic_agent][info] Unit state changed endpoint-default-85821b10-0064-11ee-b676-af36e033a9ae (HEALTHY->CONFIGURING): Applied policy {85821b10-0064-11ee-b676-af36e033a9ae}
11:17:10.534
elastic_agent
endpoint-default
[elastic_agent][info] Unit state changed endpoint-default-85821b10-0064-11ee-b676-af36e033a9ae (CONFIGURING->HEALTHY): Applied policy {85821b10-0064-11ee-b676-af36e033a9ae}
11:17:10.534
elastic_agent
endpoint-default
[elastic_agent][info] Unit state changed endpoint-default (CONFIGURING->HEALTHY): Applied policy {85821b10-0064-11ee-b676-af36e033a9ae}
13:28:44.169
elastic_agent
[elastic_agent][warn] SSL/TLS verifications disabled.
13:28:44.170
elastic_agent
[elastic_agent][warn] SSL/TLS verifications disabled.
13:28:44.185
elastic_agent
[elastic_agent][info] Setting fallback log level from policy
13:28:44.218
elastic_agent
[elastic_agent][warn] SSL/TLS verifications disabled.
13:28:44.236
elastic_agent
[elastic_agent][info] Source URI changed from "https://artifacts.elastic.co/downloads/" to "https://artifacts.elastic.co/downloads/"
13:28:44.237
elastic_agent
[elastic_agent][info] Stopping monitoring server
13:28:44.237
elastic_agent
[elastic_agent][info] Stats endpoint (127.0.0.1:6791) finished: accept tcp 127.0.0.1:6791: use of closed network connection
13:28:44.238
elastic_agent
[elastic_agent][info] Starting monitoring server with cfg &config.MonitoringConfig{Enabled:true, MonitorLogs:true, MonitorMetrics:false, MetricsPeriod:"", LogMetrics:true, HTTP:(*config.MonitoringHTTPConfig)(0xc00078af90), Namespace:"default", Pprof:(*config.PprofConfig)(nil), MonitorTraces:false, APM:config.APMConfig{Environment:"", APIKey:"", SecretToken:"", Hosts:[]string(nil), GlobalLabels:map[string]string(nil), TLS:config.APMTLS{SkipVerify:false, ServerCertificate:"", ServerCA:""}, SamplingRate:(*float32)(nil)}, Diagnostics:config.Diagnostics{Uploader:config.Uploader{MaxRetries:10, InitDur:1000000000, MaxDur:600000000000}, Limit:config.Limit{Interval:60000000000, Burst:1}}}
13:28:44.238
elastic_agent
[elastic_agent][info] creating monitoring API with cfg api.Config{Enabled:true, Host:"http://localhost:6791", Port:6791, User:"", SecurityDescriptor:"", Timeout:5000000000}
13:28:44.240
elastic_agent
[elastic_agent][info] Starting stats endpoint
13:28:44.242
elastic_agent
[elastic_agent][info] Metrics endpoint listening on: 127.0.0.1:6791 (configured: http://localhost:6791)
13:28:44.267
elastic_agent
[elastic_agent][info] component model updated
13:28:44.267
elastic_agent
[elastic_agent][info] Updating running component model
13:28:44.274
elastic_agent
winlog-default
[elastic_agent][info] Unit state changed winlog-default-winlog-windows-85821b12-0064-11ee-b676-af36e033a9ae (HEALTHY->CONFIGURING): Configuring
13:28:44.274
elastic_agent
winlog-default
[elastic_agent][info] Unit state changed winlog-default-winlog-system-85821b11-0064-11ee-b676-af36e033a9ae (HEALTHY->CONFIGURING): Configuring
13:28:44.300
elastic_agent
[elastic_agent][warn] SSL/TLS verifications disabled.
13:28:45.275
elastic_agent
winlog-default
[elastic_agent][info] Unit state changed winlog-default-winlog-windows-85821b12-0064-11ee-b676-af36e033a9ae (CONFIGURING->HEALTHY): Healthy
13:28:45.275
elastic_agent
winlog-default
[elastic_agent][info] Unit state changed winlog-default-winlog-system-85821b11-0064-11ee-b676-af36e033a9ae (CONFIGURING->HEALTHY): Healthy
19:17:13.888
elastic_agent
endpoint-default
[elastic_agent][info] Unit state changed endpoint-default-85821b10-0064-11ee-b676-af36e033a9ae (HEALTHY->CONFIGURING): Applied policy {85821b10-0064-11ee-b676-af36e033a9ae}
19:17:13.889
elastic_agent
endpoint-default
[elastic_agent][info] Unit state changed endpoint-default (HEALTHY->CONFIGURING): Applied policy {85821b10-0064-11ee-b676-af36e033a9ae}
19:17:33.888
elastic_agent
endpoint-default
[elastic_agent][info] Unit state changed endpoint-default-85821b10-0064-11ee-b676-af36e033a9ae (CONFIGURING->HEALTHY): Applied policy {85821b10-0064-11ee-b676-af36e033a9ae}
19:17:33.888
elastic_agent
endpoint-default
[elastic_agent][info] Unit state changed endpoint-default (CONFIGURING->HEALTHY): Applied policy {85821b10-0064-11ee-b676-af36e033a9ae}
The text was updated successfully, but these errors were encountered: