[Fleet] System integration inputs never configured for agent installed on self-managed 8.13.0. #177372
Comments
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane) |
@manishgupta-qasource Please review. |
Secondary review for this ticket is Done |
There are no inputs in the agent policy:
agent:
  download:
    sourceURI: https://artifacts.elastic.co/downloads/
  features: null
  monitoring:
    enabled: true
    logs: true
    metrics: true
    namespace: agent
    use_output: default
  protection:
    enabled: false
    signing_key: <REDACTED>
    uninstall_token_hash: <REDACTED>
fleet:
  hosts:
    - https://ec2-54-234-236-16.compute-1.amazonaws.com:8220
host:
  id: c083c82af2fe42ff8b8d4e1e93a4e604
id: 33a65a41-50a2-4d13-b211-730758430ca5
outputs:
  default:
    api_key: <REDACTED>
    hosts:
      - https://172.31.29.80:9200
    preset: balanced
    ssl:
      ca_trusted_fingerprint: <REDACTED>
    type: elasticsearch
path:
  config: /opt/Elastic/Agent
  data: /opt/Elastic/Agent/data
  home: /opt/Elastic/Agent/data/elastic-agent-8.13.0-d2c0b8
  logs: /opt/Elastic/Agent
revision: 1
runtime:
  arch: amd64
  native_arch: ""
  os: linux
  osinfo:
    family: suse
    major: 15
    minor: 0
    patch: 0
    type: linux
    version: 15-SP5
signed:
  data: eyJpZCI6IjMzYTY1YTQxLTUwYTItNGQxMy1iMjExLTczMDc1ODQzMGNhNSIsImFnZW50Ijp7ImZlYXR1cmVzIjp7fSwicHJvdGVjdGlvbiI6eyJlbmFibGVkIjpmYWxzZSwidW5pbnN0YWxsX3Rva2VuX2hhc2giOiJyRDZSWGtveUZIZ2s5TktpbzUvUTMzTXZ2ZENQL1VOQlRSRjRIYS9BTHFJPSIsInNpZ25pbmdfa2V5IjoiTUZrd0V3WUhLb1pJemowQ0FRWUlLb1pJemowREFRY0RRZ0FFK1gwWk1yVWRqUy9WMzNvWnJqQnA5YVNOWFRQZkRFdnVhSUxwY0ltYSsvYlpqWDNMOWJ0SzRhVmI2ZG5sU2Rrc09SZkR3WitlZjlUYjJmUldtalBNb1E9PSJ9fSwiaW5wdXRzIjpbXX0=
  signature: MEUCIQD6l5wP2yomZuUe8Jh06h35X3oV+XfJpCPto5WovevasQIgc9dI0UdRh8znkDIqGQ49YRMqwgGZ/ztWKBb5fhlTDiw=

I don't see a policy change action in the logs. I think I saw something similar earlier with an agent I deployed locally, but it wasn't always reproducible. Transferring to Fleet since I think the agent is sitting here waiting for a policy change to happen. I see the SETTINGS action changing the log level was received fairly early, so we must have been able to check in at least once after being enrolled.

{"log.level":"debug","@timestamp":"2024-02-20T11:38:14.027Z","log.origin":{"file.name":"dispatcher/dispatcher.go","file.line":163},"message":"Successfully dispatched action: 'action_id: c47fcf6f-a413-4adf-9f2d-b7f577e7ddf9, type: SETTINGS, log_level: debug'","log":{"source":"elastic-agent"},"ecs.version":"1.6.0"} |
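A quick way to confirm what the agent actually received is to decode the `signed.data` value from the policy above. It appears to be base64-encoded JSON, and its final characters (`aW5wdXRzIjpbXX0=`) decode to `inputs":[]}`, which matches the empty-inputs observation. The sketch below is only illustrative: `policy_has_inputs` and `encode_payload` are hypothetical helper names, and the payloads are made-up stand-ins rather than the real diagnostics data.

```python
import base64
import json


def policy_has_inputs(signed_data_b64: str) -> bool:
    """Return True if the decoded policy payload lists at least one input."""
    payload = json.loads(base64.b64decode(signed_data_b64))
    return len(payload.get("inputs", [])) > 0


def encode_payload(payload: dict) -> str:
    """Build an illustrative base64 string (a stand-in for `signed.data`)."""
    raw = json.dumps(payload).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


# Hypothetical payloads: one shaped like the broken policy, one with an input.
empty_policy = encode_payload({"id": "example-policy", "inputs": []})
healthy_policy = encode_payload(
    {"id": "example-policy", "inputs": [{"type": "system/metrics"}]}
)

print(policy_has_inputs(empty_policy))    # False
print(policy_has_inputs(healthy_policy))  # True
```

Running the same check against the value copied from an agent's diagnostics bundle would show whether the policy was delivered without inputs.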
Pinging @elastic/fleet (Team:Fleet) |
I can't reproduce the issue of missing inputs in agents. I tried a few combinations: Fleet Server on Mac with the agent on Linux, and both on Linux. |
Hi @amolnater-qasource Do you still have the self-managed env running? |
Thank you for looking into this issue. Yes, we have the environment running at our end.
Yes, logs for only that particular agent were visible under Discover. [Before uninstalling] Amol.Self-Win.1.-.ec2-34-227-192-205.compute-1.amazonaws.com.-.Remote.Desktop.Connection.2024-02-21.17-18-22.mp4
We uninstalled and then reinstalled the agent, and observed that the issue is resolved. Please let us know if anything else is required from our end. |
Thanks Amol. I wasn't specific enough with the Discover request, I wanted to see if there is any data for the
@amolnater-qasource Are you able to consistently reproduce, maybe on other platforms, e.g. Mac/Linux Fleet-Server or ECS? It would help to capture the Fleet-server logs (capture agent diagnostics from the fleet-server agent). |
@amolnater-qasource @juliaElastic If we have access to the environment here, it would be great to take a look at the content of the |
We have revalidated this issue on a fresh self-managed setup for 8.13.0 BC1 on a different AWS Windows 2022 server (where the actual issue was observed). The issue is no longer reproducible at our end, and we can close this issue for now until we observe it again. Please let us know if anything else is required from our end. |
I have a suspicion that this issue is caused by some kind of race condition in Fleet, since it is hard to reproduce, but came up a few times in the past week. |
@nchaulet Please find attached |
Thanks Amol, I had a look and found that there is a doc without
|
We have fetched logs available under elasticsearch/logs folder and the helpful logs might be in the 20th February, 2024 initial logs when the policy was created and the agent was enrolled. There are no logs under kibana/logs folder, could you please share if there's any other location where we can get the logs. |
Actually I don't need the logs, I could reproduce the issue of
I think the reason the issue doesn't always happen on the agent side is that Fleet-server randomly queries the doc with the latest revision with or without outputs. I'll keep investigating to find out where the bug is and find a fix. |
@juliaElastic I took a look too to the |
Good catch, I also found that the Kibana logic deploys the policy twice when creating it with a package policy. Both are saved with revision 1, first without inputs, then with the inputs coming from the package policy. |
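To illustrate why two documents at the same revision make the problem intermittent, here is a simplified model. It is not Fleet Server's actual query logic: it only assumes a reader that takes any one document among those sharing the highest revision, and all function names and document shapes are hypothetical.

```python
import random


def latest_revision_docs(docs):
    """Return every doc that shares the highest revision_idx."""
    top = max(d["revision_idx"] for d in docs)
    return [d for d in docs if d["revision_idx"] == top]


def pick_policy(docs, rng):
    """Model a reader that takes any one doc at the latest revision."""
    return rng.choice(latest_revision_docs(docs))


# Double deploy, as described above: two docs saved at revision 1,
# the first without inputs, the second with them.
double_deploy = [
    {"revision_idx": 1, "inputs": []},
    {"revision_idx": 1, "inputs": ["system"]},
]

# Single deploy: only the complete doc exists at revision 1.
single_deploy = [{"revision_idx": 1, "inputs": ["system"]}]

rng = random.Random(0)
outcomes = {bool(pick_policy(double_deploy, rng)["inputs"]) for _ in range(100)}
print("double deploy, inputs delivered:", outcomes)
print("single deploy candidates:", len(latest_revision_docs(single_deploy)))
```

With two candidates at the same revision, the reader's choice decides whether the agent gets inputs. With a single candidate there is nothing to race on.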
Yes it's a probably safer |
After discussing with Kyle, decided that instead of bumping |
…gent policy with system integration (#177594)

## Summary

Closes #177372

When creating an agent policy with a package policy immediately (e.g. system integration), the `deployPolicy` logic was called once, creating a doc in `.fleet-policies` with `revision:1` without `inputs`, and then updating the doc with `inputs`, still on `revision:1`. This causes an intermittent issue on the agents if Fleet-server picks up the first document and delivers it to the agent without `inputs`.

As a fix, added an option to skip `deployPolicy` when called from the `createAgentPolicyWithPackages` function, as the policy will be deployed after creating the package policies.

To verify:
- create an agent policy with system monitoring (default option)
- check that the created documents in `.fleet-policies` are correct: there should be one doc with `revision_idx:1` and `coordinator_idx:0` (created by Fleet API), and one doc with `revision_idx:1` and `coordinator_idx:1` (created by fleet-server)
- verify that both documents have `data.inputs` field populated

Used this query to verify:
```
POST .fleet-policies/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {"coordinator_idx": 0}
        }
      ],
      "filter": {
        "term": {
          "policy_id": "<agent policy id>"
        }
      }
    }
  },
  "_source": [
    "revision_idx", "coordinator_idx", "policy_id", "@timestamp", "data.inputs"
  ],
  "sort": [
    {
      "revision_idx": {
        "order": "desc"
      }
    }
  ]
}
```

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
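The verification steps above could be scripted roughly as follows. This is a sketch only: `build_verification_query` and `docs_missing_inputs` are hypothetical helper names, the sample hits are invented, and the code assumes the usual Elasticsearch search response shape (`hits.hits[]._source`). Sending the request to a cluster is left out.

```python
def build_verification_query(policy_id: str, coordinator_idx: int = 0) -> dict:
    """Body for `POST .fleet-policies/_search`, mirroring the query above."""
    return {
        "query": {
            "bool": {
                "must": [{"term": {"coordinator_idx": coordinator_idx}}],
                "filter": {"term": {"policy_id": policy_id}},
            }
        },
        "_source": [
            "revision_idx",
            "coordinator_idx",
            "policy_id",
            "@timestamp",
            "data.inputs",
        ],
        "sort": [{"revision_idx": {"order": "desc"}}],
    }


def docs_missing_inputs(hits: list) -> list:
    """Return (revision_idx, coordinator_idx) for hits with empty or absent data.inputs."""
    missing = []
    for hit in hits:
        source = hit.get("_source", {})
        if not source.get("data", {}).get("inputs"):
            missing.append(
                (source.get("revision_idx"), source.get("coordinator_idx"))
            )
    return missing


# Invented hits: the first mimics the faulty doc, the second a correct one.
sample_hits = [
    {"_source": {"revision_idx": 1, "coordinator_idx": 0, "data": {"inputs": []}}},
    {
        "_source": {
            "revision_idx": 1,
            "coordinator_idx": 1,
            "data": {"inputs": [{"type": "system/metrics"}]},
        }
    },
]

print(docs_missing_inputs(sample_hits))  # [(1, 0)]
```

An empty list from `docs_missing_inputs` for both `coordinator_idx` values would correspond to the "both documents have `data.inputs` populated" check.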
…gent policy with system integration (elastic#177594) (cherry picked from commit 5f17b39)
…a new agent policy with system integration (#177594) (#177725)

# Backport

This will backport the following commits from `main` to `8.13`:
- [[Fleet] Fix issue of agent sometimes not getting inputs using a new agent policy with system integration (#177594)](#177594)

### Questions ?

Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Julia Bardi <[email protected]>
Hi Team, We have revalidated this issue on the latest 8.13.0 BC5 Self-managed Kibana environment and found it fixed now. Observations:
Build details: Hence, we are marking this issue as QA: Validated. Thanks |
Kibana Build details:
Host OS: Windows- self-managed, Linux secondary agent
Preconditions:
Steps to reproduce:
Screen Recording:
Amol.Self-Win.-.ec2-54-234-236-16.compute-1.amazonaws.com.-.Remote.Desktop.Connection.2024-02-20.17-09-01.mp4
Expected Result:
System integration data should be available for secondary agent installed on self-managed 8.13.0.
Logs:
elastic-agent-diagnostics-2024-02-20T11-39-45Z-00.zip