[Ingest Manager] Fix failing installation on windows 7 #24387
Pinging @elastic/ingest-management (Team:Ingest Management)
Pinging @elastic/agent (Team:Agent)
💚 Build Succeeded
💚 Flaky test report: Tests succeeded.
I think an infinite loop is present.
for err != nil {
	backExp.Wait()
	err = storeAgentInfo(s, reader)
What if this always fails? How does this not loop forever?
Yeah, that was a copy-paste. I rewrote that part.
Retry logic looks correct now.
Hi @EricDavisX, today we validated the above fixes on Kibana upgraded from 7.10.2 to 7.12 and found the issue fixed. We installed the agent with only the system integration in the Agent policy.
Observations:
Build details:
Thanks
What does this PR do?
Issue described here: #24327
There was a race between the enrollment process and the service restart, with the file system playing a part as well.
On Windows 7 only, when the agent was restarted it loaded the standalone ID, even though the enrollment process had already replaced it.
Then, when the agent retrieved hosts from Fleet, it overwrote the updated ID with the stale one.
This fix adds a lock that prevents simultaneous writes by these two processes, plus a forced Reload later in the cycle when the agent is Fleet-managed.
Another fix is an FSync after file rotation, which was missing on Windows.
These changes seem to fix the issue; tested on a Windows 7 VM in the cloud.
Why is it important?
Fixes #24327
Checklist
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.