Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Ingest Manager] Restart process on output change #24907

Merged
merged 10 commits into from
Apr 19, 2021

Conversation

michalpristas
Copy link
Contributor

What does this PR do?

This PR adds an option to specify in a spec file behavior on output change as sometimes new output definition is not reloaded correctly on beats side.

Endpoint reloads output ok, fleet server does not reload output at all but we agreed to fix it on server side.

Why is it important?

#23596

#24538

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

@michalpristas michalpristas self-assigned this Apr 1, 2021
@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Apr 1, 2021
@michalpristas michalpristas changed the title [Ingest Manager] Restart on output [Ingest Manager] Restart process on output change Apr 1, 2021
@elasticmachine
Copy link
Collaborator

elasticmachine commented Apr 1, 2021

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Build Cause: Pull request #24907 updated

  • Start Time: 2021-04-19T15:27:01.695+0000

  • Duration: 70 min 27 sec

  • Commit: 412c2bc

Test stats 🧪

Test Results
Failed 0
Passed 6788
Skipped 16
Total 6804

Trends 🧪

Image of Build Times

Image of Tests

💚 Flaky test report

Tests succeeded.

Expand to view the summary

Test stats 🧪

Test Results
Failed 0
Passed 6788
Skipped 16
Total 6804

@michalpristas michalpristas marked this pull request as ready for review April 8, 2021 10:07
@elasticmachine
Copy link
Collaborator

Pinging @elastic/agent (Team:Agent)

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ingest-management (Team:Ingest Management)

@mergify
Copy link
Contributor

mergify bot commented Apr 8, 2021

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b restart-on-output upstream/restart-on-output
git merge upstream/master
git push upstream restart-on-output

@mergify
Copy link
Contributor

mergify bot commented Apr 14, 2021

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b restart-on-output upstream/restart-on-output
git merge upstream/master
git push upstream restart-on-output

Copy link
Contributor

@blakerouse blakerouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change/code looks good, but I still don't like this idea.

Where we going to actually land this approach for 7.13? Have we noticed any side-effects with making this change?

@ph
Copy link
Contributor

ph commented Apr 14, 2021

@blakerouse You don't like the idea of restarting the process? @urso

@blakerouse
Copy link
Contributor

I would greatly prefer to fix the real issue versus a hack to make it work.

@urso
Copy link

urso commented Apr 19, 2021

I would greatly prefer to fix the real issue versus a hack to make it work.

The output reloading did never work without issues and should not have been used by agent in the first place. This is why I created: #24538

Although I would love to have the reloading issue fixed in Beats, it is no easy fix and we did try a few times in the past. But in order to fix the issue properly we would have to introduce some bigger architectural changes to libbeat, which was decided to postpone. The race condition in the reloading can lead to events not being removed from the memory/disk queue, which will eventually lead to a deadlock (in any Beat).

Although restarting is quite intrusive, we do not expect this to happen every now and so often, right?

@urso
Copy link

urso commented Apr 19, 2021

@michalpristas Have you tried if you can reproduce the issue #23596 before/after the change?

@michalpristas
Copy link
Contributor Author

@michalpristas Have you tried if you can reproduce the issue #23596 before/after the change?

havent tried but i was not able to repro even before

@michalpristas michalpristas merged commit 486889e into elastic:master Apr 19, 2021
@blakerouse
Copy link
Contributor

@urso With the new permissions work, any time a new integration is added to a policy that requires a new index or a new permission it will be marked as a new output and so the beats will be restarted.

@urso
Copy link

urso commented Apr 20, 2021

With the new permissions work, any time a new integration is added to a policy that requires a new index or a new permission it will be marked as a new output and so the beats will be restarted.

Yes, I'm aware of that. Getting your environment configured is somewhat bumpy right now. But once you are settled with your setup, reconfigurations of API keys should be an exception.

v1v added a commit to v1v/beats that referenced this pull request Apr 20, 2021
…-github-pr-comment-template

* upstream/master:
  [Ingest Manager] Keep http and logging config during enroll (elastic#25132)
  Refactor kubernetes autodiscover to avoid skipping short-living pods (elastic#24742)
  [libbeat] New decode xml wineventlog processor (elastic#25115)
  Add svc to agent k8s clusterRole (elastic#25146)
  Add awsfargate module to collect container logs from Amazon ECS on Fargate (elastic#25041)
  [Filebeat][Cisco ASA] log enhancement and performance (elastic#24744)
  Watch kubernetes namespaces for autodiscover metadata for pods (elastic#25117)
  Cyberark Privileged Access Security module (elastic#24803)
  [Elastic Agent] Log the container command output with LOGS_PATH (elastic#25150)
  Fix for tests after `device...` field has been removed (elastic#25141)
  [Ingest Manager] Restart process on output change (elastic#24907)
  Set --insecure in container when FLEET_SERVER_ENABLE and FLEET_INSECURE set. (elastic#25137)
  [filebeat] Update documentation / changelog / beta warnings for the syslog input (elastic#25047)
  Add support for ignore_inactive in filestream input (elastic#25036)
  Fix bug with annotations dedot config on k8s not used (elastic#25111)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Team:Elastic-Agent Label for the Agent team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants