[Elastic Agent] Properly stop subprocess when receiving SIGTERM #19567

Merged
merged 3 commits into from
Jul 6, 2020

blakerouse
Contributor

What does this PR do?

Fixes an issue where stopping Elastic Agent did not stop its spawned subprocesses. On SIGTERM, Elastic Agent now shuts down each subprocess properly: it sends ExpectedState_Stopping over gRPC and waits for the process to stop. If the process has not stopped after 30 seconds, it is killed.
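
The stop-then-kill sequence described above can be sketched in Go roughly as follows. This is a minimal illustration, not the code from this PR: `stopWithTimeout`, `requestStop`, and `runDemo` are hypothetical names, `requestStop` merely stands in for sending ExpectedState_Stopping over gRPC, and the demo uses a 200 ms grace period instead of 30 seconds.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"time"
)

// stopWithTimeout asks a running child to stop, waits up to timeout for it to
// exit, and kills it if it is still running. requestStop is a hypothetical
// stand-in for sending ExpectedState_Stopping over gRPC.
func stopWithTimeout(cmd *exec.Cmd, requestStop func() error, timeout time.Duration) (killed bool, err error) {
	done := make(chan error, 1)
	go func() { done <- cmd.Wait() }()

	if err := requestStop(); err != nil {
		// The stop request could not be delivered, so skip the grace period.
		timeout = 0
	}

	select {
	case <-done:
		return false, nil // exited on its own within the grace period
	case <-time.After(timeout):
		if err := cmd.Process.Kill(); err != nil {
			if errors.Is(err, os.ErrProcessDone) {
				<-done // it exited just as the timer fired
				return false, nil
			}
			return false, err
		}
		<-done // reap the killed process
		return true, nil
	}
}

// runDemo starts a child that never reacts to the stop request (assumes a
// `sleep` binary on PATH) and reports whether it had to be killed.
func runDemo() (bool, error) {
	cmd := exec.Command("sleep", "60")
	if err := cmd.Start(); err != nil {
		return false, err
	}
	ignoreStop := func() error { return nil }
	return stopWithTimeout(cmd, ignoreStop, 200*time.Millisecond)
}

func main() {
	killed, err := runDemo()
	fmt.Println("killed:", killed, "err:", err)
}
```

Waiting on the process in a separate goroutine lets the caller race the exit against the timer, so a well-behaved subprocess is never killed and a stuck one never blocks shutdown for longer than the grace period.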

On Linux, if Elastic Agent is killed with kill -9, all of its spawned subprocesses are also killed with kill -9, so there is no need to worry about child processes still hanging around. Windows and Mac do not support this.
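
The description does not say which mechanism provides this on Linux. One mechanism that behaves this way is the parent-death signal, exposed in Go as the `Pdeathsig` field of `syscall.SysProcAttr`. The sketch below assumes that approach; `newChildCommand` is a hypothetical helper, not a function from this PR.

```go
//go:build linux

package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// newChildCommand returns a command whose process is sent SIGKILL by the
// kernel when the thread that started it dies (Linux only).
func newChildCommand(name string, args ...string) *exec.Cmd {
	cmd := exec.Command(name, args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{Pdeathsig: syscall.SIGKILL}
	return cmd
}

func main() {
	cmd := newChildCommand("sleep", "1") // assumes `sleep` is on PATH
	if err := cmd.Start(); err != nil {
		fmt.Println("start failed:", err)
		return
	}
	fmt.Println("started child with pid", cmd.Process.Pid)
	_ = cmd.Wait()
}
```

One caveat from the Go documentation: the signal is sent when the creating thread terminates, which can happen before the whole parent process exits, so a long-running supervisor may need to account for that.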

Why is it important?

So that subprocesses are not orphaned when Elastic Agent is stopped or killed.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding changes to the default configuration files
  • [ ] I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

Logs

2020-07-01T15:24:09-04:00 INFO  app.go:160      Signaling application to stop because of shutdown: metricbeat(metricbeat--8.0.0-SNAPSHOT--36643631373035623733363936343635)
2020-07-01T15:24:09-04:00 INFO  reporter.go:51  2020-07-01T15:24:09-04:00: type: 'STATE': sub_type: 'STOPPED' message: Application: filebeat--8.0.0-SNAPSHOT--36643631373035623733363936343635[a2c055bc-831f-48dd-9204-075f88cd5760]: State changed to STOPPED: Stopped
2020-07-01T15:24:09-04:00 INFO  app.go:160      Signaling application to stop because of shutdown: metricbeat(metricbeat--8.0.0-SNAPSHOT)
2020-07-01T15:24:09-04:00 INFO  reporter.go:51  2020-07-01T15:24:09-04:00: type: 'STATE': sub_type: 'STOPPED' message: Application: metricbeat--8.0.0-SNAPSHOT--36643631373035623733363936343635[a2c055bc-831f-48dd-9204-075f88cd5760]: State changed to STOPPED: Stopped
2020-07-01T15:24:12-04:00 INFO  reporter.go:51  2020-07-01T15:24:12-04:00: type: 'STATE': sub_type: 'STOPPED' message: Application: metricbeat--8.0.0-SNAPSHOT[a2c055bc-831f-48dd-9204-075f88cd5760]: State changed to STOPPED: Stopped

@botelastic botelastic bot added the needs_team label Jul 1, 2020
@botelastic botelastic bot removed the needs_team label Jul 1, 2020
@blakerouse blakerouse marked this pull request as ready for review July 1, 2020 19:45
@elasticmachine
Collaborator

Pinging @elastic/ingest-management (Team:Ingest Management)

@blakerouse blakerouse added the bug label Jul 1, 2020
@blakerouse blakerouse self-assigned this Jul 1, 2020
@elasticmachine
Collaborator

elasticmachine commented Jul 1, 2020

💔 Build Failed

Build stats

  • Build Cause: [Branch indexing]

  • Start Time: 2020-07-03T08:04:22.824+0000

  • Duration: 32 min 29 sec

Steps errors

  • Name: Notifies GitHub of the status of a Pull Request

Log output

[2020-07-03T08:35:36.484Z] Stage "Kubernetes" skipped due to earlier failure(s)
[2020-07-03T08:35:37.487Z] Stage "Heartbeat" skipped due to earlier failure(s)
[2020-07-03T08:35:37.490Z] Stage "Libbeat" skipped due to earlier failure(s)
[2020-07-03T08:35:37.492Z] Stage "Metricbeat x-pack" skipped due to earlier failure(s)
[2020-07-03T08:35:37.493Z] Stage "Packetbeat" skipped due to earlier failure(s)
[2020-07-03T08:35:37.494Z] Stage "dockerlogbeat" skipped due to earlier failure(s)
[2020-07-03T08:35:37.510Z] Stage "Winlogbeat" skipped due to earlier failure(s)
[2020-07-03T08:35:37.511Z] Stage "Functionbeat" skipped due to earlier failure(s)
[2020-07-03T08:35:37.513Z] Stage "Journalbeat" skipped due to earlier failure(s)
[2020-07-03T08:35:37.514Z] Stage "Generators" skipped due to earlier failure(s)
[2020-07-03T08:35:43.847Z] Failed in branch Elastic Agent x-pack
[2020-07-03T08:35:43.853Z] Failed in branch Elastic Agent x-pack Windows
[2020-07-03T08:35:43.855Z] Failed in branch Elastic Agent Mac OS X
[2020-07-03T08:35:43.856Z] Failed in branch Filebeat oss
[2020-07-03T08:35:43.857Z] Failed in branch Filebeat x-pack
[2020-07-03T08:35:43.858Z] Failed in branch Filebeat Mac OS X
[2020-07-03T08:35:43.858Z] Failed in branch Filebeat x-pack Mac OS X
[2020-07-03T08:35:43.859Z] Failed in branch Filebeat Windows
[2020-07-03T08:35:43.860Z] Failed in branch Filebeat x-pack Windows
[2020-07-03T08:35:43.861Z] Failed in branch Auditbeat oss Linux
[2020-07-03T08:35:43.862Z] Failed in branch Auditbeat crosscompile
[2020-07-03T08:35:43.863Z] Failed in branch Auditbeat oss Mac OS X
[2020-07-03T08:35:43.864Z] Failed in branch Auditbeat oss Windows
[2020-07-03T08:35:43.865Z] Failed in branch Auditbeat x-pack
[2020-07-03T08:35:43.865Z] Failed in branch Auditbeat x-pack Mac OS X
[2020-07-03T08:35:43.866Z] Failed in branch Auditbeat x-pack Windows
[2020-07-03T08:35:43.867Z] Failed in branch Libbeat x-pack
[2020-07-03T08:35:43.868Z] Failed in branch Metricbeat OSS Unit tests
[2020-07-03T08:35:43.869Z] Failed in branch Metricbeat OSS Integration tests
[2020-07-03T08:35:43.870Z] Failed in branch Metricbeat Python integration tests
[2020-07-03T08:35:43.897Z] Failed in branch Metricbeat crosscompile
[2020-07-03T08:35:43.912Z] Failed in branch Metricbeat Mac OS X
[2020-07-03T08:35:43.913Z] Failed in branch Metricbeat x-pack Mac OS X
[2020-07-03T08:35:43.918Z] Failed in branch Metricbeat Windows
[2020-07-03T08:35:43.961Z] Failed in branch Metricbeat x-pack Windows
[2020-07-03T08:35:43.962Z] Failed in branch Winlogbeat Windows x-pack
[2020-07-03T08:35:43.963Z] Failed in branch Kubernetes
[2020-07-03T08:35:47.322Z] Stage "Heartbeat" skipped due to earlier failure(s)
[2020-07-03T08:35:47.324Z] Stage "Libbeat" skipped due to earlier failure(s)
[2020-07-03T08:35:47.326Z] Stage "Metricbeat x-pack" skipped due to earlier failure(s)
[2020-07-03T08:35:47.346Z] Stage "Winlogbeat" skipped due to earlier failure(s)
[2020-07-03T08:35:47.373Z] Stage "Functionbeat" skipped due to earlier failure(s)
[2020-07-03T08:35:47.375Z] Stage "Generators" skipped due to earlier failure(s)
[2020-07-03T08:35:48.209Z] Failed in branch Packetbeat
[2020-07-03T08:35:48.210Z] Failed in branch dockerlogbeat
[2020-07-03T08:35:48.211Z] Failed in branch Journalbeat
[2020-07-03T08:35:50.558Z] Stage "Heartbeat" skipped due to earlier failure(s)
[2020-07-03T08:35:50.560Z] Stage "Libbeat" skipped due to earlier failure(s)
[2020-07-03T08:35:50.561Z] Stage "Functionbeat" skipped due to earlier failure(s)
[2020-07-03T08:35:50.563Z] Stage "Generators" skipped due to earlier failure(s)
[2020-07-03T08:35:51.175Z] Failed in branch Metricbeat x-pack
[2020-07-03T08:35:51.176Z] Failed in branch Winlogbeat
[2020-07-03T08:35:52.878Z] Failed in branch Heartbeat
[2020-07-03T08:35:52.879Z] Failed in branch Libbeat
[2020-07-03T08:35:52.880Z] Failed in branch Functionbeat
[2020-07-03T08:35:52.881Z] Stage "Generators" skipped due to earlier failure(s)
[2020-07-03T08:35:53.733Z] Failed in branch Generators
[2020-07-03T08:35:54.870Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19567/src/github.com/elastic/beats
[2020-07-03T08:35:55.375Z] + find . -type f -name TEST*.xml -path */build/* -delete
[2020-07-03T08:35:55.403Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19567/src/github.com/elastic/beats/Lint
[2020-07-03T08:35:56.500Z] + cat
[2020-07-03T08:35:56.500Z] + /usr/local/bin/runbld ./runbld-script
[2020-07-03T08:35:56.500Z] Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
[2020-07-03T08:36:03.225Z] runbld>>> runbld started
[2020-07-03T08:36:03.225Z] runbld>>> 1.6.12/f45d832f2ba0aa2722ab4ec1fda8ad140f027f8b
[2020-07-03T08:36:05.893Z] runbld>>> The following profiles matched the job 'Beats/beats-beats-mbp/PR-19567' in order of occurrence in the config (last value wins).
[2020-07-03T08:36:06.874Z] runbld>>> Debug logging enabled.
[2020-07-03T08:36:06.874Z] runbld>>> Storing result
[2020-07-03T08:36:06.874Z] runbld>>> Store result: created {:total 2, :successful 2, :failed 0} 1
[2020-07-03T08:36:06.874Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1587637540455/t/20200703083606-1125EF16
[2020-07-03T08:36:06.874Z] runbld>>> Adding system facts.
[2020-07-03T08:36:07.833Z] runbld>>> Adding vcs info for the latest commit:  6576bcbae9df5f14a6d09f4104945538b2b0b1ee
[2020-07-03T08:36:07.833Z] runbld>>> >>>>>>>>>>>> SCRIPT EXECUTION BEGIN >>>>>>>>>>>>
[2020-07-03T08:36:07.833Z] runbld>>> Adding /usr/lib/jvm/java-8-openjdk-amd64/bin to the path.
[2020-07-03T08:36:08.110Z] Processing JUnit reports with runbld...
[2020-07-03T08:36:08.110Z] + echo 'Processing JUnit reports with runbld...'
[2020-07-03T08:36:08.383Z] runbld>>> <<<<<<<<<<<< SCRIPT EXECUTION END <<<<<<<<<<<<
[2020-07-03T08:36:08.383Z] runbld>>> DURATION: 32ms
[2020-07-03T08:36:08.383Z] runbld>>> STDOUT: 40 bytes
[2020-07-03T08:36:08.383Z] runbld>>> STDERR: 49 bytes
[2020-07-03T08:36:08.383Z] runbld>>> WRAPPED PROCESS: SUCCESS (0)
[2020-07-03T08:36:08.383Z] runbld>>> Searching for build metadata in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19567/src/github.com/elastic/beats
[2020-07-03T08:36:09.434Z] runbld>>> Storing build metadata: 
[2020-07-03T08:36:09.434Z] runbld>>> Adding test report.
[2020-07-03T08:36:09.434Z] runbld>>> Searching for junit test output files with the pattern: TEST-.*\.xml$ in: /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19567/src/github.com/elastic/beats
[2020-07-03T08:36:10.009Z] runbld>>> Found 0 test output files
[2020-07-03T08:36:10.010Z] runbld>>> Test output logs contained: Errors: 0 Failures: 0 Tests: 0 Skipped: 0
[2020-07-03T08:36:10.279Z] runbld>>> Storing result
[2020-07-03T08:36:10.543Z] runbld>>> Store result: updated {:total 2, :successful 2, :failed 0} 2
[2020-07-03T08:36:10.543Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1587637540455/t/20200703083606-1125EF16
[2020-07-03T08:36:10.543Z] runbld>>> Email notification disabled by environment variable.
[2020-07-03T08:36:10.543Z] runbld>>> Slack notification disabled by environment variable.
[2020-07-03T08:36:48.109Z] Running on Jenkins in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19567
[2020-07-03T08:36:48.502Z] [INFO] getVaultSecret: Getting secrets
[2020-07-03T08:36:48.854Z] Masking supported pattern matches of $VAULT_ADDR or $VAULT_ROLE_ID or $VAULT_SECRET_ID
[2020-07-03T08:36:52.145Z] + chmod 755 generate-build-data.sh
[2020-07-03T08:36:52.145Z] + ./generate-build-data.sh https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats-beats-mbp/PR-19567/ https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats-beats-mbp/PR-19567/runs/3 FAILURE 1948914
[2020-07-03T08:36:52.145Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats-beats-mbp/PR-19567/runs/3/steps/?limit=10000 -o steps-info.json
[2020-07-03T08:36:55.377Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats-beats-mbp/PR-19567/runs/3/tests/?status=FAILED -o tests-errors.json
[2020-07-03T08:36:55.627Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats-beats-mbp/PR-19567/runs/3/log/ -o pipeline-log.txt

@blakerouse blakerouse merged commit a820842 into elastic:master Jul 6, 2020
blakerouse added a commit to blakerouse/beats that referenced this pull request Jul 6, 2020
…tic#19567)

* Implement proper shutdown so spawned subprocesses are stopped correctly when Elastic Agent is signalled to stop.

* Swap shutdown order for fleet mode.

* Reorder stop in local_mode. Add to changelog.

(cherry picked from commit a820842)
@blakerouse blakerouse deleted the agent-kill-subprocess branch July 6, 2020 18:05
michalpristas pushed a commit that referenced this pull request Jul 7, 2020
…) (#19683)

* Implement proper shutdown so spawned subprocesses are stopped correctly when Elastic Agent is signalled to stop.

* Swap shutdown order for fleet mode.

* Reorder stop in local_mode. Add to changelog.

(cherry picked from commit a820842)
melchiormoulin pushed a commit to melchiormoulin/beats that referenced this pull request Oct 14, 2020
…tic#19567)

* Implement proper shutdown so spawned subprocesses are stopped correctly when Elastic Agent is signalled to stop.

* Swap shutdown order for fleet mode.

* Reorder stop in local_mode. Add to changelog.
Development

Successfully merging this pull request may close these issues.

[Elastic Agent] when Agent is stopped, Metricbeat & Filebeat are not stopped
3 participants