Add a GRPC listener service for Agent #18827
Conversation
Pinging @elastic/ingest-management (Team:Ingest Management)
@graphaelli I believe you are using gRPC in apm-server; we are planning to update the vendored version on our side. Not sure if that has an impact on you or not?
💔 Tests Failed
This is actually dependent on #18829, because of the usage of …
Thanks for the heads up, broadening to @elastic/apm-server
Thanks for the heads up @ph. I just tested updating apm-server's grpc to v1.29.1, and it appears to be fine.
lot of code but i like it. just a few small questions along the way
statusMessage string
statusConfigIdx uint64
statusTime time.Time
checkinConn bool
could you think about a different name please? i imagine a connection itself under this name, something like isCheckingConnected, hasCheckingConnection...
pendingActions chan *pendingAction
sentActions map[string]*sentAction
actionsConn bool
same as ^^
sentActions: make(map[string]*sentAction),
actionsConn: true,
}
s.lock.Lock()
can it be that in between the get and the set another set happens? in agent we do things sync so it should be fine
func (s *Server) Checkin(server proto.ElasticAgent_CheckinServer) error {
firstCheckinChan := make(chan *proto.StateObserved)
go func() {
// go func will not be leaked, because when the main function
👍 for comments about when go routine is done
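The pattern being praised here, a goroutine whose lifetime is bounded by a single send on a first-value channel, can be sketched in isolation. This is a minimal stand-in, not the PR's actual handler; the names `firstObserved` and `checkins` are illustrative.

```go
package main

import "fmt"

// firstObserved forwards the first value received from checkins and then
// returns. The inner goroutine performs exactly one receive and one send,
// so it exits as soon as the caller takes the value and cannot linger
// past the call.
func firstObserved(checkins <-chan string) string {
	firstChan := make(chan string)
	go func() {
		// one receive, one send, then this goroutine is done
		firstChan <- <-checkins
	}()
	return <-firstChan
}

func main() {
	checkins := make(chan string, 1)
	checkins <- "first-checkin"
	fmt.Println(firstObserved(checkins)) // first-checkin
}
```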
if err != nil {
// failed to send action; add back to channel to retry on re-connect from the client
appState.actionsLock.Unlock()
appState.pendingActions <- pending
food for thought: can out of order application of actions be an issue? [not a blocker]
I don't see why they would be out of order; we append them to the pendingActions channel?
you have A1, A2, A3.
A1 fails, so it is sent to the pendingActions channel, which is buffered up to 100 items. then you will proceed with A2 and A3, and then A1 is there again from the channel
@michalpristas is correct on the ordering. But it's not an issue, because the actions on the client side are not blocking when it comes to reading from the stream.
So even if it goes A2, A3, A1, all 3 will be executed at the same time. Now on the Agent side, PerformAction
is blocking, even though the communication is not.
So the order of actions is still serial on the Agent side:
PerformAction("action1") // block waiting for response
PerformAction("action2") // this won't even be added to the channel until action1 completes, fails, or times out
Thanks for the explanation, 👍
LGTM, added a few comments but testing looks good.
@@ -434,6 +440,7 @@ def detect_license_summary(content):
"MPL-2.0",
"UPL-1.0",
"ISC",
"ELASTIC",
]
SKIP_NOTICE = []
👍
ConfigStateIdx: as.statusConfigIdx, // stopping always informs that the config it has is correct
Config: "",
}
} else if checkin.ConfigStateIdx != as.expectedConfigIdx {
should this check also be covered by the lock above? should we indeed defer the unlock of the struct?
I have added it into the lock as well, thanks for pointing that out.
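The suggestion in this thread, keeping the `ConfigStateIdx` comparison inside the lock and releasing via `defer`, can be sketched with a cut-down stand-in. The struct and field names here are illustrative, not the PR's exact ones.

```go
package main

import (
	"fmt"
	"sync"
)

// appState is a minimal stand-in for the server's per-application state.
type appState struct {
	mu                sync.Mutex
	expectedConfigIdx uint64
}

// configOutdated performs the index comparison inside the critical section,
// using defer so the mutex is released on every return path.
func (as *appState) configOutdated(checkinIdx uint64) bool {
	as.mu.Lock()
	defer as.mu.Unlock()
	return checkinIdx != as.expectedConfigIdx
}

func main() {
	as := &appState{expectedConfigIdx: 2}
	fmt.Println(as.configOutdated(1), as.configOutdated(2)) // true false
}
```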
s := prevStatus
prevMessage := serverApp.statusMessage
message := prevMessage
if serverApp.status == proto.StateObserved_DEGRADED {
love that a lot.
Will we ever try to restart a process if the watchdog doesn't hear from the client for an extended period of time? I am curious what actions would be required to recover from that state. (can be a followup)
i think we can add a hook here and decide later; with forking we have a watchdog for the process to be killed, so it could also handle this callback
Yes, that is what the OnStatusChange callback is for; it is given when the server is started. We will add the logic in that callback to handle when an application is marked FAILED.
Removing the requirement for go 1.14 requires elastic/elastic-agent-client#9 to land so I can vendor it into this PR.
LGTM
* Work on the GRPC server for agent.
* Lots of testing.
* Fix data races.
* Add support for elastic license in generate_notice.py.
* Update to generate server name unique per application.
* Fix go vet on stackdriver metricset using latest protobuf.
* Fix data race issue.
* Fix tests.

(cherry picked from commit 6e91ce4)
…-stage-level
* upstream/master: (30 commits)
  * Add a GRPC listener service for Agent (elastic#18827)
  * Disable host.* fields by default for iptables module (elastic#18756)
  * [WIP] Clarify capabilities of the Filebeat auditd module (elastic#17068)
  * fix: rename file and remove extra separator (elastic#18881)
  * ci: enable JJBB (elastic#18812)
  * Disable host.* fields by default for Checkpoint module (elastic#18754)
  * Disable host.* fields by default for Cisco module (elastic#18753)
  * Update latest.yml testing env to 7.7.0 (elastic#18535)
  * Upgrade k8s.io/client-go and k8s keystore tests (elastic#18817)
  * Add missing Jenkins stages for Auditbeat (elastic#18835)
  * [Elastic Log Driver] Create a config shim between libbeat and the user (elastic#18605)
  * Use indexers and matchers in config when defaults are enabled (elastic#18818)
  * Fix panic on `metricbeat test modules` (elastic#18797)
  * [CI] Fix permissions in MacOSX agents (elastic#18847)
  * [Ingest Manager] When not port are specified and the https is used fallback to 443 (elastic#18844)
  * [Ingest Manager] Fix install service script for windows (elastic#18814)
  * [Metricbeat] Fix getting compute instance metadata with partial zone/region config (elastic#18757)
  * Improve error messages in s3 input (elastic#18824)
  * Add memory metrics into compute googlecloud (elastic#18802)
  * include bucket name when logging error (elastic#18679)
  * ...
What does this PR do?
Adds a GRPC server implementation to the Elastic Agent. This is just the implementation; the server is not actually used by the Elastic Agent yet (coming in a later PR).
The GRPC server maintains the currently reported status of an application (connected or not connected), pushes config updates to the application, and informs the application when to stop. A watchdog is included in the server to ensure that the application checks in every 30 seconds; if it doesn't, the application is marked degraded after the first missed window, and failed after another missed window (60 seconds total). Currently nothing is done at that point; a follow-up PR will add the kill/restart logic.
Actions are also handled by the GRPC server implementation, even across connections and disconnections, including timeouts. An action can time out or be cancelled depending on the application state in the GRPC server.
Usage:
Why is it important?
This is needed because the contract between the Elastic Agent and the spawned applications has flipped: the applications now connect back to the Agent. Support for stopping and performing actions on applications was also required; this PR adds those building blocks.
Checklist
- I have made corresponding changes to the documentation
- I have made corresponding changes to the default configuration files
- I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc

Author's Checklist
How to test this PR locally
go test -race github.com/elastic/beats/v7/x-pack/elastic-agent/pkg/core/server
Related issues