[v2] Use the v2 components runtime as the core of the Elastic Agent #753

Merged
merged 15 commits into from
Jul 26, 2022

Conversation

blakerouse
Contributor

What does this PR do?

Changes the core management of the Elastic Agent to use the coordinator that manages the interaction between the configuration manager, runtime manager, and composable variable manager.
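As a rough illustration of the coordinator idea described above, the sketch below shows one loop that receives configuration changes and variable changes and pushes the rendered result to a runtime manager. Every name in it (`Coordinator`, `Config`, `Vars`, `RuntimeManager`, `render`, `recorder`) is an assumption made for this example and is not taken from the PR; the real coordinator is considerably more involved.

```go
package main

import (
	"context"
	"fmt"
)

// Config and Vars are hypothetical stand-ins for a parsed policy and the
// variables produced by composable providers.
type Config struct{ Inputs []string }
type Vars map[string]string

// RuntimeManager is a hypothetical interface for whatever runs components.
type RuntimeManager interface {
	Update(components []string) error
}

// Coordinator receives configuration and variable changes and pushes the
// resulting component list to the runtime manager.
type Coordinator struct {
	configCh <-chan Config
	varsCh   <-chan Vars
	runtime  RuntimeManager
}

// Run processes changes until ctx is cancelled or an update fails.
func (c *Coordinator) Run(ctx context.Context) error {
	var cfg Config
	var vars Vars
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case cfg = <-c.configCh:
		case vars = <-c.varsCh:
		}
		// Any change on either side triggers a fresh render.
		if err := c.runtime.Update(render(cfg, vars)); err != nil {
			return err
		}
	}
}

// render substitutes variables into input names (greatly simplified).
func render(cfg Config, vars Vars) []string {
	out := make([]string, 0, len(cfg.Inputs))
	for _, in := range cfg.Inputs {
		if v, ok := vars[in]; ok {
			in = v
		}
		out = append(out, in)
	}
	return out
}

// recorder is a RuntimeManager that records every update it receives.
type recorder struct{ updates chan []string }

func (r *recorder) Update(components []string) error {
	r.updates <- components
	return nil
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	configCh := make(chan Config)
	varsCh := make(chan Vars)
	rec := &recorder{updates: make(chan []string, 1)}
	coord := &Coordinator{configCh: configCh, varsCh: varsCh, runtime: rec}
	go coord.Run(ctx)

	configCh <- Config{Inputs: []string{"host"}}
	fmt.Println(<-rec.updates)
	varsCh <- Vars{"host": "localhost"}
	fmt.Println(<-rec.updates)
}
```

Keeping all three inputs in one `select` loop means the rendered state is only ever computed in one place, which is the property the sketch is meant to show.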

At the moment the unit tests will fail; there is still more to do and more code to remove (the operators, programs, old specifications, etc.). Some code has already been removed because doing so helped find the areas that needed fixing and made the remaining removal work easier.

This is being proposed now because it's already a large chunk of changes, and it would be better to get it merged than to keep it local to me.

Why is it important?

Work towards switching to the new v2 design of the Elastic Agent.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • [ ] I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

@blakerouse blakerouse added the Team:Elastic-Agent-Control-Plane label Jul 20, 2022
@blakerouse blakerouse requested a review from a team as a code owner July 20, 2022 20:32
@blakerouse blakerouse self-assigned this Jul 20, 2022
@blakerouse blakerouse requested review from AndersonQ and michel-laterman and removed request for a team July 20, 2022 20:32
@elasticmachine
Contributor

elasticmachine commented Jul 20, 2022

💔 Build Failed

Build stats

  • Start Time: 2022-07-22T13:56:09.966+0000

  • Duration: 4 min 49 sec

Steps errors: 1

check
  • Took 0 min 12 sec
  • Description: make check-ci

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages.

  • run integration tests : Run the Elastic Agent Integration tests.

  • run end-to-end tests : Generate the packages and run the E2E Tests.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@michel-laterman michel-laterman left a comment

Tried to take a look; I think I would benefit from a short walkthrough of what is being moved or replaced (e.g., where do Fleet interactions occur now?).

@@ -35,12 +35,24 @@ type ComponentUnitKey struct {
UnitID string
}

// ComponentVersionInfo provides version information reported by the component.
type ComponentVersionInfo struct {
Contributor

👍 this is super useful for us

Contributor Author

Benefit of v2 protocol!

go func() {
select {
case <-ctx.Done():
case sub.ch <- latestState:
Contributor

Is this being done in a goroutine now that ch is unbuffered?

Contributor Author

Yes.

@@ -7,14 +7,14 @@ package actions
import (
Contributor

I like the flatter namespaces, it will make it much easier to find in the future.

@@ -48,6 +48,12 @@ var (
ErrNoUnit = errors.New("no unit under control of this manager")
)

// ComponentComponentState provides a structure to map a component to current component state.
type ComponentComponentState struct {
Contributor

ComponentComponentState is repetitive, can we rename this?

Contributor Author

Open to suggestions, I was not happy with it either but couldn't think of a better name.

Contributor

ExecutionState or ComponentExecutionState?

@@ -2,27 +2,29 @@
// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

-package fleet
+package noop
Contributor

Is this just used in testing? Should we define a testing package instead to make its usage clear?

Contributor Author

It used to be used not just in testing, but with the changes in this branch it is now only used by tests.

I think we could clean that up in a follow-up. It's a very small file and it isn't hurting anything where it sits at the moment.

}

func (l *policyChange) Ack() error {
if l.action == nil {
Contributor

Some of these linter items look like issue #486.

Contributor Author

Yeah I thought the lint there was very weird. Going to ignore it in the PR, but we should look into fixing it.

go func() {
if err := p.work(); err != nil {
p.log.Debugf("Failed to read configuration, error: %s", err)
func (p *periodic) Run(ctx context.Context) error {
Contributor

like the removal of labels 💯

@@ -112,40 +82,17 @@ func (u *Upgrader) Upgradeable() bool {

// Upgrade upgrades running agent, function returns shutdown callback if some needs to be executed for cases when
// reexec is called by caller.
-func (u *Upgrader) Upgrade(ctx context.Context, a Action, reexecNow bool) (_ reexec.ShutdownCallbackFn, err error) {
+func (u *Upgrader) Upgrade(ctx context.Context, version string, sourceURI string, action *fleetapi.ActionUpgrade) (_ reexec.ShutdownCallbackFn, err error) {
Contributor

The upgrade action may have its own sourceURI param; is that going to be passed as the sourceURI here?

Contributor Author

Yes it gets passed in from the Upgrade action handler from the Fleet Gateway, as well as from the control protocol.

//nolint:errcheck // keeping the same behavior, and making linter happy
u.ackAction(ctx, action)
}
u.log.Warn("upgrading to same version")
Contributor

Did this situation ever occur outside of our testing?

Contributor Author

It should not happen ever.

Comment on lines 121 to 129
state.signal = notify
err := state.provider.Run(state)
if err != nil {
cancel()
return errors.New(err, fmt.Sprintf("failed to run provider '%s'", name), errors.TypeConfig, errors.M("provider", name))
}
go func() {
defer wg.Done()
err := state.provider.Run(state)
if err != nil && !errors.Is(err, context.Canceled) {
err = errors.New(err, fmt.Sprintf("failed to run provider '%s'", name), errors.TypeConfig, errors.M("provider", name))
c.logger.Errorf("%s", err)
}
}()
Contributor

Contributor Author

I think you are correct here; it seems lint caught the one below but not this one. I fixed both.

@blakerouse blakerouse left a comment

Happy to get on a call and walk through any part of the PR.

for _, app := range status.Applications {
fmt.Fprintf(tw, " * %s\t(%s)\n", app.Name, app.Status)
if app.Message == "" {
for _, comp := range state.Components {
Contributor Author

I do want to greatly improve this output. It is something that needs to be done in a follow up. At the moment I got it working just enough for the status output to provide data. JSON and YAML output provide all the detail at the moment.

I want to add the ability to do elastic-agent status --watch that will use something like tcell to provide a stream of status changes for the agent, components, and units.

// DefaultGRPCConfig creates a default server configuration.
func DefaultGRPCConfig() *GRPCConfig {
return &GRPCConfig{
Address: "localhost",
Contributor Author

Yes we do. Would like to do that before the release of v2, but only once we get v2 working.

@@ -4,4 +4,4 @@

package version

-const defaultBeatVersion = "8.4.0"
+const defaultBeatVersion = "8.3.0"
Contributor Author

No it was not. I changed it back. I only did that to test my locally built agent with Elastic Cloud 8.3.

// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

package coordinator
Contributor Author

What's the benefit of a package doc string? I added doc strings to all public structs, interfaces, and functions. It's not clear to me what you would want here.

It's also not something done across the code base at the moment. Is that something we want to start doing for the entire code base?

@michel-laterman michel-laterman left a comment

👍 looking forward to the deletions!

@blakerouse blakerouse merged commit 5acdc40 into elastic:feature-arch-v2 Jul 26, 2022
@blakerouse blakerouse deleted the v2-runtime-start branch July 26, 2022 17:22
if configuration.IsStandalone(cfg.Fleet) {
log.Info("Agent is managed locally")
return newLocal(ctx, log, paths.ConfigFile(), rawConfig, reexec, statusCtrl, uc, agentInfo, tracer)
log.Info("Parsed configuration and determined agent is managed locally")
Contributor

+1 for the improved wording, but should we use the term "standalone" here, since we often use it to describe the behavior of an agent without Fleet?

// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

package coordinator
Contributor

I think we should start doing it. It helps define the vision of a package and (hopefully) reduces the risk of ending up with a catch-all utils or common package, as in Beats, where everything gets pushed into one package.

// ErrNotUpgradable error is returned when upgrade cannot be performed.
ErrNotUpgradable = errors.New(
"cannot be upgraded; must be installed with install sub-command and " +
"running under control of the systems supervisor")
Contributor

Maybe?

Suggested change
-"running under control of the systems supervisor")
+"running under control of the operating system supervisor")

// ReExecManager provides an interface to perform re-execution of the entire agent.
type ReExecManager interface {
ReExec(callback reexec.ShutdownCallbackFn, argOverrides ...string)
}
Contributor

+1 on small interface.
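One practical payoff of a single-method interface like the one quoted above is that a test double takes only a few lines. The sketch below redeclares the interface locally so it compiles on its own. The `ShutdownCallbackFn` signature (`func() error`) is an assumption here, and `fakeReExec`, `restart`, and the `--example-flag` argument are hypothetical names used only for illustration.

```go
package main

import "fmt"

// ShutdownCallbackFn is redeclared for this sketch; its signature is an
// assumption, not copied from the reexec package.
type ShutdownCallbackFn func() error

// ReExecManager mirrors the shape of the interface quoted in the review.
type ReExecManager interface {
	ReExec(callback ShutdownCallbackFn, argOverrides ...string)
}

// fakeReExec records calls instead of re-executing the process.
type fakeReExec struct {
	calls [][]string
}

func (f *fakeReExec) ReExec(callback ShutdownCallbackFn, argOverrides ...string) {
	if callback != nil {
		_ = callback() // run the shutdown hook, ignoring its error in the fake
	}
	f.calls = append(f.calls, argOverrides)
}

// restart is a hypothetical caller that depends only on the interface.
func restart(m ReExecManager) {
	m.ReExec(nil, "--example-flag")
}

func main() {
	fake := &fakeReExec{}
	restart(fake)
	fmt.Println(len(fake.calls), fake.calls[0])
}
```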

// UpgradeManager provides an interface to perform the upgrade action for the agent.
type UpgradeManager interface {
// Upgradeable returns true if can be upgraded.
Upgradeable() bool
Contributor

Since this is a predicate maybe add a verb prefix?

Suggested change
-Upgradeable() bool
+// IsUpgradeable returns true if can be upgraded.
+IsUpgradeable() bool

log *logger.Logger
handlers actionHandlers
def actions.Handler
}

// New creates a new action dispatcher.
-func New(ctx context.Context, log *logger.Logger, def actions.Handler) (*ActionDispatcher, error) {
+func New(log *logger.Logger, def actions.Handler) (*ActionDispatcher, error) {
Contributor

+1 bad pattern to keep the context as a field.

-func (f *fleetGateway) worker() {
+func (f *fleetGateway) Run(ctx context.Context) error {
// Backoff implementation doesn't support the use of a context [cancellation] as the shutdown mechanism.
// So we keep a done channel that will be closed when the current context is shutdown.
Contributor

/facepalm, I did this :(

cfg.AppConfig[p.Identifier()+"_"+rk] = p.Configuration()
}
}
*/

Contributor

Disabled to re-enable it later, right @blakerouse?

// DefaultGRPCConfig creates a default server configuration.
func DefaultGRPCConfig() *GRPCConfig {
return &GRPCConfig{
Address: "localhost",
Contributor

Yes the idea is to remove the network overhead and the possibility of port conflicts.

@ph
Contributor

ph commented Jul 26, 2022

@blakerouse Good job on this. You merged it before I finished the review; nothing major on my side, I just added a few suggestions.

blakerouse added a commit that referenced this pull request Nov 9, 2022
* [v2] Add v2 component specification and validation. (#502)

* Add v2 component specification and validation.

* Remove i386 and ppc64el. Update spec for osquerybeat.

* Remove windows/arm64.

* Add component spec command to validate component specifications. (#510)

* [v2] Calculate the expected runtime components from policy (#550)

* Upgrade elastic-agent-client.

* Calculate the expected running components and units from the v2 specification and the current policy.

* Update NOTICE.txt.

* Fix lint from servicable main.go.

* Update GRPC for the agent CLI control protocol. Fix name collision issue.

* Run go mod tidy.

* Fix more lint issues.

* Fix fmt.

* Update logic to always compute model, with err set on each component. Check runtime preventions at model generation time.

* Fix items from code review, and issue on windows test runner.

* Try to cleanup duplication in tests.

* Try 2 of fixing duplicate lint failure, that is not really a duplicate.

* Re-run mage fmt.

* Lint fixes for linux, why different?

* Fix nolint comment.

* Add comment.

* Initial Flat Structure (#544)

Flattening the structure and removing download/install steps for programs. 

Co-authored-by: Aleksandr Maus <[email protected]>

* Generate checksum file for components (#604)

* generating checksum?

* yaml output

* Update dev-tools/mage/common.go

Co-authored-by: Michel Laterman <[email protected]>

* review

* ioutil removal from magefile

Co-authored-by: Michel Laterman <[email protected]>

* V2 Runtime Component Manager (#645)

* Add runtime for command v2 components.

* Fix imports.

* Add tests for watching checkins.

* Fix lint and move checkin period to a configurable timeout.

* Fix tests now that checkin timeout needs to be defined.

* Fix code review and lint.

* [v2] Use the v2 components runtime as the core of the Elastic Agent (#753)

* Add runtime for command v2 components.

* Fix imports.

* Add tests for watching checkins.

* Fix lint and move checkin period to a configurable timeout.

* Fix tests now that checkin timeout needs to be defined.

* Fix code review and lint.

* Work on actually running the v2 runtime.

* Work on switching to the v2 runtime.

* More work on switching to v2 runtime.

* Cleanup some imports.

* More import cleanups.

* Add TODO to FleetServerComponentModifier.

* Remove outdated managed_mode_test.go.

* Fixes from code review and lint.

* [v2] Delete unused code from refactor (#777)

* Add runtime for command v2 components.

* Fix imports.

* Add tests for watching checkins.

* Fix lint and move checkin period to a configurable timeout.

* Fix tests now that checkin timeout needs to be defined.

* Fix code review and lint.

* Work on actually running the v2 runtime.

* Work on switching to the v2 runtime.

* More work on switching to v2 runtime.

* Cleanup some imports.

* More import cleanups.

* Add TODO to FleetServerComponentModifier.

* More cleanup and removals.

* Remove more.

* Delete more unused code.

* Clean up step_download from refactor.

* Remove outdated managed_mode_test.go.

* Fixes from code review and lint.

* Fix lint and missing errcheck.

* [v2] Delete more unused code from v2 transition (#790)

* Remove more unused code that was including already deleted code.

* Fix all unit tests.

* Fix lint.

* More lint fixes, maybe this time?

* More lint.... really?

* Update NOTICE.txt.

* [v2] Merge July 27th main into v2 feature branch (#789)

* [Automation] Update elastic stack version to 8.4.0-40cff009 for testing (#557)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-5e6770b1 for testing (#564)

Co-authored-by: apmmachine <[email protected]>

* Fix regression and use comma separated values (#560)

Fix regression from https://github.com/elastic/elastic-agent/pull/509

* Change in Jenkinsfile will trigger k8s run (#568)

* [Automation] Update elastic stack version to 8.4.0-da5a1c6d for testing (#573)

Co-authored-by: apmmachine <[email protected]>

* Add `@metadata.input_id` and `@metadata.stream_id` when injecting streams (#527)

These 2 values are going to be used in the shipper to identify where an
event came from in order to apply processors accordingly.

Also, added test cases for the processor to verify the change and updated test cases with the new processor.

* Add filemod times to contents of diagnostics collect command (#570)

* Add filemod times to contents of diagnostics collect command

Add filemod times to the files and directories in the zip archive.
Log files (and sub dirs) will use the modtime returned by the fileinfo
for the source. Others will use the timestamp from when the zip is
created.

* Fix linter

* [Automation] Update elastic stack version to 8.4.0-b13123ee for testing (#581)

Co-authored-by: apmmachine <[email protected]>

* Fix Agent upgrade 8.2->8.3 (#578)

* Fix Agent upgrade 8.2->8.3
* Improve the upgrade encryption handling. Add .yml files cleanup.
* Rollback ActionUpgrade to action_id, add MarkerActionUpgrade adapter struct for marker serialization compatibility

* Update containerd (#577)

* [Automation] Update elastic stack version to 8.4.0-4fe26f2a for testing (#591)

Co-authored-by: apmmachine <[email protected]>

* Set explicit ExitTimeOut for MacOS agent launchd plist (#594)

* Set explicit ExitTimeOut for MacOS agent launchd plist

* [Automation] Update elastic stack version to 8.4.0-2e32a640 for testing (#599)

Co-authored-by: apmmachine <[email protected]>

* ci: enable build notifications as GitHub issues (#595)

* status identifies failing component, fleet gateway may report degraded, liveness endpoint added (#569)

* Add liveness endpoint

Add /liveness route to metrics server. This route will report the status
from pkg/core/status. fleet-gateway will now report a degraded state if
a checkin fails. This may not propagate to fleet-server, as a failed
checkin means communications between the agent and the server are not
working. It may also lead to the server reporting degraded for up to 30s
(fleet-server polling time) when the agent is able to successfully
connect.

* linter fix

* add nolint directive

* Linter fix

* Review feedback, add doc strings

* Rename noop controller file to _test file

* [Automation] Update elastic stack version to 8.4.0-722a7d79 for testing (#607)

Co-authored-by: apmmachine <[email protected]>

* ci: enable flaky test detector (#605)

* [Automation] Update elastic stack version to 8.4.0-210dd487 for testing (#620)

Co-authored-by: apmmachine <[email protected]>

* mergify: remove backport automation for non active branches (#615)

* chore: use elastic-agent profile to run the E2E tests (#610)

* [Automation] Update elastic stack version to 8.4.0-a6aa9f3b for testing (#631)

Co-authored-by: apmmachine <[email protected]>

* add macros pointing to new agent's repo and fix old macro calls (#458)

* Add mount of /etc/machine-id for managed Agent in k8s (#530)

* Set hostPID=true for managed agent in k8s (#528)

* Set hostPID=true for managed agent in k8s

* Add comment on hostPID.

* [Automation] Update elastic stack version to 8.4.0-86cc80f3 for testing (#648)

Co-authored-by: apmmachine <[email protected]>

* Update elastic-agent-libs version: includes restriction on default VerificationMode to `full` (#521)

* update version

* mage fmt update

* update dependency

* update changelog

* redact sensitive information in diagnostics collect command (#566)

* Support Cloudbeat regex input type  (#638)

* support input type with regex

* Update supported.go

* Changing the regex to support backward compatible

* Disable flaky test download test (#641)

* [Automation] Update elastic stack version to 8.4.0-3d206b5d for testing (#656)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-3ad82aa8 for testing (#661)

Co-authored-by: apmmachine <[email protected]>

* jjbb: exclude allowed branches, tags and PRs (#658)

cosmetic change in the description and boolean based

* Update elastic-agent-project-board.yml (#649)

* ci: fix labels that clashes with the Orka workers (#659)

* [Automation] Update elastic stack version to 8.4.0-03bd6f3f for testing (#668)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-533f1e30 for testing (#675)

Co-authored-by: apmmachine <[email protected]>

* Osquerybeat: Fix osquerybeat is not running with logstash output (#674)

* [Automation] Update elastic stack version to 8.4.0-d0a4da44 for testing (#684)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-dd98ded4 for testing (#703)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-164d9a10 for testing (#705)

Co-authored-by: apmmachine <[email protected]>

* Add missing license headers (#711)

* [Automation] Update elastic stack version to 8.4.0-00048b66 for testing (#713)

Co-authored-by: apmmachine <[email protected]>

* Allow - in eql variable names (#710)

* fix to allow dashes in variable names in EQL expressions

Extend eql to allow the '-' char to appear in variable names, e.g.,
${data.some-var}, and add test cases to eql, the transpiler, and
the k8s provider to verify this works. Note that the bug was caused by
the EQL limitation; the other test cases were added when attempting to
find it.

* Regenerate grammar with antlr 4.7.1, add CHANGELOG

* Fix linter issue

* Fix typo

* Fix transpiler to allow : in dynamic variables. (#680)

Fix transpiler regex to allow ':' characters in dynamic variables so
that users can input "${dynamic.lookup|'fallback.here'}".

Co-authored-by: Aleksandr Maus <[email protected]>

* Fix for the filebeat spec file picking up packetbeat inputs (#700)

* Reproduce filebeat picking up packetbeat inputs

* Filebeat: filter inputs as first input transform.

Move input filtering to be the first input transformation that occurs in
the filebeat spec file. Fixes
https://github.com/elastic/elastic-agent/issues/427.

* Update changelog.

* [Automation] Update elastic stack version to 8.4.0-3cd57abb for testing (#724)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-a324b98b for testing (#727)

Co-authored-by: apmmachine <[email protected]>

* ci: run on MacOS12 (#696)

* [Automation] Update elastic stack version to 8.4.0-31315ca3 for testing (#732)

Co-authored-by: apmmachine <[email protected]>

* fix typo on package command (#734)

This commit fixes the typo in the package command on the README.md.

* Allow / to be used in variable names (#718)

* Allow the / character to be used in variable names.

Allow / to be used in variable names from dynamic providers and eql
expressions. Ensure that k8s providers can provide variables with
slashes in their names.

* run antlr4

* Fix tests

* Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases (#701)

* Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases

* Migrates vault directory on linux and windows to the top directory of the
  agent, so it can be shared without needing the upgrade handler call,
  like for example with side-by-side install/upgrade from .rpm/.deb
* Extended vault to allow read-only open, useful when the vault at particular location needs to be only read not created.

* Correct the typo in the log messages

* Update lint flagged function comment with 'unused', was flagged with 'deadcode' on the previous run

* Address code review feedback

* Add missing import for linux utz

* Change vault path from Top() to Config(), this a better location, next to fleet.enc based on the install/upgrade testing with .rpm/.deb installs

* Fix the missing state migration for .rpm/.deb upgrade. The post install script now performs the migration and creates the symlink after that.

* Fix typo in the postinstall script

* Update the vault migration code, add the agent configuration match check with the agent secret

* [Automation] Update elastic stack version to 8.4.0-31269fd2 for testing (#746)

Co-authored-by: apmmachine <[email protected]>

* wrap errors and fix some docs typo and convention (#743)

* automate the ironbank docker context generation (#679)

* Update README.md

Adding M1 variable to export to be able to build AMD images

* fix flaky (#730)

* Add filestream ID on standalone kubernetes manifest (#742)

This commit adds unique IDs for the filestream inputs used by the
Kubernetes integration in the Elastic-Agent standalone
Kubernetes configuration/manifest file.

* Alter github action to run on different OSs (#769)

Alter the linter action to run on different OSs instead of on linux with
the $GOOS env var.

* [Automation] Update elastic stack version to 8.4.0-d058e92f for testing (#771)

Co-authored-by: apmmachine <[email protected]>

* elastic-agent manifests: add comments; add cloudnative team as a codeowner for the k8s manifests (#708)

* managed elastic-agent: add comments; add cloudnative team as a codeowner for the k8s manifests

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add comments to the standalone elastic-agent, similar to the documentation we have https://www.elastic.co/guide/en/fleet/current/running-on-kubernetes-standalone.html

Signed-off-by: Tetiana Kravchenko <[email protected]>

* Apply suggestions from code review

Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>

* remove comment for FLEET_ENROLLMENT_TOKEN; use Needed everywhere instead of Required

Signed-off-by: Tetiana Kravchenko <[email protected]>

* rephrase regarding accessing kube-state-metrics when using third-party tools, like kube-rbac-proxy

Signed-off-by: Tetiana Kravchenko <[email protected]>

* run make check

Signed-off-by: Tetiana Kravchenko <[email protected]>

* keep manifests in sync to pass ci check

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add info on where to find FLEET_URL and FLEET_ENROLLMENT_TOKEN

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add links to elastic-agent documentation

Signed-off-by: Tetiana Kravchenko <[email protected]>

* update comment on FLEET_ENROLLMENT_TOKEN

Signed-off-by: Tetiana Kravchenko <[email protected]>

Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>

* [Elastic-Agent] Added source uri reloading (#686)

* Update will cleanup unneeded artifacts. (#752)

* Update will cleanup unneeded artifacts.

The update process will clean up unneeded artifacts. When an update
starts, all artifacts that do not have the current version number in their
name will be removed. If artifact retrieval fails, downloaded artifacts
are removed. On a successful upgrade, all contents of the downloads dir
will be removed.
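
The selection rule described above (remove every artifact whose name does not contain the current version) can be sketched as follows. This is a minimal illustration, not the agent's actual code: `staleArtifacts` and the file names are hypothetical, and a plain substring match is assumed.

```go
package main

import (
	"fmt"
	"strings"
)

// staleArtifacts returns the artifact names that do not contain
// currentVersion. The name and the substring match are assumptions
// made for illustration only.
func staleArtifacts(names []string, currentVersion string) []string {
	var stale []string
	for _, name := range names {
		if !strings.Contains(name, currentVersion) {
			stale = append(stale, name)
		}
	}
	return stale
}

func main() {
	// Hypothetical contents of a downloads directory.
	names := []string{
		"filebeat-8.3.0-linux-x86_64.tar.gz",
		"filebeat-8.4.0-linux-x86_64.tar.gz",
	}
	for _, name := range staleArtifacts(names, "8.4.0") {
		// A real cleanup would call os.Remove (or os.RemoveAll) here.
		fmt.Println("would remove:", name)
	}
}
```

Keeping the selection separate from the deletion makes the rule easy to test without touching the filesystem.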

* Clean up linter warnings

* Wrap errors

* cleanup tests

* Fix passed version

* Use os.RemoveAll

* ci: propagate e2e-testing errors (#695)

* [Release] add-backport-next (#784)

* Update go.sum.

* Fix upgrade.

* Fix the upgrade artifact reload.

* Fix lint in coordinator.

Co-authored-by: apmmachine <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Pier-Hugues Pellerin <[email protected]>
Co-authored-by: Denis Rechkunov <[email protected]>
Co-authored-by: Michel Laterman <[email protected]>
Co-authored-by: Aleksandr Maus <[email protected]>
Co-authored-by: Victor Martinez <[email protected]>
Co-authored-by: Manuel de la Peña <[email protected]>
Co-authored-by: Anderson Queiroz <[email protected]>
Co-authored-by: Daniel Araujo Almeida <[email protected]>
Co-authored-by: Mariana Dima <[email protected]>
Co-authored-by: ofiriro3 <[email protected]>
Co-authored-by: Julien Lind <[email protected]>
Co-authored-by: Craig MacKenzie <[email protected]>
Co-authored-by: Tiago Queiroz <[email protected]>
Co-authored-by: Pierre HILBERT <[email protected]>
Co-authored-by: Tetiana Kravchenko <[email protected]>
Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>
Co-authored-by: Michal Pristas <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>

* [v2] Fix inspect command (#805)

* Write the inspect command for v2.

* Fix lint.

* Fix code review. Load inputs from inputs.d for inspect.

* Fix lint.

* Refactor to use errgroup.

* Remove unused struct.

* Expand check-in payload for V2 (#916)

* Expand check-in payload for V2

* Make linter happy

* [v2] Update protocol to use new UnitExpectedConfig. (#850)

* Update v2 protocol to use new UnitExpectedConfig.

* Cleanup.

* Update NOTICE.txt. Lint dupl.

* Fix code review. Ensure type is set to real type and not alias.

* Fix action dispatching that was using ActionType instead of InputType as before (#973)

* Fix bootstrapping a Fleet Server with v2. (#1010)

* Fix bootstrapping a Fleet Server with v2.

* Fix lint.

* Fix tests.

* Query just related files on build (#1045)

* Update main to 8.5.0 (#793) (#1050)

(cherry picked from commit 317e03116aa919d69be97242207ad11a28c826aa)

Co-authored-by: Pier-Hugues Pellerin <[email protected]>

* Create archive directory if it doesn't exist. (#1058)

On an M1 Mac rename seems to fail if the containing directories do not
already exist.

* fixed docker build (#1105)

* V2 command work dir (#1061)

* Fix v2 work directory for command. Add permission check for execution. Add determining root into runtime prevention.

* Add writeable by group and other in check.

* Fix restart and stopping issues in command runtime for failing binaries.

* Fix issue in endpoint spec. Allow an input to not require an ID, but that ID must be unique.

* Remove unused transpiler rules and steps.

* Fix test.

* Fix workDir for windows.

* Reset to checkin period.

* Fix test and code review issues.

* Add extra log message in unit test.

* More fixes from code review.

* Fix test.

* [v2] Move queue management to dispatcher (#1109)

* Move queue management to dispatcher

Move queue management actions to the dispatcher from the fleet-server
in order to help with future work to add a retry mechanism. Add a
PersistedQueue type which wraps the ActionQueue to make persisting the
queue simpler for the consumer.

* Refactor ActionQueue

Refactor ActionQueue to only export methods that are used by consumers.
The priority queue implementation has been changed to an unexported
type. Persistency has been added and the persistedqueue type has been
removed.

* Rename persistedQueue interface to priorityQueue

* Review feedback

* failing to save queue will log message

* Change gateway to use copy

* Fix [V2]: Elastic Agent Install is broken. (#1331)

* Fix agent shutdown on SIGINT (#1258)

* Fix agent shutdown on SIGINT

* Update runtime_comm expected check-in handling to eliminate the lock in failure cases

* Remove some buffered channels that are no longer blocking shutdown after the runtime comms fix commit

* Fix the recursive lock on itself in the runtime loop, refactored code to make it cleaner

* Fix the comment typo

* Fixed managed_mode coordination with fleet gateway. Now the gateway errors reading loop waits until gateway exits. Otherwise the gateway shutdown out of sequence blocks on errCh

* Fix linter

* Fix make check-ci

* Fix runner Err() possible race

* Update the runner DoneWithTimeout implementation

* Address code review comments

* [v2] Re-enable diagnostics for Elastic Agent and all components (#1140)

* Add diagnostics back to v2.

* Update pkg/component/runtime/manager.go

Co-authored-by: Anderson Queiroz <[email protected]>

Co-authored-by: Anderson Queiroz <[email protected]>

* Check and create downloads dir before using (#1410)

* [v2] Add upgrade action retry (#1219)

* Add upgrade action retry

Add the ability for the agent to schedule and retry upgrade actions.

The fleetapi actions now define ScheduledAction and RetryableAction interfaces to eliminate the need for stub methods on all the different action types. The action queue has been changed to operate on scheduled actions. Serialization tests now ensure that the retry attribute needed by retryable actions works.

Decouple the dispatcher from the gateway: the dispatcher has an errors channel that returns an error for the list of actions that was sent. The gateway has an Actions method that can be used to get the list of actions from the gateway. The managed_mode config manager will link these two components.

If a handler returns an error and the action is a RetryableAction, the dispatcher will attempt to schedule a retry. The dispatcher will also ack the action to fleet-server and indicate if it will be retried or has failed (or has been received normally).
For the acker, if a RetryableAction has an error and an attempt count that is greater than 0 it will be acked as retried. If it has an error and an attempt count less than 1 it will be acked as failed.

Co-authored-by: Blake Rouse <[email protected]>

* V1 metrics monitoring for V2 (#1487)

V1 metrics monitoring for V2 (#1487)

* [v2] Merge main on Oct. 18 (#1557)

* [Automation] Update elastic stack version to 8.4.0-40cff009 for testing (#557)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-5e6770b1 for testing (#564)

Co-authored-by: apmmachine <[email protected]>

* Fix regression and use comma separated values (#560)

Fix regression from https://github.com/elastic/elastic-agent/pull/509

* Change in Jenkinsfile will trigger k8s run (#568)

* [Automation] Update elastic stack version to 8.4.0-da5a1c6d for testing (#573)

Co-authored-by: apmmachine <[email protected]>

* Add `@metadata.input_id` and `@metadata.stream_id` when injecting streams (#527)

These 2 values are going to be used in the shipper to identify where an
event came from in order to apply processors accordingly.

Also, added test cases for the processor to verify the change and updated test cases with the new processor.

* Add filemod times to contents of diagnostics collect command (#570)

* Add filemod times to contents of diagnostics collect command

Add filemod times to the files and directories in the zip archive.
Log files (and sub dirs) will use the modtime returned by the fileinfo
for the source. Others will use the timestamp from when the zip is
created.

* Fix linter

* [Automation] Update elastic stack version to 8.4.0-b13123ee for testing (#581)

Co-authored-by: apmmachine <[email protected]>

* Fix Agent upgrade 8.2->8.3 (#578)

* Fix Agent upgrade 8.2->8.3
* Improve the upgrade encryption handling. Add .yml files cleanup.
* Rollback ActionUpgrade to action_id, add MarkerActionUpgrade adapter struct for marker serialization compatibility

* Update containerd (#577)

* [Automation] Update elastic stack version to 8.4.0-4fe26f2a for testing (#591)

Co-authored-by: apmmachine <[email protected]>

* Set explicit ExitTimeOut for MacOS agent launchd plist (#594)

* Set explicit ExitTimeOut for MacOS agent launchd plist

* [Automation] Update elastic stack version to 8.4.0-2e32a640 for testing (#599)

Co-authored-by: apmmachine <[email protected]>

* ci: enable build notifications as GitHub issues (#595)

* status identifies failing component, fleet gateway may report degraded, liveness endpoint added (#569)

* Add liveness endpoint

Add /liveness route to metrics server. This route will report the status
from pkg/core/status. fleet-gateway will now report a degraded state if
a checkin fails. This may not propagate to fleet-server as a failed
checkin means communications between the agent and the server are not
working. It may also lead to the server reporting degraded for up to 30s
(fleet-server polling time) when the agent is able to successfully
connect.

* linter fix

* add nolint directive

* Linter fix

* Review feedback, add doc strings

* Rename noop controller file to _test file

* [Automation] Update elastic stack version to 8.4.0-722a7d79 for testing (#607)

Co-authored-by: apmmachine <[email protected]>

* ci: enable flaky test detector (#605)

* [Automation] Update elastic stack version to 8.4.0-210dd487 for testing (#620)

Co-authored-by: apmmachine <[email protected]>

* mergify: remove backport automation for non active branches (#615)

* chore: use elastic-agent profile to run the E2E tests (#610)

* [Automation] Update elastic stack version to 8.4.0-a6aa9f3b for testing (#631)

Co-authored-by: apmmachine <[email protected]>

* add macros pointing to new agent's repo and fix old macro calls (#458)

* Add mount of /etc/machine-id for managed Agent in k8s (#530)

* Set hostPID=true for managed agent in k8s (#528)

* Set hostPID=true for managed agent in k8s

* Add comment on hostPID.

* [Automation] Update elastic stack version to 8.4.0-86cc80f3 for testing (#648)

Co-authored-by: apmmachine <[email protected]>

* Update elastic-agent-libs version: includes restriction on default VerificationMode to `full` (#521)

* update version

* mage fmt update

* update dependency

* update changelog

* redact sensitive information in diagnostics collect command (#566)

* Support Cloudbeat regex input type  (#638)

* support input type with regex

* Update supported.go

* Changing the regex to support backward compatibility

* Disable flaky test download test (#641)

* [Automation] Update elastic stack version to 8.4.0-3d206b5d for testing (#656)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-3ad82aa8 for testing (#661)

Co-authored-by: apmmachine <[email protected]>

* jjbb: exclude allowed branches, tags and PRs (#658)

cosmetic change in the description and boolean based

* Update elastic-agent-project-board.yml (#649)

* ci: fix labels that clash with the Orka workers (#659)

* [Automation] Update elastic stack version to 8.4.0-03bd6f3f for testing (#668)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-533f1e30 for testing (#675)

Co-authored-by: apmmachine <[email protected]>

* Osquerybeat: Fix osquerybeat is not running with logstash output (#674)

* [Automation] Update elastic stack version to 8.4.0-d0a4da44 for testing (#684)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-dd98ded4 for testing (#703)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-164d9a10 for testing (#705)

Co-authored-by: apmmachine <[email protected]>

* Add missing license headers (#711)

* [Automation] Update elastic stack version to 8.4.0-00048b66 for testing (#713)

Co-authored-by: apmmachine <[email protected]>

* Allow - in eql variable names (#710)

* fix to allow dashes in variable names in EQL expressions

Extend eql to allow the '-' char to appear in variable names, e.g.,
${data.some-var}, and add test cases to eql, the transpiler, and
the k8s provider to verify this works. Note that the bug was caused by
the EQL limitation; the other test cases were added when attempting to
find it.

* Regenerate grammar with antlr 4.7.1, add CHANGELOG

* Fix linter issue

* Fix typo

* Fix transpiler to allow : in dynamic variables. (#680)

Fix transpiler regex to allow ':' characters in dynamic variables so
that users can input "${dynamic.lookup|'fallback.here'}".

Co-authored-by: Aleksandr Maus <[email protected]>

* Fix for the filebeat spec file picking up packetbeat inputs (#700)

* Reproduce filebeat picking up packetbeat inputs

* Filebeat: filter inputs as first input transform.

Move input filtering to be the first input transformation that occurs in
the filebeat spec file. Fixes
https://github.com/elastic/elastic-agent/issues/427.

* Update changelog.

* [Automation] Update elastic stack version to 8.4.0-3cd57abb for testing (#724)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-a324b98b for testing (#727)

Co-authored-by: apmmachine <[email protected]>

* ci: run on MacOS12 (#696)

* [Automation] Update elastic stack version to 8.4.0-31315ca3 for testing (#732)

Co-authored-by: apmmachine <[email protected]>

* fix typo on package command (#734)

This commit fixes the typo in the package command on the README.md.

* Allow / to be used in variable names (#718)

* Allow the / character to be used in variable names.

Allow / to be used in variable names from dynamic providers and eql
expressions. Ensure that k8s providers can provide variables with
slashes in their names.

* run antlr4

* Fix tests

* Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases (#701)

* Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases

* Migrates vault directory on linux and windows to the top directory of the
  agent, so it can be shared without needing the upgrade handler call,
  like for example with side-by-side install/upgrade from .rpm/.deb
* Extended vault to allow read-only open, useful when the vault at a particular location only needs to be read, not created.

* Correct the typo in the log messages

* Update lint flagged function comment with 'unused', was flagged with 'deadcode' on the previous run

* Address code review feedback

* Add missing import for linux utz

* Change vault path from Top() to Config(); this is a better location, next to fleet.enc, based on the install/upgrade testing with .rpm/.deb installs

* Fix the missing state migration for .rpm/.deb upgrade. The post install script now performs the migration and creates the symlink after that.

* Fix typo in the postinstall script

* Update the vault migration code, add the agent configuration match check with the agent secret

* [Automation] Update elastic stack version to 8.4.0-31269fd2 for testing (#746)

Co-authored-by: apmmachine <[email protected]>

* wrap errors and fix some docs typo and convention (#743)

* automate the ironbank docker context generation (#679)

* Update README.md

Adding M1 variable to export to be able to build AMD images

* fix flaky (#730)

* Add filestream ID on standalone kubernetes manifest (#742)

This commit adds unique IDs for the filestream inputs used by the
Kubernetes integration in the Elastic-Agent standalone
Kubernetes configuration/manifest file.

* Alter github action to run on different OSs (#769)

Alter the linter action to run on different OSs instead of on linux with
the $GOOS env var.

* [Automation] Update elastic stack version to 8.4.0-d058e92f for testing (#771)

Co-authored-by: apmmachine <[email protected]>

* elastic-agent manifests: add comments; add cloudnative team as a codeowner for the k8s manifests (#708)

* managed elastic-agent: add comments; add cloudnative team as a codeowner for the k8s manifests

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add comments to the standalone elastic-agent, similar to the documentation we have https://www.elastic.co/guide/en/fleet/current/running-on-kubernetes-standalone.html

Signed-off-by: Tetiana Kravchenko <[email protected]>

* Apply suggestions from code review

Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>

* remove comment for FLEET_ENROLLMENT_TOKEN; use Needed everywhere instead of Required

Signed-off-by: Tetiana Kravchenko <[email protected]>

* rephrase regarding accessing kube-state-metrics when using third-party tools, like kube-rbac-proxy

Signed-off-by: Tetiana Kravchenko <[email protected]>

* run make check

Signed-off-by: Tetiana Kravchenko <[email protected]>

* keep manifests in sync to pass ci check

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add info on where to find FLEET_URL and FLEET_ENROLLMENT_TOKEN

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add links to elastic-agent documentation

Signed-off-by: Tetiana Kravchenko <[email protected]>

* update comment on FLEET_ENROLLMENT_TOKEN

Signed-off-by: Tetiana Kravchenko <[email protected]>

Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>

* [Elastic-Agent] Added source uri reloading (#686)

* Update will cleanup unneeded artifacts. (#752)

* Update will cleanup unneeded artifacts.

The update process will clean up unneeded artifacts. When an update
starts, all artifacts that do not have the current version number in their
name will be removed. If artifact retrieval fails, downloaded artifacts
are removed. On a successful upgrade, all contents of the downloads dir
will be removed.

* Clean up linter warnings

* Wrap errors

* cleanup tests

* Fix passed version

* Use os.RemoveAll

* ci: propagate e2e-testing errors (#695)

* [Release] add-backport-next (#784)

* Update main to 8.5.0 (#793)

* [Automation] Update go release version to 1.17.12 (#726)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-60171339 for testing (#799)

Co-authored-by: apmmachine <[email protected]>

* update dependency elastic/go-structform from v0.0.9 to v0.0.10 (#802)

Signed-off-by: Florian Lehner <[email protected]>

* Fix unpacking of artifact config (#776)

Fix unpacking of artifact config (#776)

* [Automation] Update elastic stack version to 8.5.0-c54c3404 for testing (#826)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-7dbc10f8 for testing (#833)

Co-authored-by: apmmachine <[email protected]>

* Fix RPM/DEB clean install (#816)

* Fix RPM/DEB clean install

* Improve the post install script

* Do not try to copy the state files if the agent directory is the same,
  this causes the error.
* Check the existence of the symlink instead of the file it is pointing to
  for the state file migration.

* Update check for symlink existence for the cases where the symlink points to a non-existent file

* fix path for auto generated spec file (#859)

Signed-off-by: Florian Lehner <[email protected]>

* Reload downloader client on config change (#848)

Reload downloader client on config change (#848)

* Bundle elastic-agent.app for MacOS, needed to be able to enable the  … (#714)

* Bundle elastic-agent.app for MacOS, needed to be able to enable the  Full Disk Access

* Calm down the linter

* Fix pathing for windows unit test

* crossbuild: add fix to set ulimit for debian images (#856)

Signed-off-by: Florian Lehner <[email protected]>

* [Heartbeat] Cleanup docker install / always add playwright deps (#764)

This is the agent counterpart to elastic/beats#32122

Refactors Dockerfile handling of synthetics deps to rely on playwright install-deps rather than us manually keeping up to date with those. This should fix issues with newer playwrights needing additional deps.

This also cleans up the Dockerfile a good amount, and fixes indentation. Finally, this removes the unused Dockerfile.elastic-agent.tmpl file since agent is now its own repo. It also cleans up some other metadata that no longer does anything.

No changelog is specified because no user facing changes are present.

* [Automation] Update elastic stack version to 8.5.0-41aadc32 for testing (#889)

Co-authored-by: apmmachine <[email protected]>

* Fix/panic with composable renderer (#823)

* Fix a panic with wg passed to the composable object

In the code that retrieves the variables from the configuration files we
need to pass an execution callback, which is called in a
goroutine. The callback can be executed multiple times until the
composable renderer is stopped. A problem in the code caused the
callback to be called multiple times, which drove the waitgroup's
internal counter to a negative value.

This commit changes the behavior: it starts the composable renderer and
gives it a callback; when the callback receives the variables, it stops the
composable's Run method using the context.

This ensures that the callback is called a single time and that the
variables are correctly retrieved.

Fixes: #806

* [Automation] Update go release version to 1.18.5 (#832)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-60a4c029 for testing (#899)

Co-authored-by: apmmachine <[email protected]>

* Add control-plane toleration to Agent K8S manifests. (#864)

* Add toleration to elastic-agent Kubernetes manifests.

The toleration with key node-role.kubernetes.io/control-plane is set to replace
the deprecated toleration with key node-role.kubernetes.io/master which will be
removed by Kubernetes v1.25

* Remove outdated "master" node terminology.

* install mage with go install (#936)

* Cloudnative ci automation (#837)

This commit provides the relevant Jenkins CI automation to open pull requests to the kibana GitHub repository in order to keep the Cloud-Native team's manifests in sync with the manifests that are used in the Fleet UI.

For full information check #706

Updated .ci/Jenkins file that is triggered upon PR requests of /elastic-agent/deploy/kubernetes/* changes
Updated Makefile to add functionality needed to create the extra files for the new prs to kibana remote repository

* Reduce memory footprint by reordering struct elements (#804)

* Reduce memory footprint by reordering struct elements

* rename struct element for linter

Signed-off-by: Florian Lehner <[email protected]>

Signed-off-by: Florian Lehner <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-6b9f92c0 for testing (#948)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-0616acda for testing (#963)

Co-authored-by: apmmachine <[email protected]>

* Clarify that this repo is not only docs (#969)

* Add Filebeat lumberjack input to spec (#959)

Make the lumberjack input available from Agent.

Relates: https://github.com/elastic/beats/pull/32175

* [Automation] Update elastic stack version to 8.5.0-dd6f2bb0 for testing (#978)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-feb644de for testing (#988)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-7783a03c for testing (#1004)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-17b8a62d for testing (#1014)

Co-authored-by: apmmachine <[email protected]>

* update ironbank image product name (#1009)

This is required to automate the creation of the ironbank merge requests as the ubireleaser is using this field to compute the elastic-agent artifact url. 

For example it is now trying to retrieve https://artifacts.elastic.co/downloads/beats/elastic-agent-8.4.0-linux-x86_64.tar.gz instead of https://artifacts.elastic.co/downloads/beats/elastic-agent/elastic-agent-8.4.0-linux-x86_64.tar.gz

* ci: add extended support for windows (#683)

* [Automation] Update elastic stack version to 8.5.0-9aed3b11 for testing (#1030)

Co-authored-by: apmmachine <[email protected]>

* Cloudnative ci automation (#1035)

* Updating Jenkinsfile and Makefile to open PR

* Adding needed token-id

* [Automation] Update elastic stack version to 8.5.0-fedc3e60 for testing (#1054)

Co-authored-by: apmmachine <[email protected]>

* Testing PR creation for 706 (#1049)

* Fix lookup issues with inputs.d fragment yml (#840)

* Fix lookup issues with inputs.d fragment yml

The Elastic Agent was looking next to the binary for the `inputs.d`
folder; instead it should look in the `Home` folder where
the Elastic Agent symlink is located.

Fixes: #663

* Changelog

* Fix input.d path, tie to the agent Config() directory

* Update CHANGELOG to reflect that the agent configuration directory is used to locate the inputs.d directory

Co-authored-by: Aleksandr Maus <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-b5001a6d for testing (#1064)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-1bd77fc1 for testing (#1082)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-167dfc80 for testing (#1091)

Co-authored-by: apmmachine <[email protected]>

* Adding support for v1.25.0 k8s (#1044)

* Adding support for v1.25.0 k8s

* [Automation] Update elastic stack version to 8.5.0-6b7dda2d for testing (#1101)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-4140365c for testing (#1114)

Co-authored-by: apmmachine <[email protected]>

* Remove experimental warning log in upgrade command (#1106)

* Update go.mod to Go 1.18, update notice. (#1120)

* Remove the fleet reporter (#1130)

* Remove the fleet reporter

Remove the fleet-reporter so that checkins no longer deliver the event
list.

* add CHANGELOG fix tests

* [Automation] Update elastic stack version to 8.5.0-589a4a10 for testing (#1147)

Co-authored-by: apmmachine <[email protected]>

* Avoid reporting `Unhealthy` on fleet connectivity issues (#1152)

Avoid reporting `Unhealthy` on fleet connectivity issues (#1152)

* ci: enable MacOS M1 stages (#1123)

* [Automation] Update go release version to 1.18.6 (#1143)

* [Automation] Update elastic stack version to 8.5.0-37418cf3 for testing (#1165)

Co-authored-by: apmmachine <[email protected]>

* Remove mage notice in favour of make notice (#1108)

The current implementation of mage notice is not working because it
was never finalised; the fact that both it and `make notice` exist only
generates confusion.

This commit removes the `mage notice` and documents that `make notice`
should be used instead for the time being.

In the long run we want to use the implementation on
`elastic-agent-libs`, however it is not working at the moment.

Closes #1107

Co-authored-by: Craig MacKenzie <[email protected]>

* ci: run e2e-testing at the end (#1169)

* ci: move macos to github actions (#1175)

* [Automation] Update elastic stack version to 8.5.0-fcf3d4c2 for testing (#1183)

Co-authored-by: apmmachine <[email protected]>

* Add support for hints' based autodiscovery in kubernetes provider (#698)

* ci: increase timeout (#1190)

* Fixing condition for PR creation (#1188)

* Fix leftover log level (#1194)

* [automation] Publish kubernetes templates for elastic-agent (#1192)

Co-authored-by: apmmachine <[email protected]>

* ci: force GO_VERSION (#1204)

* Fix whitespaces in vault_darwin.c (#1206)

* Update kubernetes templates for elastic-agent [templates.d] (#1231)

* Use at least warning level for all status logs (#1218)

* Update k8s manifests to leverage hints (#1202)

* Add Go 1.18 upgrade to breaking changes section. (#1216)

* Add Go 1.18 upgrade to breaking changes section.

* Fix the PR number in the changelog.

* [Release] add-backport-next (#1254)

* Bump version to 8.6.0. (#1259)

* [Automation] Update elastic stack version to 8.5.0-7dc445a0 for testing (#1248)

Co-authored-by: apmmachine <[email protected]>

* Fix: Endpoint collision between monitoring and regular beats  (#1034)

Fix: Endpoint collision between monitoring and regular beats  (#1034)

* internal/pkg/agent/cmd: don't format error message with nil errors (#1240)

The failure conditions allow a nil error to be formatted into the error
message. When formatting due to a non-accepted HTTP status code and a nil
error, omit the error.

Co-authored-by: Craig MacKenzie <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-21651da3 for testing (#1290)

Co-authored-by: apmmachine <[email protected]>

* Fixed: source uri reload for download/verify components (#1252)

Fixed: source uri reload for download/verify components (#1252)

* Expand status reporter/controller interfaces to allow local reporters (#1285)

* Expand status reporter/controller interfaces to allow local reporters

Add a local reporter map to the status controller. These reporters are
not used when updating status with fleet-server, they are only used to
gather local state information - specifically if the agent is degraded
because checkin with fleet-server has failed. This bypasses the bug that
was introduced with the liveness endpoint where the agent could checkin
(to fleet-server) with a degraded status because a previous checkin
failed. Local reporters are used to generate a separate status. This
status is used in the liveness endpoint.

* fix linter

* Improve logging for agent upgrades. (#1287)

* [Automation] Update elastic stack version to 8.6.0-326f84b0 for testing (#1318)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-df00693f for testing (#1334)

Co-authored-by: apmmachine <[email protected]>

* Add success log message after previous checkin failures (#1327)

* Fix status reporter initialization (#1341)

* [Automation] Update elastic stack version to 8.6.0-a2f4f140 for testing (#1362)

Co-authored-by: apmmachine <[email protected]>

* Added status message to CheckinRequest (#1369)

* Added status message to CheckinRequest

* added changelog

* updated test

* added omitempty

* Fix failures when using npipe monitoring endpoints (#1371)

* [Automation] Update elastic stack version to 8.6.0-158a13db for testing (#1379)

Co-authored-by: apmmachine <[email protected]>

* Mount /etc directory in Kubernetes DaemonSet manifests. (#1382)

Changes made to files like `/etc/passwd` using Linux tools like
`useradd` are not reflected in the mounted file on the Agent,
because the tool replaces the file instead of changing it
in-place.

Mounting the parent directory solves this problem.

* [Automation] Update elastic stack version to 8.6.0-aea1c645 for testing (#1405)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-0fca2953 for testing (#1412)

Co-authored-by: apmmachine <[email protected]>

* ci: 7.17 is not available for the daily run (#1417)

* [Automation] Update elastic stack version to 8.6.0-e4c15f15 for testing (#1425)

Co-authored-by: apmmachine <[email protected]>

* [backport main] Fix: Agent failed to upgrade from 8.4.2 to 8.5.0 BC1 for MAC 12 agent using agent binary. (#1401)

* Fix docker provider add_fields processors (#1420)

The Docker provider was using the wrong key when defining the
`add_fields` processor, which caused Filebeat not to start the input
and to stay in an unhealthy state.

This commit fixes it.

Fixes https://github.com/elastic/beats/issues/29030

* [Automation] Update elastic stack version to 8.6.0-d939cfde for testing (#1436)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-7c9f25a9 for testing (#1446)

Co-authored-by: apmmachine <[email protected]>

* Enable integration only when datastreams are not defined (#1456)

* Add not dedoted k8s pod labels in autodiscover provider to be used for templating, exactly like annotations (#1398)

* [Automation] Update elastic stack version to 8.6.0-c49fac70 for testing (#1464)

Co-authored-by: apmmachine <[email protected]>

* Add storageclass permissions in agent clusterrole (#1470)

* Add storageclass permissions in agent clusterrole

* Remote QA-labels automation (#1455)

* [Automation] Update go release version to 1.18.7 (#1444)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-5a8d757d for testing (#1480)

Co-authored-by: apmmachine <[email protected]>

* Improve logging around agent checkins. (#1477)

Improve logging around agent checkins.

- Log transient checkin errors at Info.
- Upgrade to an Error log after 2 repeated failures.
- Log the wait time for the next retry.
- Only update local state after repeated failures.

* [Automation] Update elastic stack version to 8.6.0-40086bc7 for testing (#1496)

Co-authored-by: apmmachine <[email protected]>

* Fixing makefile check (#1490)

* Fixing makefile check

* action: validate changelog fragment (#1488)

* Align managed with standalone role (#1500)

* Fix k8s template link versioning (#1504)

* Aligning manifests (#1507)

* Align managed with standalone role

* Fixing missing Label

* [Automation] Update elastic stack version to 8.6.0-233dc5d4 for testing (#1515)

Co-authored-by: apmmachine <[email protected]>

* Convert CHANGELOG.next to fragments (#1244)

* [Automation] Update elastic stack version to 8.6.0-54a302f0 for testing (#1531)

Co-authored-by: apmmachine <[email protected]>

* Update the linter configuration. (#1478)

Sync the configuration with the one used in Beats, which has disabled
the majority of the least useful linters already.

* Elastic agent counterpart of https://github.com/elastic/beats/pull/33362 (#1528)

Always use the stack_release label for npm i

No changelog necessary since there are no user-visible changes

This lets us ensure we've carefully reviewed and labeled the version of the @elastic/synthetics NPM library that's bundled in docker images

* [Automation] Update elastic stack version to 8.6.0-cae815eb for testing (#1545)

Co-authored-by: apmmachine <[email protected]>

* Fix admin permission check on localized windows (#1552)

* Fixes from merge of main.

* Update heartbeat specification to only support elasticsearch.

* Fix bad merge in dockerfile.

Signed-off-by: Florian Lehner <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Pier-Hugues Pellerin <[email protected]>
Co-authored-by: Denis Rechkunov <[email protected]>
Co-authored-by: Michel Laterman <[email protected]>
Co-authored-by: Aleksandr Maus <[email protected]>
Co-authored-by: Victor Martinez <[email protected]>
Co-authored-by: Manuel de la Peña <[email protected]>
Co-authored-by: Anderson Queiroz <[email protected]>
Co-authored-by: Daniel Araujo Almeida <[email protected]>
Co-authored-by: Mariana Dima <[email protected]>
Co-authored-by: ofiriro3 <[email protected]>
Co-authored-by: Julien Lind <[email protected]>
Co-authored-by: Craig MacKenzie <[email protected]>
Co-authored-by: Tiago Queiroz <[email protected]>
Co-authored-by: Pierre HILBERT <[email protected]>
Co-authored-by: Tetiana Kravchenko <[email protected]>
Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>
Co-authored-by: Michal Pristas <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
Co-authored-by: Florian Lehner <[email protected]>
Co-authored-by: Andrew Cholakian <[email protected]>
Co-authored-by: Yash Tewari <[email protected]>
Co-authored-by: Quentin Pradet <[email protected]>
Co-authored-by: Andrew Kroh <[email protected]>
Co-authored-by: Julien Mailleret <[email protected]>
Co-authored-by: Josh Dover <[email protected]>
Co-authored-by: Chris Mark <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Dan Kortschak <[email protected]>
Co-authored-by: Julia Bardi <[email protected]>
Co-authored-by: Edoardo Tenani <[email protected]>

* Add input name alias for cloudbeat integrations (#1596)

* add name alias for cloudbeat

* add anchors for yaml fields

* add EKS input

* Change the stater to include a local flag. (#1308)

* Change the stater to include a local flag.

Change the state reporter to use a local flag that determines if local
errors are included in the resulting state. Assume that configMgr errors
are all local - this mainly affects the fleet_gateway. Allow the gateway
to report an error if a checkin fails. When a checkin fails the local
state reported through the status command and liveness endpoint will
include the error, but checkins to fleet-server will not.

* Add ActionsError() method to config manager

Add a new ActionsError() method to the config managers. For the
non-managed instances it will return a nil channel. For the managed
instances it will return the dispatcher error queue directly. Have the
coordinator gather from this channel as it does for the others and
treat any errors as non-local.

* Fix linter

* Service runtime V2 (#1529)

* Service V2 runtime

* Implements service runtime component for V2.
* Extends endpoint spec with some additional attributes for service start/stop/status checks and creds discovery. The creds discovery logic is taken from V1, cleaned up and extracted into its own file, added utz.
* Implements service uninstall
* Refactors pkg/core/process/process.go adds additional options that are needed for the service_command implementation.
* Changes ComponentsModifier to access raw config, needed for the EndpointComponentModifier
* Injects host.id into configuration, needed for Endpoint
* Injects fleet and policy.revision configuration into the Endpoint input configuration
* Bumps the version to 8.6.0 to make it consistent with current beats V2 branch
* Addresses linter complaints on affected files

* Remove the service watcher, all the start/stopping logic

* Add changelog

* Fix typo

* Send STOPPING only upon teardown

* Wait for check-in with timeout before sending stopping on teardown

* Fix the service loop routine blocking on channel after stopped

* Addressed code review comments

* Make linter happy

* Try to fix make check-ci

* Spellcheck runtime README.md

* Remove .Stop timeout from the spec as it is no longer used

* Addressed code review feedback

* Sync components with state during container start (#1653)

* Sync components with state during container start

* path approach

* [V2] Enable support for shippers (#1527)

* Work on adding shipper support.

* Fix fmt.

* Fix reference to spec. Allow shipper to be null but still enabled if key exists.

* Move supported shippers into its own key in the input specification.

* Fix issue in merge.

* Implement fake shipper and add fake shipper output to the fake component.

* Add protoc to the test target.

* Don't generate fake shipper protocol in test.

* Commit fake GRPC into code.

* Add unit test for running with shipper, with sending event between running component and running shipper.

* Add docstring for shipper test.

* Add changelog fragment.

* Adjust paths for shipper to work on windows and better on unix.

* Update changelog/fragments/1667571017-Add-support-for-running-the-elastic-agent-shipper.yaml

Co-authored-by: Craig MacKenzie <[email protected]>

* Fix fake/component to connect over npipe on windows.

Co-authored-by: Craig MacKenzie <[email protected]>

* [v2] Merge main into feature-arch-v2 as of Nov 8 (#1694)

* [Automation] Update elastic stack version to 8.4.0-40cff009 for testing (#557)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-5e6770b1 for testing (#564)

Co-authored-by: apmmachine <[email protected]>

* Fix regression and use comma separated values (#560)

Fix regression from https://github.com/elastic/elastic-agent/pull/509

* Change in Jenkinsfile will trigger k8s run (#568)

* [Automation] Update elastic stack version to 8.4.0-da5a1c6d for testing (#573)

Co-authored-by: apmmachine <[email protected]>

* Add `@metadata.input_id` and `@metadata.stream_id` when injecting streams (#527)

These 2 values are going to be used in the shipper to identify where an
event came from in order to apply processors accordingly.

Also, added test cases for the processor to verify the change and updated test cases with the new processor.

* Add filemod times to contents of diagnostics collect command (#570)

* Add filemod times to contents of diagnostics collect command

Add filemod times to the files and directories in the zip archive.
Log files (and sub dirs) will use the modtime returned by the fileinfo
for the source. Others will use the timestamp from when the zip is
created.

* Fix linter

* [Automation] Update elastic stack version to 8.4.0-b13123ee for testing (#581)

Co-authored-by: apmmachine <[email protected]>

* Fix Agent upgrade 8.2->8.3 (#578)

* Fix Agent upgrade 8.2->8.3
* Improve the upgrade encryption handling. Add .yml files cleanup.
* Rollback ActionUpgrade to action_id, add MarkerActionUpgrade adapter struct for marker serialization compatibility

* Update containerd (#577)

* [Automation] Update elastic stack version to 8.4.0-4fe26f2a for testing (#591)

Co-authored-by: apmmachine <[email protected]>

* Set explicit ExitTimeOut for MacOS agent launchd plist (#594)

* Set explicit ExitTimeOut for MacOS agent launchd plist

* [Automation] Update elastic stack version to 8.4.0-2e32a640 for testing (#599)

Co-authored-by: apmmachine <[email protected]>

* ci: enable build notifications as GitHub issues (#595)

* status identifies failing component, fleet gateway may report degraded, liveness endpoint added (#569)

* Add liveness endpoint

Add /liveness route to metrics server. This route will report the status
from pkg/core/status. fleet-gateway will now report a degraded state if
a checkin fails. This may not propagate to fleet-server as a failed
checkin means communications between the agent and the server are not
working. It may also lead to the server reporting degraded for up to 30s
(fleet-server polling time) when the agent is able to successfully
connect.

* linter fix

* add nolint directive

* Linter fix

* Review feedback, add doc strings

* Rename noop controller file to _test file

* [Automation] Update elastic stack version to 8.4.0-722a7d79 for testing (#607)

Co-authored-by: apmmachine <[email protected]>

* ci: enable flaky test detector (#605)

* [Automation] Update elastic stack version to 8.4.0-210dd487 for testing (#620)

Co-authored-by: apmmachine <[email protected]>

* mergify: remove backport automation for non active branches (#615)

* chore: use elastic-agent profile to run the E2E tests (#610)

* [Automation] Update elastic stack version to 8.4.0-a6aa9f3b for testing (#631)

Co-authored-by: apmmachine <[email protected]>

* add macros pointing to new agent's repo and fix old macro calls (#458)

* Add mount of /etc/machine-id for managed Agent in k8s (#530)

* Set hostPID=true for managed agent in k8s (#528)

* Set hostPID=true for managed agent in k8s

* Add comment on hostPID.

* [Automation] Update elastic stack version to 8.4.0-86cc80f3 for testing (#648)

Co-authored-by: apmmachine <[email protected]>

* Update elastic-agent-libs version: includes restriction on default VerificationMode to `full` (#521)

* update version

* mage fmt update

* update dependency

* update changelog

* redact sensitive information in diagnostics collect command (#566)

* Support Cloudbeat regex input type  (#638)

* support input type with regex

* Update supported.go

* Changing the regex to be backward compatible

* Disable flaky test download test (#641)

* [Automation] Update elastic stack version to 8.4.0-3d206b5d for testing (#656)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-3ad82aa8 for testing (#661)

Co-authored-by: apmmachine <[email protected]>

* jjbb: exclude allowed branches, tags and PRs (#658)

cosmetic change in the description and boolean based

* Update elastic-agent-project-board.yml (#649)

* ci: fix labels that clash with the Orka workers (#659)

* [Automation] Update elastic stack version to 8.4.0-03bd6f3f for testing (#668)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-533f1e30 for testing (#675)

Co-authored-by: apmmachine <[email protected]>

* Osquerybeat: Fix osquerybeat is not running with logstash output (#674)

* [Automation] Update elastic stack version to 8.4.0-d0a4da44 for testing (#684)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-dd98ded4 for testing (#703)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-164d9a10 for testing (#705)

Co-authored-by: apmmachine <[email protected]>

* Add missing license headers (#711)

* [Automation] Update elastic stack version to 8.4.0-00048b66 for testing (#713)

Co-authored-by: apmmachine <[email protected]>

* Allow - in eql variable names (#710)

* fix to allow dashes in variable names in EQL expressions

Extend eql to allow the '-' char to appear in variable names, e.g.,
${data.some-var}, and add test cases to eql, the transpiler, and
the k8s provider to verify this works. Note that the bug was caused by
the EQL limitation; the other test cases were added when attempting to
find it.

* Regenerate grammar with antlr 4.7.1, add CHANGELOG

* Fix linter issue

* Fix typo

* Fix transpiler to allow : in dynamic variables. (#680)

Fix transpiler regex to allow ':' characters in dynamic variables so
that users can input "${dynamic.lookup|'fallback.here'}".

Co-authored-by: Aleksan…
blakerouse added a commit that referenced this pull request Nov 28, 2022
* [v2] Add v2 component specification and validation. (#502)

* Add v2 component specification and validation.

* Remove i386 and ppc64el. Update spec for osquerybeat.

* Remove windows/arm64.

* Add component spec command to validate component specifications. (#510)

* [v2] Calculate the expected runtime components from policy (#550)

* Upgrade elastic-agent-client.

* Calculate the expected running components and units from the v2 specification and the current policy.

* Update NOTICE.txt.

* Fix lint from serviceable main.go.

* Update GRPC for the agent CLI control protocol. Fix name collision issue.

* Run go mod tidy.

* Fix more lint issues.

* Fix fmt.

* Update logic to always compute model, with err set on each component. Check runtime preventions at model generation time.

* Fix items from code review, and issue on windows test runner.

* Try to cleanup duplication in tests.

* Try 2 of fixing duplicate lint failure, that is not really a duplicate.

* Re-run mage fmt.

* Lint fixes for linux, why different?

* Fix nolint comment.

* Add comment.

* Initial Flat Structure (#544)

Flattening the structure and removing download/install steps for programs. 

Co-authored-by: Aleksandr Maus <[email protected]>

* Generate checksum file for components (#604)

* generating checksum?

* yaml output

* Update dev-tools/mage/common.go

Co-authored-by: Michel Laterman <[email protected]>

* review

* ioutil removal from magefile

Co-authored-by: Michel Laterman <[email protected]>

* V2 Runtime Component Manager (#645)

* Add runtime for command v2 components.

* Fix imports.

* Add tests for watching checkins.

* Fix lint and move checkin period to a configurable timeout.

* Fix tests now that checkin timeout needs to be defined.

* Fix code review and lint.

* [v2] Use the v2 components runtime as the core of the Elastic Agent (#753)

* Add runtime for command v2 components.

* Fix imports.

* Add tests for watching checkins.

* Fix lint and move checkin period to a configurable timeout.

* Fix tests now that checkin timeout needs to be defined.

* Fix code review and lint.

* Work on actually running the v2 runtime.

* Work on switching to the v2 runtime.

* More work on switching to v2 runtime.

* Cleanup some imports.

* More import cleanups.

* Add TODO to FleetServerComponentModifier.

* Remove outdated managed_mode_test.go.

* Fixes from code review and lint.

* [v2] Delete unused code from refactor (#777)

* Add runtime for command v2 components.

* Fix imports.

* Add tests for watching checkins.

* Fix lint and move checkin period to a configurable timeout.

* Fix tests now that checkin timeout needs to be defined.

* Fix code review and lint.

* Work on actually running the v2 runtime.

* Work on switching to the v2 runtime.

* More work on switching to v2 runtime.

* Cleanup some imports.

* More import cleanups.

* Add TODO to FleetServerComponentModifier.

* More cleanup and removals.

* Remove more.

* Delete more unused code.

* Clean up step_download from refactor.

* Remove outdated managed_mode_test.go.

* Fixes from code review and lint.

* Fix lint and missing errcheck.

* [v2] Delete more unused code from v2 transition (#790)

* Remove more unused code that was including already deleted code.

* Fix all unit tests.

* Fix lint.

* More lint fixes, maybe this time?

* More lint.... really?

* Update NOTICE.txt.

* [v2] Merge July 27th main into v2 feature branch (#789)

* [Automation] Update elastic stack version to 8.4.0-40cff009 for testing (#557)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-5e6770b1 for testing (#564)

Co-authored-by: apmmachine <[email protected]>

* Fix regression and use comma separated values (#560)

Fix regression from https://github.com/elastic/elastic-agent/pull/509

* Change in Jenkinsfile will trigger k8s run (#568)

* [Automation] Update elastic stack version to 8.4.0-da5a1c6d for testing (#573)

Co-authored-by: apmmachine <[email protected]>

* Add `@metadata.input_id` and `@metadata.stream_id` when injecting streams (#527)

These 2 values are going to be used in the shipper to identify where an
event came from in order to apply processors accordingly.

Also, added test cases for the processor to verify the change and updated test cases with the new processor.

* Add filemod times to contents of diagnostics collect command (#570)

* Add filemod times to contents of diagnostics collect command

Add filemod times to the files and directories in the zip archive.
Log files (and sub dirs) will use the modtime returned by the fileinfo
for the source. Others will use the timestamp from when the zip is
created.

* Fix linter

* [Automation] Update elastic stack version to 8.4.0-b13123ee for testing (#581)

Co-authored-by: apmmachine <[email protected]>

* Fix Agent upgrade 8.2->8.3 (#578)

* Fix Agent upgrade 8.2->8.3
* Improve the upgrade encryption handling. Add .yml files cleanup.
* Rollback ActionUpgrade to action_id, add MarkerActionUpgrade adapter struct for marker serialization compatibility

* Update containerd (#577)

* [Automation] Update elastic stack version to 8.4.0-4fe26f2a for testing (#591)

Co-authored-by: apmmachine <[email protected]>

* Set explicit ExitTimeOut for MacOS agent launchd plist (#594)

* Set explicit ExitTimeOut for MacOS agent launchd plist

* [Automation] Update elastic stack version to 8.4.0-2e32a640 for testing (#599)

Co-authored-by: apmmachine <[email protected]>

* ci: enable build notifications as GitHub issues (#595)

* status identifies failing component, fleet gateway may report degraded, liveness endpoint added (#569)

* Add liveness endpoint

Add /liveness route to metrics server. This route will report the status
from pkg/core/status. fleet-gateway will now report a degraded state if
a checkin fails. This may not propagate to fleet-server as a failed
checkin means communications between the agent and the server are not
working. It may also lead to the server reporting degraded for up to 30s
(fleet-server polling time) when the agent is able to successfully
connect.

* linter fix

* add nolint directive

* Linter fix

* Review feedback, add doc strings

* Rename noop controller file to _test file

* [Automation] Update elastic stack version to 8.4.0-722a7d79 for testing (#607)

Co-authored-by: apmmachine <[email protected]>

* ci: enable flaky test detector (#605)

* [Automation] Update elastic stack version to 8.4.0-210dd487 for testing (#620)

Co-authored-by: apmmachine <[email protected]>

* mergify: remove backport automation for non active branches (#615)

* chore: use elastic-agent profile to run the E2E tests (#610)

* [Automation] Update elastic stack version to 8.4.0-a6aa9f3b for testing (#631)

Co-authored-by: apmmachine <[email protected]>

* add macros pointing to new agent's repo and fix old macro calls (#458)

* Add mount of /etc/machine-id for managed Agent in k8s (#530)

* Set hostPID=true for managed agent in k8s (#528)

* Set hostPID=true for managed agent in k8s

* Add comment on hostPID.

* [Automation] Update elastic stack version to 8.4.0-86cc80f3 for testing (#648)

Co-authored-by: apmmachine <[email protected]>

* Update elastic-agent-libs version: includes restriction on default VerificationMode to `full` (#521)

* update version

* mage fmt update

* update dependency

* update changelog

* redact sensitive information in diagnostics collect command (#566)

* Support Cloudbeat regex input type  (#638)

* support input type with regex

* Update supported.go

* Changing the regex to be backward compatible

* Disable flaky test download test (#641)

* [Automation] Update elastic stack version to 8.4.0-3d206b5d for testing (#656)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-3ad82aa8 for testing (#661)

Co-authored-by: apmmachine <[email protected]>

* jjbb: exclude allowed branches, tags and PRs (#658)

cosmetic change in the description and boolean based

* Update elastic-agent-project-board.yml (#649)

* ci: fix labels that clash with the Orka workers (#659)

* [Automation] Update elastic stack version to 8.4.0-03bd6f3f for testing (#668)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-533f1e30 for testing (#675)

Co-authored-by: apmmachine <[email protected]>

* Osquerybeat: Fix osquerybeat is not running with logstash output (#674)

* [Automation] Update elastic stack version to 8.4.0-d0a4da44 for testing (#684)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-dd98ded4 for testing (#703)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-164d9a10 for testing (#705)

Co-authored-by: apmmachine <[email protected]>

* Add missing license headers (#711)

* [Automation] Update elastic stack version to 8.4.0-00048b66 for testing (#713)

Co-authored-by: apmmachine <[email protected]>

* Allow - in eql variable names (#710)

* fix to allow dashes in variable names in EQL expressions

Extend eql to allow the '-' char to appear in variable names, e.g.,
${data.some-var}, and add test cases to eql, the transpiler, and
the k8s provider to verify this works. Note that the bug was caused by
the EQL limitation; the other test cases were added when attempting to
find it.
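
To show what accepting '-' in a variable name means in practice, here is a small sketch that uses a regular expression instead of the ANTLR grammar the change actually regenerates. The pattern and the helper `variableNames` are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"regexp"
)

// varPattern is a hypothetical pattern for ${...} references. The '-' at the
// end of the character class is what lets names such as data.some-var match.
var varPattern = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_.-]*)\}`)

// variableNames returns every variable name referenced in s.
func variableNames(s string) []string {
	var names []string
	for _, m := range varPattern.FindAllStringSubmatch(s, -1) {
		names = append(names, m[1])
	}
	return names
}

func main() {
	fmt.Println(variableNames("${data.some-var} and ${host.name}"))
}
```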

* Regenerate grammar with antlr 4.7.1, add CHANGELOG

* Fix linter issue

* Fix typo

* Fix transpiler to allow : in dynamic variables. (#680)

Fix transpiler regex to allow ':' characters in dynamic variables so
that users can input "${dynamic.lookup|'fallback.here'}".

Co-authored-by: Aleksandr Maus <[email protected]>

* Fix for the filebeat spec file picking up packetbeat inputs (#700)

* Reproduce filebeat picking up packetbeat inputs

* Filebeat: filter inputs as first input transform.

Move input filtering to be the first input transformation that occurs in
the filebeat spec file. Fixes
https://github.com/elastic/elastic-agent/issues/427.

* Update changelog.

* [Automation] Update elastic stack version to 8.4.0-3cd57abb for testing (#724)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-a324b98b for testing (#727)

Co-authored-by: apmmachine <[email protected]>

* ci: run on MacOS12 (#696)

* [Automation] Update elastic stack version to 8.4.0-31315ca3 for testing (#732)

Co-authored-by: apmmachine <[email protected]>

* fix typo on package command (#734)

This commit fixes the typo in the package command on the README.md.

* Allow / to be used in variable names (#718)

* Allow the / character to be used in variable names.

Allow / to be used in variable names from dynamic providers and eql
expressions. Ensure that k8s providers can provide variables with
slashes in their names.

* run antlr4

* Fix tests

* Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases (#701)

* Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases

* Migrates vault directory on linux and windows to the top directory of the
  agent, so it can be shared without needing the upgrade handler call,
  like for example with side-by-side install/upgrade from .rpm/.deb
* Extended vault to allow read-only open, useful when the vault at particular location needs to be only read not created.

* Correct the typo in the log messages

* Update lint flagged function comment with 'unused', was flagged with 'deadcode' on the previous run

* Address code review feedback

* Add missing import for linux utz

* Change vault path from Top() to Config(), this is a better location, next to fleet.enc based on the install/upgrade testing with .rpm/.deb installs

* Fix the missing state migration for .rpm/.deb upgrade. The post install script now performs the migration and creates the symlink after that.

* Fix typo in the postinstall script

* Update the vault migration code, add the agent configuration match check with the agent secret

* [Automation] Update elastic stack version to 8.4.0-31269fd2 for testing (#746)

Co-authored-by: apmmachine <[email protected]>

* wrap errors and fix some docs typo and convention (#743)

* automate the ironbank docker context generation (#679)

* Update README.md

Adding M1 variable to export to be able to build AMD images

* fix flaky (#730)

* Add filestream ID on standalone kubernetes manifest (#742)

This commit adds unique IDs for the filestream inputs used by the
Kubernetes integration in the Elastic-Agent standalone
Kubernetes configuration/manifest file.

* Alter github action to run on different OSs (#769)

Alter the linter action to run on different OSs instead of on linux with
the $GOOS env var.

* [Automation] Update elastic stack version to 8.4.0-d058e92f for testing (#771)

Co-authored-by: apmmachine <[email protected]>

* elastic-agent manifests: add comments; add cloudnative team as a codeowner for the k8s manifests (#708)

* managed elastic-agent: add comments; add cloudnative team as a codeowner for the k8s manifests

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add comments to the standalone elastic-agent, similar to the documentation we have https://www.elastic.co/guide/en/fleet/current/running-on-kubernetes-standalone.html

Signed-off-by: Tetiana Kravchenko <[email protected]>

* Apply suggestions from code review

Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>

* remove comment for FLEET_ENROLLMENT_TOKEN; use Needed everywhere instead of Required

Signed-off-by: Tetiana Kravchenko <[email protected]>

* rephrase regarding accessing kube-state-metrics when using third-party tools, like kube-rbac-proxy

Signed-off-by: Tetiana Kravchenko <[email protected]>

* run make check

Signed-off-by: Tetiana Kravchenko <[email protected]>

* keep manifests in sync to pass ci check

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add info on where to find FLEET_URL and FLEET_ENROLLMENT_TOKEN

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add links to elastic-agent documentation

Signed-off-by: Tetiana Kravchenko <[email protected]>

* update comment on FLEET_ENROLLMENT_TOKEN

Signed-off-by: Tetiana Kravchenko <[email protected]>

Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>

* [Elastic-Agent] Added source uri reloading (#686)

* Update will cleanup unneeded artifacts. (#752)

* Update will cleanup unneeded artifacts.

The update process will cleanup unneeded artifacts. When an update
starts, all artifacts that do not have the current version number in their
name will be removed. If artifact retrieval fails, downloaded artifacts
are removed. On a successful upgrade, all contents of the downloads dir
will be removed.
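
The selection rule described above (remove artifacts whose name does not contain the current version) can be sketched as a pure function. The helper name `staleArtifacts` and the file names are hypothetical; the real cleanup also deletes the files, which this sketch leaves out.

```go
package main

import (
	"fmt"
	"strings"
)

// staleArtifacts is a hypothetical helper that returns the artifact names
// that do not contain the current version string and are therefore
// candidates for removal when an update starts.
func staleArtifacts(names []string, currentVersion string) []string {
	var stale []string
	for _, name := range names {
		if !strings.Contains(name, currentVersion) {
			stale = append(stale, name)
		}
	}
	return stale
}

func main() {
	// Example file names are made up for illustration.
	downloads := []string{
		"filebeat-8.3.0-linux-x86_64.tar.gz",
		"filebeat-8.4.0-linux-x86_64.tar.gz",
	}
	for _, name := range staleArtifacts(downloads, "8.4.0") {
		fmt.Println("would remove:", name)
	}
}
```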

* Clean up linter warnings

* Wrap errors

* cleanup tests

* Fix passed version

* Use os.RemoveAll

* ci: propagate e2e-testing errors (#695)

* [Release] add-backport-next (#784)

* Update go.sum.

* Fix upgrade.

* Fix the upgrade artifact reload.

* Fix lint in coordinator.

Co-authored-by: apmmachine <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Pier-Hugues Pellerin <[email protected]>
Co-authored-by: Denis Rechkunov <[email protected]>
Co-authored-by: Michel Laterman <[email protected]>
Co-authored-by: Aleksandr Maus <[email protected]>
Co-authored-by: Victor Martinez <[email protected]>
Co-authored-by: Manuel de la Peña <[email protected]>
Co-authored-by: Anderson Queiroz <[email protected]>
Co-authored-by: Daniel Araujo Almeida <[email protected]>
Co-authored-by: Mariana Dima <[email protected]>
Co-authored-by: ofiriro3 <[email protected]>
Co-authored-by: Julien Lind <[email protected]>
Co-authored-by: Craig MacKenzie <[email protected]>
Co-authored-by: Tiago Queiroz <[email protected]>
Co-authored-by: Pierre HILBERT <[email protected]>
Co-authored-by: Tetiana Kravchenko <[email protected]>
Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>
Co-authored-by: Michal Pristas <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>

* [v2] Fix inspect command (#805)

* Write the inspect command for v2.

* Fix lint.

* Fix code review. Load inputs from inputs.d for inspect.

* Fix lint.

* Refactor to use errgroup.

* Remove unused struct.

* Expand check-in payload for V2 (#916)

* Expand check-in payload for V2

* Make linter happy

* [v2] Update protocol to use new UnitExpectedConfig. (#850)

* Update v2 protocol to use new UnitExpectedConfig.

* Cleanup.

* Update NOTICE.txt. Lint dupl.

* Fix code review. Ensure type is set to real type and not alias.

* Fix action dispatching that was using ActionType instead of InputType as before (#973)

* Fix bootstrapping a Fleet Server with v2. (#1010)

* Fix bootstrapping a Fleet Server with v2.

* Fix lint.

* Fix tests.

* Query just related files on build (#1045)

* Update main to 8.5.0 (#793) (#1050)

(cherry picked from commit 317e03116aa919d69be97242207ad11a28c826aa)

Co-authored-by: Pier-Hugues Pellerin <[email protected]>

* Create archive directory if it doesn't exist. (#1058)

On an M1 Mac rename seems to fail if the containing directories do not
already exist.
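
A minimal sketch of the idea, assuming the fix amounts to creating the destination's parent directories before renaming; `renameInto` and the paths in `demo` are hypothetical and only for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// renameInto moves src to dst, creating dst's parent directories first so
// that os.Rename does not fail on a missing directory.
func renameInto(src, dst string) error {
	if err := os.MkdirAll(filepath.Dir(dst), 0o750); err != nil {
		return fmt.Errorf("creating parent of %q: %w", dst, err)
	}
	if err := os.Rename(src, dst); err != nil {
		return fmt.Errorf("renaming %q to %q: %w", src, dst, err)
	}
	return nil
}

// demo exercises renameInto in a temporary directory and reports whether
// the file ended up at the destination.
func demo() (bool, error) {
	root, err := os.MkdirTemp("", "archive-demo")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(root)

	src := filepath.Join(root, "state.yml")
	if err := os.WriteFile(src, []byte("a: b\n"), 0o600); err != nil {
		return false, err
	}
	dst := filepath.Join(root, "archive", "2022", "state.yml")
	if err := renameInto(src, dst); err != nil {
		return false, err
	}
	_, err = os.Stat(dst)
	return err == nil, err
}

func main() {
	ok, err := demo()
	fmt.Println(ok, err)
}
```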

* fixed docker build (#1105)

* V2 command work dir (#1061)

* Fix v2 work directory for command. Add permission check for execution. Add determining root into runtime prevention.

* Add writeable by group and other in check.

* Fix restart and stopping issues in command runtime for failing binaries.

* Fix issue in endpoint spec. Allow an input to not require an ID, but that ID must be unique.

* Remove unused transpiler rules and steps.

* Fix test.

* Fix workDir for windows.

* Reset to checkin period.

* Fix test and code review issues.

* Add extra log message in unit test.

* More fixes from code review.

* Fix test.

* [v2] Move queue management to dispatcher (#1109)

* Move queue management to dispatcher

Move queue management actions to the dispatcher from the fleet-server
in order to help with future work to add a retry mechanism. Add a
PersistedQueue type which wraps the ActionQueue to make persisting the
queue simpler for the consumer.

* Refactor ActionQueue

Refactor ActionQueue to only export methods that are used by consumers.
The priority queue implementation has been changed to an unexported
type. Persistence has been added and the persistedqueue type has been
removed.
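
The shape of such a refactor might look like the sketch below: an unexported heap type wrapped by an exported queue that only exposes what a consumer needs, plus a persistence hook. All names here (`action`, `queue`, `DequeueDue`, the `save` callback) are hypothetical, and treating the priority as a Unix timestamp is an assumption; the real ActionQueue API may differ.

```go
package main

import (
	"container/heap"
	"fmt"
)

// action is a hypothetical scheduled action; Priority is assumed to be a
// Unix timestamp, so lower values are dequeued first.
type action struct {
	ID       string
	Priority int64
}

// queue is the unexported heap implementation.
type queue []action

func (q queue) Len() int            { return len(q) }
func (q queue) Less(i, j int) bool  { return q[i].Priority < q[j].Priority }
func (q queue) Swap(i, j int)       { q[i], q[j] = q[j], q[i] }
func (q *queue) Push(x interface{}) { *q = append(*q, x.(action)) }
func (q *queue) Pop() interface{} {
	old := *q
	last := old[len(old)-1]
	*q = old[:len(old)-1]
	return last
}

// ActionQueue exposes only the operations a consumer needs.
type ActionQueue struct {
	q    queue
	save func([]action) error // persistence hook supplied by the caller
}

// NewActionQueue builds an empty queue that persists through save.
func NewActionQueue(save func([]action) error) *ActionQueue {
	return &ActionQueue{save: save}
}

// Add schedules an action.
func (a *ActionQueue) Add(act action) { heap.Push(&a.q, act) }

// DequeueDue pops, in priority order, every action with Priority <= now.
func (a *ActionQueue) DequeueDue(now int64) []action {
	var due []action
	for a.q.Len() > 0 && a.q[0].Priority <= now {
		due = append(due, heap.Pop(&a.q).(action))
	}
	return due
}

// Save hands a copy of the pending actions to the persistence hook.
func (a *ActionQueue) Save() error {
	return a.save(append([]action(nil), a.q...))
}

func main() {
	var saved []action
	aq := NewActionQueue(func(pending []action) error {
		saved = pending
		return nil
	})
	aq.Add(action{ID: "upgrade", Priority: 300})
	aq.Add(action{ID: "settings", Priority: 100})
	fmt.Println(aq.DequeueDue(200))
	if err := aq.Save(); err != nil {
		fmt.Println("failed to save queue:", err)
	}
	fmt.Println(saved)
}
```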

* Rename persistedQueue interface to priorityQueue

* Review feedback

* failing to save queue will log message

* Change gateway to use copy

* Fix [V2]: Elastic Agent Install is broken. (#1331)

* Fix agent shutdown on SIGINT (#1258)

* Fix agent shutdown on SIGINT

* Update runtime_comm expected check-in handling to eliminate the lock in failure cases

* Remove some buffered channels that are no longer blocking shutdown after the runtime comms fix commit

* Fix the recursive lock on itself in the runtime loop, refactored code to make it cleaner

* Fix the comment typo

* Fixed managed_mode coordination with fleet gateway. Now the gateway errors reading loop waits until the gateway exits; otherwise an out-of-sequence gateway shutdown blocks on errCh

* Fix linter

* Fix make check-ci

* Fix runner Err() possible race

* Update the runner DoneWithTimeout implementation

* Address code review comments

* [v2] Re-enable diagnostics for Elastic Agent and all components (#1140)

* Add diagnostics back to v2.

* Update pkg/component/runtime/manager.go

Co-authored-by: Anderson Queiroz <[email protected]>

Co-authored-by: Anderson Queiroz <[email protected]>

* Check and create downloads dir before using (#1410)

* [v2] Add upgrade action retry (#1219)

* Add upgrade action retry

Add the ability for the agent to schedule and retry upgrade actions.

The fleetapi actions now define ScheduledAction and RetryableAction interfaces to eliminate the need for stub methods on all different action types. Action queue has been changed to function on scheduled actions. Serialization tests now ensure that the retry attribute needed by retryable actions works.

Decouple dispatcher from gateway: the dispatcher has an errors channel that will return an error for the list of actions that is sent. Gateway has an Actions method that can be used to get the list of actions from the gateway. The managed_mode config manager will link these two components.

If a handler returns an error and the action is a RetryableAction, the dispatcher will attempt to schedule a retry. The dispatcher will also ack the action to fleet-server and indicate if it will be retried or has failed (or has been received normally).
For the acker, if a RetryableAction has an error and an attempt count that is greater than 0 it will be acked as retried. If it has an error and an attempt count less than 1 it will be acked as failed.
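
The acknowledgement rule above can be read as a small decision function. The sketch below is an illustration under assumptions: `classifyAck` and the status names are hypothetical and do not reflect the actual acker types.

```go
package main

import (
	"errors"
	"fmt"
)

// ackStatus is a hypothetical label for how an action is acknowledged.
type ackStatus string

const (
	ackReceived ackStatus = "received"
	ackRetried  ackStatus = "retried"
	ackFailed   ackStatus = "failed"
)

// errExample is a sample handler error used in the demo.
var errExample = errors.New("download failed")

// classifyAck maps a handler error and the retry attempt count to a status:
// no error is a normal ack, an error with attempt > 0 is acked as retried,
// and an error with attempt < 1 is acked as failed.
func classifyAck(err error, attempt int) ackStatus {
	switch {
	case err == nil:
		return ackReceived
	case attempt > 0:
		return ackRetried
	default:
		return ackFailed
	}
}

func main() {
	fmt.Println(classifyAck(nil, 0))
	fmt.Println(classifyAck(errExample, 2))
	fmt.Println(classifyAck(errExample, 0))
}
```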

Co-authored-by: Blake Rouse <[email protected]>

* V1 metrics monitoring for V2 (#1487)

V1 metrics monitoring for V2 (#1487)

* [v2] Merge main on Oct. 18 (#1557)

* [Automation] Update elastic stack version to 8.4.0-40cff009 for testing (#557)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-5e6770b1 for testing (#564)

Co-authored-by: apmmachine <[email protected]>

* Fix regression and use comma separated values (#560)

Fix regression from https://github.com/elastic/elastic-agent/pull/509

* Change in Jenkinsfile will trigger k8s run (#568)

* [Automation] Update elastic stack version to 8.4.0-da5a1c6d for testing (#573)

Co-authored-by: apmmachine <[email protected]>

* Add `@metadata.input_id` and `@metadata.stream_id` when injecting streams (#527)

These 2 values are going to be used in the shipper to identify where an
event came from in order to apply processors accordingly.

Also, added test cases for the processor to verify the change and updated test cases with the new processor.

* Add filemod times to contents of diagnostics collect command (#570)

* Add filemod times to contents of diagnostics collect command

Add filemod times to the files and directories in the zip archive.
Log files (and sub dirs) will use the modtime returned by the fileinfo
for the source. Others will use the timestamp from when the zip is
created.
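
With Go's standard `archive/zip` package, per-entry modification times can be set through `zip.FileHeader.Modified`. The sketch below is only an illustration of that mechanism: the entry names, `addEntry`, `buildArchive`, and the sample timestamps are assumptions, not the diagnostics command's real layout.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"time"
)

// Sample timestamps for the demo (whole seconds, since zip stores seconds).
var (
	archiveCreated = time.Date(2022, time.July, 26, 12, 0, 0, 0, time.UTC)
	logModified    = time.Date(2022, time.July, 20, 8, 30, 0, 0, time.UTC)
)

// addEntry writes one file into the archive with an explicit modtime.
func addEntry(zw *zip.Writer, name string, modTime time.Time, data []byte) error {
	hdr := &zip.FileHeader{Name: name, Method: zip.Deflate, Modified: modTime}
	w, err := zw.CreateHeader(hdr)
	if err != nil {
		return err
	}
	_, err = w.Write(data)
	return err
}

// buildArchive stamps the log entry with the source file's modtime and the
// other entry with the archive creation time.
func buildArchive(created, logModTime time.Time) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	if err := addEntry(zw, "logs/agent.ndjson", logModTime, []byte("{}\n")); err != nil {
		return nil, err
	}
	if err := addEntry(zw, "config/state.yaml", created, []byte("a: b\n")); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readModTimes returns the modtime recorded for each entry.
func readModTimes(archive []byte) (map[string]time.Time, error) {
	zr, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return nil, err
	}
	out := make(map[string]time.Time, len(zr.File))
	for _, f := range zr.File {
		out[f.Name] = f.Modified
	}
	return out, nil
}

func main() {
	data, err := buildArchive(archiveCreated, logModified)
	if err != nil {
		panic(err)
	}
	times, err := readModTimes(data)
	if err != nil {
		panic(err)
	}
	for name, t := range times {
		fmt.Println(name, t.UTC())
	}
}
```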

* Fix linter

* [Automation] Update elastic stack version to 8.4.0-b13123ee for testing (#581)

Co-authored-by: apmmachine <[email protected]>

* Fix Agent upgrade 8.2->8.3 (#578)

* Fix Agent upgrade 8.2->8.3
* Improve the upgrade encryption handling. Add .yml files cleanup.
* Rollback ActionUpgrade to action_id, add MarkerActionUpgrade adapter struct for marker serialization compatibility

* Update containerd (#577)

* [Automation] Update elastic stack version to 8.4.0-4fe26f2a for testing (#591)

Co-authored-by: apmmachine <[email protected]>

* Set explicit ExitTimeOut for MacOS agent launchd plist (#594)

* Set explicit ExitTimeOut for MacOS agent launchd plist

* [Automation] Update elastic stack version to 8.4.0-2e32a640 for testing (#599)

Co-authored-by: apmmachine <[email protected]>

* ci: enable build notifications as GitHub issues (#595)

* status identifies failing component, fleet gateway may report degraded, liveness endpoint added (#569)

* Add liveness endpoint

Add /liveness route to metrics server. This route will report the status
from pkg/core/status. fleet-gateway will now report a degraded state if
a checkin fails. This may not propagate to fleet-server as a failed
checkin means communications between the agent and the server are not
working. It may also lead to the server reporting degraded for up to 30s
(fleet-server polling time) when the agent is able to successfully
connect.

* linter fix

* add nolint directive

* Linter fix

* Review feedback, add doc strings

* Rename noop controller file to _test file

* [Automation] Update elastic stack version to 8.4.0-722a7d79 for testing (#607)

Co-authored-by: apmmachine <[email protected]>

* ci: enable flaky test detector (#605)

* [Automation] Update elastic stack version to 8.4.0-210dd487 for testing (#620)

Co-authored-by: apmmachine <[email protected]>

* mergify: remove backport automation for non active branches (#615)

* chore: use elastic-agent profile to run the E2E tests (#610)

* [Automation] Update elastic stack version to 8.4.0-a6aa9f3b for testing (#631)

Co-authored-by: apmmachine <[email protected]>

* add macros pointing to new agent's repo and fix old macro calls (#458)

* Add mount of /etc/machine-id for managed Agent in k8s (#530)

* Set hostPID=true for managed agent in k8s (#528)

* Set hostPID=true for managed agent in k8s

* Add comment on hostPID.

* [Automation] Update elastic stack version to 8.4.0-86cc80f3 for testing (#648)

Co-authored-by: apmmachine <[email protected]>

* Update elastic-agent-libs version: includes restriction on default VerificationMode to `full` (#521)

* update version

* mage fmt update

* update dependency

* update changelog

* redact sensitive information in diagnostics collect command (#566)

* Support Cloudbeat regex input type  (#638)

* support input type with regex

* Update supported.go

* Changing the regex to support backward compatible

* Disable flaky test download test (#641)

* [Automation] Update elastic stack version to 8.4.0-3d206b5d for testing (#656)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-3ad82aa8 for testing (#661)

Co-authored-by: apmmachine <[email protected]>

* jjbb: exclude allowed branches, tags and PRs (#658)

cosmetic change in the description and boolean based

* Update elastic-agent-project-board.yml (#649)

* ci: fix labels that clashes with the Orka workers (#659)

* [Automation] Update elastic stack version to 8.4.0-03bd6f3f for testing (#668)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-533f1e30 for testing (#675)

Co-authored-by: apmmachine <[email protected]>

* Osquerybeat: Fix osquerybeat is not running with logstash output (#674)

* [Automation] Update elastic stack version to 8.4.0-d0a4da44 for testing (#684)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-dd98ded4 for testing (#703)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-164d9a10 for testing (#705)

Co-authored-by: apmmachine <[email protected]>

* Add missing license headers (#711)

* [Automation] Update elastic stack version to 8.4.0-00048b66 for testing (#713)

Co-authored-by: apmmachine <[email protected]>

* Allow - in eql variable names (#710)

* fix to allow dashes in variable names in EQL expressions

Extend eql to allow the '-' char to appear in variable names, i.e.,
${data.some-var}, and add test cases to eql, the transpiler, and
the k8s provider to verify this works. Note that the bug was caused by
the EQL limitation; the other test cases were added when attempting to
find it.

* Regenerate grammar with antlr 4.7.1, add CHANGELOG

* Fix linter issue

* Fix typo

* Fix transpiler to allow : in dynamic variables. (#680)

Fix transpiler regex to allow ':' characters in dynamic variables so
that users can input "${dynamic.lookup|'fallback.here'}".

Co-authored-by: Aleksandr Maus <[email protected]>

* Fix for the filebeat spec file picking up packetbeat inputs (#700)

* Reproduce filebeat picking up packetbeat inputs

* Filebeat: filter inputs as first input transform.

Move input filtering to be the first input transformation that occurs in
the filebeat spec file. Fixes
https://github.com/elastic/elastic-agent/issues/427.

* Update changelog.

* [Automation] Update elastic stack version to 8.4.0-3cd57abb for testing (#724)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-a324b98b for testing (#727)

Co-authored-by: apmmachine <[email protected]>

* ci: run on MacOS12 (#696)

* [Automation] Update elastic stack version to 8.4.0-31315ca3 for testing (#732)

Co-authored-by: apmmachine <[email protected]>

* fix typo on package command (#734)

This commit fixes the typo in the package command on the README.md.

* Allow / to be used in variable names (#718)

* Allow the / character to be used in variable names.

Allow / to be used in variable names from dynamic providers and eql
expressions. Ensure that k8s providers can provide variables with
slashes in their names.

* run antlr4

* Fix tests

* Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases (#701)

* Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases

* Migrates vault directory on linux and windows to the top directory of the
  agent, so it can be shared without needing the upgrade handler call,
  like for example with side-by-side install/upgrade from .rpm/.deb
* Extended vault to allow read-only open, useful when the vault at particular location needs to be only read not created.

* Correct the typo in the log messages

* Update lint flagged function comment with 'unused', was flagged with 'deadcode' on the previous run

* Address code review feedback

* Add missing import for linux utz

* Change vault path from Top() to Config(); this is a better location, next to fleet.enc, based on the install/upgrade testing with .rpm/.deb installs

* Fix the missing state migration for .rpm/.deb upgrade. The post install script now performs the migration and creates the symlink after that.

* Fix typo in the postinstall script

* Update the vault migration code, add the agent configuration match check with the agent secret

* [Automation] Update elastic stack version to 8.4.0-31269fd2 for testing (#746)

Co-authored-by: apmmachine <[email protected]>

* wrap errors and fix some docs typo and convention (#743)

* automate the ironbank docker context generation (#679)

* Update README.md

Adding M1 variable to export to be able to build AMD images

* fix flaky (#730)

* Add filestream ID on standalone kubernetes manifest (#742)

This commit adds unique IDs for the filestream inputs used by the
Kubernetes integration in the Elastic-Agent standalone
Kubernetes configuration/manifest file.

* Alter github action to run on different OSs (#769)

Alter the linter action to run on different OSs instead of on linux with
the $GOOS env var.

* [Automation] Update elastic stack version to 8.4.0-d058e92f for testing (#771)

Co-authored-by: apmmachine <[email protected]>

* elastic-agent manifests: add comments; add cloudnative team as a codeowner for the k8s manifests (#708)

* managed elastic-agent: add comments; add cloudnative team as a codeowner for the k8s manifests

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add comments to the standalone elastic-agent, similar to the documentation we have https://www.elastic.co/guide/en/fleet/current/running-on-kubernetes-standalone.html

Signed-off-by: Tetiana Kravchenko <[email protected]>

* Apply suggestions from code review

Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>

* remove comment for FLEET_ENROLLMENT_TOKEN; use Needed everywhere instead of Required

Signed-off-by: Tetiana Kravchenko <[email protected]>

* rephrase regarding accessing kube-state-metrics when using third-party tools, like kube-rbac-proxy

Signed-off-by: Tetiana Kravchenko <[email protected]>

* run make check

Signed-off-by: Tetiana Kravchenko <[email protected]>

* keep manifests in sync to pass ci check

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add info on where to find FLEET_URL and FLEET_ENROLLMENT_TOKEN

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add links to elastic-agent documentation

Signed-off-by: Tetiana Kravchenko <[email protected]>

* update comment on FLEET_ENROLLMENT_TOKEN

Signed-off-by: Tetiana Kravchenko <[email protected]>

Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>

* [Elastic-Agent] Added source uri reloading (#686)

* Update will cleanup unneeded artifacts. (#752)

* Update will cleanup unneeded artifacts.

The update process will clean up unneeded artifacts. When an update
starts, all artifacts that do not have the current version number in their
names will be removed. If artifact retrieval fails, downloaded artifacts
are removed. On a successful upgrade, all contents of the downloads dir
will be removed.

* Clean up linter warnings

* Wrap errors

* cleanup tests

* Fix passed version

* Use os.RemoveAll

* ci: propagate e2e-testing errors (#695)

* [Release] add-backport-next (#784)

* Update main to 8.5.0 (#793)

* [Automation] Update go release version to 1.17.12 (#726)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-60171339 for testing (#799)

Co-authored-by: apmmachine <[email protected]>

* update dependency elastic/go-structform from v0.0.9 to v0.0.10 (#802)

Signed-off-by: Florian Lehner <[email protected]>

* Fix unpacking of artifact config (#776)

Fix unpacking of artifact config (#776)

* [Automation] Update elastic stack version to 8.5.0-c54c3404 for testing (#826)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-7dbc10f8 for testing (#833)

Co-authored-by: apmmachine <[email protected]>

* Fix RPM/DEB clean install (#816)

* Fix RPM/DEB clean install

* Improve the post install script

* Do not try to copy the state files if the agent directory is the same,
  this causes the error.
* Check the existence of symlink instead of the file it is pointing to
  for the state file migration.

* Update check for symlink existence for the cases where the symlink points to non-existent file

* fix path for auto generated spec file (#859)

Signed-off-by: Florian Lehner <[email protected]>

* Reload downloader client on config change (#848)

Reload downloader client on config change (#848)

* Bundle elastic-agent.app for MacOS, needed to be able to enable the  … (#714)

* Bundle elastic-agent.app for MacOS, needed to be able to enable the  Full Disk Access

* Calm down the linter

* Fix pathing for windows unit test

* crossbuild: add fix to set ulimit for debian images (#856)

Signed-off-by: Florian Lehner <[email protected]>

* [Heartbeat] Cleanup docker install / always add playwright deps (#764)

This is the agent counterpart to elastic/beats#32122

Refactors Dockerfile handling of synthetics deps to rely on playwright install-deps rather than us manually keeping up to date with those. This should fix issues with newer playwrights needing additional deps.

This also cleans up the Dockerfile a good amount, and fixes indentation. Finally, this removes the unused Dockerfile.elastic-agent.tmpl file since agent is now its own repo. It also cleans up some other metadata that no longer does anything.

No changelog is specified because no user facing changes are present.

* [Automation] Update elastic stack version to 8.5.0-41aadc32 for testing (#889)

Co-authored-by: apmmachine <[email protected]>

* Fix/panic with composable renderer (#823)

* Fix a panic with wg passed to the composable object

In the code that retrieves the variables from the configuration files we
need to pass an execution callback; this callback is called in a
goroutine and can be executed multiple times until the composable
renderer is stopped. A problem in the code caused the callback to be
called multiple times, which drove the waitgroup's internal counter to a
negative value.

This commit changes the behavior: it starts the composable renderer and
gives it a callback; when the callback receives the variables, it stops the
composable's Run method using the context.

This ensures that the callback is called a single time and that the
variables are correctly retrieved.
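
The cancel-on-first-delivery pattern described above might look like the following sketch. The `renderer` type is a hypothetical stand-in for the composable renderer, and it runs synchronously here for simplicity, whereas the commit describes the callback running in a goroutine.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// renderer mimics a component that keeps invoking a callback with the
// current variables until its context is cancelled. Hypothetical stand-in.
type renderer struct {
	calls int
}

// Run delivers variables in a loop and stops once ctx is cancelled.
func (r *renderer) Run(ctx context.Context, cb func([]string)) error {
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		default:
		}
		r.calls++
		cb([]string{"host.name", "env.HOME"})
	}
}

// fetchVars runs the renderer until the first set of variables arrives,
// then cancels the context so the callback cannot fire again.
func fetchVars(ctx context.Context, r *renderer) ([]string, error) {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()

	var vars []string
	err := r.Run(ctx, func(v []string) {
		vars = v
		cancel() // stop Run after the first delivery
	})
	if err != nil && !errors.Is(err, context.Canceled) {
		return nil, err
	}
	return vars, nil
}

// demo returns the fetched variables and how often the callback ran.
func demo() ([]string, int, error) {
	r := &renderer{}
	vars, err := fetchVars(context.Background(), r)
	return vars, r.calls, err
}

func main() {
	vars, calls, err := demo()
	fmt.Println(vars, calls, err)
}
```

Because cancellation replaces the waitgroup bookkeeping, there is no counter left to drive negative.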

Fixes: #806

* [Automation] Update go release version to 1.18.5 (#832)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-60a4c029 for testing (#899)

Co-authored-by: apmmachine <[email protected]>

* Add control-plane toleration to Agent K8S manifests. (#864)

* Add toleration to elastic-agent Kubernetes manifests.

The toleration with key node-role.kubernetes.io/control-plane is set to replace
the deprecated toleration with key node-role.kubernetes.io/master which will be
removed by Kubernetes v1.25

* Remove outdated "master" node terminology.

* install mage with go install (#936)

* Cloudnative ci automation (#837)

This commit provides the relevant Jenkins CI automation to open pull requests to the kibana GitHub repository in order to keep the Cloud-Native team's manifests in sync with the manifests that are used in the Fleet UI.

For full information check #706

Updated .ci/Jenkins file that is triggered upon PR requests of /elastic-agent/deploy/kubernetes/* changes
Updated Makefile to add functionality needed to create the extra files for the new prs to kibana remote repository

* Reduce memory footprint by reordering struct elements (#804)

* Reduce memory footprint by reordering struct elements

* rename struct element for linter
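
To illustrate why field order affects size: Go lays out struct fields in declaration order and pads them for alignment, so grouping small fields together can shrink a struct. The two types below are made-up examples, not structs from the agent.

```go
package main

import (
	"fmt"
	"unsafe"
)

// padded interleaves small and large fields, which forces padding
// before and after the int64 on platforms that align it to 8 bytes.
type padded struct {
	enabled bool
	count   int64
	ready   bool
}

// compact holds the same fields with the largest one first, so the two
// bools share a single padded tail.
type compact struct {
	count   int64
	enabled bool
	ready   bool
}

// sizes reports the in-memory size of both layouts.
func sizes() (uintptr, uintptr) {
	return unsafe.Sizeof(padded{}), unsafe.Sizeof(compact{})
}

func main() {
	p, c := sizes()
	fmt.Printf("padded=%d bytes, compact=%d bytes\n", p, c)
}
```

On common 64-bit targets such as amd64 this reports 24 bytes for `padded` and 16 bytes for `compact`; exact numbers depend on the platform's alignment rules.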

Signed-off-by: Florian Lehner <[email protected]>

Signed-off-by: Florian Lehner <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-6b9f92c0 for testing (#948)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-0616acda for testing (#963)

Co-authored-by: apmmachine <[email protected]>

* Clarify that this repo is not only docs (#969)

* Add Filebeat lumberjack input to spec (#959)

Make the lumberjack input available from Agent.

Relates: https://github.com/elastic/beats/pull/32175

* [Automation] Update elastic stack version to 8.5.0-dd6f2bb0 for testing (#978)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-feb644de for testing (#988)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-7783a03c for testing (#1004)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-17b8a62d for testing (#1014)

Co-authored-by: apmmachine <[email protected]>

* update ironbank image product name (#1009)

This is required to automate the creation of the ironbank merge requests as the ubireleaser is using this field to compute the elastic-agent artifact url. 

For example it is now trying to retrieve https://artifacts.elastic.co/downloads/beats/elastic-agent-8.4.0-linux-x86_64.tar.gz instead of https://artifacts.elastic.co/downloads/beats/elastic-agent/elastic-agent-8.4.0-linux-x86_64.tar.gz

* ci: add extended support for windows (#683)

* [Automation] Update elastic stack version to 8.5.0-9aed3b11 for testing (#1030)

Co-authored-by: apmmachine <[email protected]>

* Cloudnative ci utomation (#1035)

* Updating Jenkinsfile and Makefile to open PR

* Adding needed token-id

* [Automation] Update elastic stack version to 8.5.0-fedc3e60 for testing (#1054)

Co-authored-by: apmmachine <[email protected]>

* Testing PR creation for 706 (#1049)

* Fix lookup issues with inputs.d fragment yml (#840)

* Fix lookup issues with inputs.d fragment yml

The Elastic Agent was looking next to the binary for the `inputs.d`
folder; instead, it should look in the `Home` folder where
the Elastic Agent symlink is located.

Fixes: #663

* Changelog

* Fix input.d path, tie to the agent Config() directory

* Update CHANGELOG to reflect that the agent configuration directory is used to locate the inputs.d directory

Co-authored-by: Aleksandr Maus <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-b5001a6d for testing (#1064)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-1bd77fc1 for testing (#1082)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-167dfc80 for testing (#1091)

Co-authored-by: apmmachine <[email protected]>

* Adding support for v1.25.0 k8s (#1044)

* Adding support for v1.25.0 k8s

* [Automation] Update elastic stack version to 8.5.0-6b7dda2d for testing (#1101)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-4140365c for testing (#1114)

Co-authored-by: apmmachine <[email protected]>

* Remove experimental warning log in upgrade command (#1106)

* Update go.mod to Go 1.18, update notice. (#1120)

* Remove the fleet reporter (#1130)

* Remove the fleet reporter

Remove the fleet-reporter so that checkins no longer deliver the event
list.

* add CHANGELOG fix tests

* [Automation] Update elastic stack version to 8.5.0-589a4a10 for testing (#1147)

Co-authored-by: apmmachine <[email protected]>

* Avoid reporting `Unhealthy` on fleet connectivity issues (#1152)

Avoid reporting `Unhealthy` on fleet connectivity issues (#1152)

* ci: enable MacOS M1 stages (#1123)

* [Automation] Update go release version to 1.18.6 (#1143)

* [Automation] Update elastic stack version to 8.5.0-37418cf3 for testing (#1165)

Co-authored-by: apmmachine <[email protected]>

* Remove mage notice in favour of make notice (#1108)

The current implementation of mage notice is not working because it
was never finalised; the fact that both it and `make notice` exist only
generates confusion.

This commit removes the `mage notice` and documents that `make notice`
should be used instead for the time being.

In the long run we want to use the implementation on
`elastic-agent-libs`, however it is not working at the moment.

Closes #1107

Co-authored-by: Craig MacKenzie <[email protected]>

* ci: run e2e-testing at the end (#1169)

* ci: move macos to github actions (#1175)

* [Automation] Update elastic stack version to 8.5.0-fcf3d4c2 for testing (#1183)

Co-authored-by: apmmachine <[email protected]>

* Add support for hints' based autodiscovery in kubernetes provider (#698)

* ci: increase timeout (#1190)

* Fixing condition for PR creation (#1188)

* Fix leftover log level (#1194)

* [automation] Publish kubernetes templates for elastic-agent (#1192)

Co-authored-by: apmmachine <[email protected]>

* ci: force GO_VERSION (#1204)

* Fix whitespaces in vault_darwin.c (#1206)

* Update kubernetes templates for elastic-agent [templates.d] (#1231)

* Use at least warning level for all status logs (#1218)

* Update k8s manifests to leverage hints (#1202)

* Add Go 1.18 upgrade to breaking changes section. (#1216)

* Add Go 1.18 upgrade to breaking changes section.

* Fix the PR number in the changelog.

* [Release] add-backport-next (#1254)

* Bump version to 8.6.0. (#1259)

* [Automation] Update elastic stack version to 8.5.0-7dc445a0 for testing (#1248)

Co-authored-by: apmmachine <[email protected]>

* Fix: Endpoint collision between monitoring and regular beats  (#1034)

Fix: Endpoint collision between monitoring and regular beats  (#1034)

* internal/pkg/agent/cmd: don't format error message with nil errors (#1240)

The failure conditions allow nil errors to result in an error being formatted.
When formatting due to a non-accepted HTTP status code and a nil error, omit the
error.
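
A minimal sketch of the guard, assuming a helper that builds the message from a status code and an optional error; `statusError` and its message text are hypothetical and not taken from the agent's code.

```go
package main

import (
	"errors"
	"fmt"
)

// errTimeout is a sample underlying error used in the demo.
var errTimeout = errors.New("timeout")

// statusError builds an error for a non-accepted HTTP status code. The
// underlying error is only included when it is non-nil, so the message
// never contains a formatted nil.
func statusError(statusCode int, err error) error {
	if err != nil {
		return fmt.Errorf("unexpected status code %d: %w", statusCode, err)
	}
	return fmt.Errorf("unexpected status code %d", statusCode)
}

func main() {
	fmt.Println(statusError(503, nil))
	fmt.Println(statusError(503, errTimeout))
}
```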

Co-authored-by: Craig MacKenzie <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-21651da3 for testing (#1290)

Co-authored-by: apmmachine <[email protected]>

* Fixed: source uri reload for download/verify components (#1252)

Fixed: source uri reload for download/verify components (#1252)

* Expand status reporter/controller interfaces to allow local reporters (#1285)

* Expand status reporter/controller interfaces to allow local reporters

Add a local reporter map to the status controller. These reporters are
not used when updating status with fleet-server, they are only used to
gather local state information - specifically if the agent is degraded
because checkin with fleet-server has failed. This bypasses the bug that
was introduced with the liveness endpoint where the agent could checkin
(to fleet-server) with a degraded status because a previous checkin
failed. Local reporters are used to generate a separate status. This
status is used in the liveness endpoint.

* fix linter

* Improve logging for agent upgrades. (#1287)

* [Automation] Update elastic stack version to 8.6.0-326f84b0 for testing (#1318)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-df00693f for testing (#1334)

Co-authored-by: apmmachine <[email protected]>

* Add success log message after previous checkin failures (#1327)

* Fix status reporter initialization (#1341)

* [Automation] Update elastic stack version to 8.6.0-a2f4f140 for testing (#1362)

Co-authored-by: apmmachine <[email protected]>

* Added status message to CheckinRequest (#1369)

* Added status message to CheckinRequest

* added changelog

* updated test

* added omitempty

* Fix failures when using npipe monitoring endpoints (#1371)

* [Automation] Update elastic stack version to 8.6.0-158a13db for testing (#1379)

Co-authored-by: apmmachine <[email protected]>

* Mount /etc directory in Kubernetes DaemonSet manifests. (#1382)

Changes made to files like `/etc/passwd` using Linux tools like
`useradd` are not reflected in the mounted file on the Agent,
because the tool replaces the file instead of changing it
in-place.

Mounting the parent directory solves this problem.

* [Automation] Update elastic stack version to 8.6.0-aea1c645 for testing (#1405)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-0fca2953 for testing (#1412)

Co-authored-by: apmmachine <[email protected]>

* ci: 7.17 is not available for the daily run (#1417)

* [Automation] Update elastic stack version to 8.6.0-e4c15f15 for testing (#1425)

Co-authored-by: apmmachine <[email protected]>

* [backport main] Fix: Agent failed to upgrade from 8.4.2 to 8.5.0 BC1 for MAC 12 agent using agent binary. (#1401)

[backport main] Fix: Agent failed to upgrade from 8.4.2 to 8.5.0 BC1 for MAC 12 agent using agent binary. (#1401)

* Fix docker provider add_fields processors (#1420)

The Docker provider was using the wrong key when defining the
`add_fields` processor, which caused Filebeat not to start the input
and to stay in an unhealthy state.

This commit fixes it.

Fixes https://github.com/elastic/beats/issues/29030

* [Automation] Update elastic stack version to 8.6.0-d939cfde for testing (#1436)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-7c9f25a9 for testing (#1446)

Co-authored-by: apmmachine <[email protected]>

* Enable integration only when datastreams are not defined (#1456)

* Add not dedoted k8s pod labels in autodiscover provider to be used for templating, exactly like annotations (#1398)

* [Automation] Update elastic stack version to 8.6.0-c49fac70 for testing (#1464)

Co-authored-by: apmmachine <[email protected]>

* Add storageclass permissions in agent clusterrole (#1470)

* Add storageclass permissions in agent clusterrole

* Remote QA-labels automation (#1455)

* [Automation] Update go release version to 1.18.7 (#1444)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-5a8d757d for testing (#1480)

Co-authored-by: apmmachine <[email protected]>

* Improve logging around agent checkins. (#1477)

Improve logging around agent checkins.

- Log transient checkin errors at Info.
- Upgrade to an Error log after 2 repeated failures.
- Log the wait time for the next retry.
- Only update local state after repeated failures.

* [Automation] Update elastic stack version to 8.6.0-40086bc7 for testing (#1496)

Co-authored-by: apmmachine <[email protected]>

* Fixing makefile check (#1490)

* Fixing makefile check

* action: validate changelog fragment (#1488)

* Allign managed with standalone role (#1500)

* Fix k8s template link versioning (#1504)

* Aligning manifests (#1507)

* Align managed with standalone role

* Fixing missing Label

* [Automation] Update elastic stack version to 8.6.0-233dc5d4 for testing (#1515)

Co-authored-by: apmmachine <[email protected]>

* Convert CHANGELOG.next to fragments (#1244)

* [Automation] Update elastic stack version to 8.6.0-54a302f0 for testing (#1531)

Co-authored-by: apmmachine <[email protected]>

* Update the linter configuration. (#1478)

Sync the configuration with the one used in Beats, which has disabled
the majority of the least useful linters already.

* Elastic agent counterpart of https://github.com/elastic/beats/pull/33362 (#1528)

Always use the stack_release label for npm i

No changelog necessary since there are no user-visible changes

This lets us ensure we've carefully reviewed and labeled the version of the @elastic/synthetics NPM library that's bundled in docker images

* [Automation] Update elastic stack version to 8.6.0-cae815eb for testing (#1545)

Co-authored-by: apmmachine <[email protected]>

* Fix admin permission check on localized windows (#1552)

* Fixes from merge of main.

* Update heartbeat specification to only support elasticsearch.

* Fix bad merge in dockerfile.

Signed-off-by: Florian Lehner <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Pier-Hugues Pellerin <[email protected]>
Co-authored-by: Denis Rechkunov <[email protected]>
Co-authored-by: Michel Laterman <[email protected]>
Co-authored-by: Aleksandr Maus <[email protected]>
Co-authored-by: Victor Martinez <[email protected]>
Co-authored-by: Manuel de la Peña <[email protected]>
Co-authored-by: Anderson Queiroz <[email protected]>
Co-authored-by: Daniel Araujo Almeida <[email protected]>
Co-authored-by: Mariana Dima <[email protected]>
Co-authored-by: ofiriro3 <[email protected]>
Co-authored-by: Julien Lind <[email protected]>
Co-authored-by: Craig MacKenzie <[email protected]>
Co-authored-by: Tiago Queiroz <[email protected]>
Co-authored-by: Pierre HILBERT <[email protected]>
Co-authored-by: Tetiana Kravchenko <[email protected]>
Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>
Co-authored-by: Michal Pristas <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
Co-authored-by: Florian Lehner <[email protected]>
Co-authored-by: Andrew Cholakian <[email protected]>
Co-authored-by: Yash Tewari <[email protected]>
Co-authored-by: Quentin Pradet <[email protected]>
Co-authored-by: Andrew Kroh <[email protected]>
Co-authored-by: Julien Mailleret <[email protected]>
Co-authored-by: Josh Dover <[email protected]>
Co-authored-by: Chris Mark <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Dan Kortschak <[email protected]>
Co-authored-by: Julia Bardi <[email protected]>
Co-authored-by: Edoardo Tenani <[email protected]>

* Add input name alias for cloudbeat integrations (#1596)

* add name alias for cloudbeat

* add anchors for yaml fields

* add EKS input

* Change the stater to include a local flag. (#1308)

* Change the stater to include a local flag.

Change the state reporter to use a local flag that determines if local
errors are included in the resulting state. Assume that configMgr errors
are all local; this mainly affects the fleet_gateway. Allow the gateway
to report an error if a checkin fails. When a checkin fails the local
state reported through the status command and liveness endpoint will
include the error, but checkins to fleet-server will not.

* Add ActionsError() method to config manager

Add a new ActionsError() method to the config managers. For the
non-managed instances it will return a nil channel. For the managed
instances it will return the dispatcher error queue directly. Have the
coordinator gather from this channel as it does for the others and
treat any errors as non-local.
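
Returning a nil channel works because a receive from a nil channel never becomes ready inside a `select`. A minimal sketch of that behaviour follows; the `configManager` interface, the `standalone` and `managed` types, and `pollActionsError` are hypothetical stand-ins, not the agent's actual definitions.

```go
package main

import "errors"

// errDispatch is an example error used only for illustration.
var errDispatch = errors.New("dispatch failed")

// configManager is a hypothetical, trimmed-down interface.
type configManager interface {
	ActionsError() <-chan error
}

// standalone models a non-managed instance: it has no dispatcher,
// so it returns a nil channel.
type standalone struct{}

func (standalone) ActionsError() <-chan error { return nil }

// managed models a managed instance that exposes its error queue directly.
type managed struct{ errCh chan error }

func (m managed) ActionsError() <-chan error { return m.errCh }

// pollActionsError performs one non-blocking receive. A nil channel is
// never ready, so the default branch is always taken for standalone.
func pollActionsError(m configManager) (received bool, err error) {
	select {
	case err := <-m.ActionsError():
		return true, err
	default:
		return false, nil
	}
}
```

A coordinator loop could therefore keep the same `select` case for both kinds of manager without a special branch for the non-managed one.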

* Fix linter

* Service runtime V2 (#1529)

* Service V2 runtime

* Implements service runtime component for V2.
* Extends endpoint spec with some additional attributes for service start/stop/status checks and creds discovery. The creds discovery logic is taken from V1, cleaned up and extracted into its own file, added utz.
* Implements service uninstall
* Refactors pkg/core/process/process.go adds additional options that are needed for the service_command implementation.
* Changes ComponentsModifier to access raw config, needed for the EndpointComponentModifier
* Injects host.id into configuration, needed for Endpoint
* Injects fleet and policy.revision configuration into the Endpoint input configuration
* Bumps the version to 8.6.0 to make it consistent with current beats V2 branch
* Addresses linter complaints on affected files

* Remove the service watcher, all the start/stopping logic

* Add changelog

* Fix typo

* Send STOPPING only upon teardown

* Wait for check-in with timeout before sending stopping on teardown

* Fix the service loop routine blocking on channel after stopped

* Addressed code review comments

* Make linter happy

* Try to fix make check-ci

* Spellcheck runtime README.md

* Remove .Stop timeout from the spec as it is no longer used

* Addressed code review feedback

* Sync components with state during container start (#1653)

* Sync components with state during container start

* path approach

* Subprocess reader start.

* Implement io.Writer to handle reading stdout/stderr for spawned components.

* Don't inject logging args to beats components. Always have beats log to stderr.

* Update to v0.2.15 of elastic-agent-libs.

* [V2] Enable support for shippers (#1527)

* Work on adding shipper support.

* Fix fmt.

* Fix reference to spec. Allow shipper to be null but still enabled if key exists.

* Move supported shippers into its own key in the input specification.

* Fix issue in merge.

* Implement fake shipper and add fake shipper output to the fake component.

* Add protoc to the test target.

* Don't generate fake shipper protocol in test.

* Commit fake GRPC into code.

* Add unit test for running with a shipper, sending an event between the running component and the running shipper.

* Add docstring for shipper test.

* Add changelog fragment.

* Adjust paths for shipper to work on windows and better on unix.

* Update changelog/fragments/1667571017-Add-support-for-running-the-elastic-agent-shipper.yaml

Co-authored-by: Craig MacKenzie <[email protected]>

* Fix fake/component to connect over npipe on windows.

Co-authored-by: Craig MacKenzie <[email protected]>

* More work on the logging.

* More fixes.

* Change back to streams.

* Fix go.mod.

* Fix import.

* Fix issues with merge of main.

* remove log helper.

* Add NewWithoutConfig.

* Fix the spawned filestream to ingest logs into elasticsearch for monitoring.

* Add changelog entry.

* Remove debug print.

* Update 1669236059-Capture-stdout-stderr-of-all-spawned-components-to-simplify-logging.yaml

Signed-off-by: Florian Lehner <[email protected]>
Co-authored-by: Michal Pristas <[email protected]>
Co-authored-by: Aleksandr Maus <[email protected]>
Co-authored-by: Michel Laterman <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Pier-Hugues Pellerin <[email protected]>
Co-authored-by: Denis Rechkunov <[email protected]>
Co-authored-by: Victor Martinez <[email protected]>
Co-authored-by: Manuel de la Peña <[email protected]>
Co-authored-by: Anderson Queiroz <[email protected]>
Co-authored-by: Daniel Araujo Almeida <[email protected]>
Co-authored-by: Mariana Dima <[email protected]>
Co-authored-by: ofiriro3 <[email protected]>
Co-authored-by: Julien Lind <[email protected]>
Co-authored-by: Craig MacKenzie <[email protected]>
Co-authored-by: Tiago Queiroz <[email protected]>
Co-authored-by: Pierre HILBERT <[email protected]>
Co-authored-by: Tetiana Kravchenko <[email protected]>
Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Anderson Queiroz <[email protected]>
Co-authored-by: Florian Lehner <[email protected]>
Co-authored-by: Andrew Cholakian <[email protected]>
Co-authored-by: Yash Tewari <[email protected]>
Co-authored-by: Quentin Pradet <[email protected]>
Co-authored-by: Andrew Kroh <[email protected]>
Co-authored-by: Julien Mailleret <[email protected]>
Co-authored-by: Josh Dover <[email protected]>
Co-authored-by: Chris Mark <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Dan Kortschak <[email protected]>
Co-authored-by: Julia Bardi <[email protected]>
Co-authored-by: Edoardo Tenani <[email protected]>
Co-authored-by: Alex K <[email protected]>
mergify bot pushed a commit that referenced this pull request Nov 28, 2022
* [v2] Add v2 component specification and validation. (#502)

* Add v2 component specification and validation.

* Remove i386 and ppc64el. Update spec for osquerybeat.

* Remove windows/arm64.

* Add component spec command to validate component specifications. (#510)

* [v2] Calculate the expected runtime components from policy (#550)

* Upgrade elastic-agent-client.

* Calculate the expected running components and units from the v2 specification and the current policy.

* Update NOTICE.txt.

* Fix lint from servicable main.go.

* Update GRPC for the agent CLI control protocol. Fix name collision issue.

* Run go mod tidy.

* Fix more lint issues.

* Fix fmt.

* Update logic to always compute model, with err set on each component. Check runtime preventions at model generation time.

* Fix items from code review, and issue on windows test runner.

* Try to cleanup duplication in tests.

* Try 2 of fixing duplicate lint failure, that is not really a duplicate.

* Re-run mage fmt.

* Lint fixes for linux, why different?

* Fix nolint comment.

* Add comment.

* Initial Flat Structure (#544)

Flattening the structure and removing download/install steps for programs.

Co-authored-by: Aleksandr Maus <[email protected]>

* Generate checksum file for components (#604)

* generating checksum?

* yaml output

* Update dev-tools/mage/common.go

Co-authored-by: Michel Laterman <[email protected]>

* review

* ioutil removal from magefile

Co-authored-by: Michel Laterman <[email protected]>

* V2 Runtime Component Manager (#645)

* Add runtime for command v2 components.

* Fix imports.

* Add tests for watching checkins.

* Fix lint and move checkin period to a configurable timeout.

* Fix tests now that checkin timeout needs to be defined.

* Fix code review and lint.

* [v2] Use the v2 components runtime as the core of the Elastic Agent (#753)

* Add runtime for command v2 components.

* Fix imports.

* Add tests for watching checkins.

* Fix lint and move checkin period to a configurable timeout.

* Fix tests now that checkin timeout needs to be defined.

* Fix code review and lint.

* Work on actually running the v2 runtime.

* Work on switching to the v2 runtime.

* More work on switching to v2 runtime.

* Cleanup some imports.

* More import cleanups.

* Add TODO to FleetServerComponentModifier.

* Remove outdated managed_mode_test.go.

* Fixes from code review and lint.

* [v2] Delete unused code from refactor (#777)

* Add runtime for command v2 components.

* Fix imports.

* Add tests for watching checkins.

* Fix lint and move checkin period to a configurable timeout.

* Fix tests now that checkin timeout needs to be defined.

* Fix code review and lint.

* Work on actually running the v2 runtime.

* Work on switching to the v2 runtime.

* More work on switching to v2 runtime.

* Cleanup some imports.

* More import cleanups.

* Add TODO to FleetServerComponentModifier.

* More cleanup and removals.

* Remove more.

* Delete more unused code.

* Clean up step_download from refactor.

* Remove outdated managed_mode_test.go.

* Fixes from code review and lint.

* Fix lint and missing errcheck.

* [v2] Delete more unused code from v2 transition (#790)

* Remove more unused code that was including already deleted code.

* Fix all unit tests.

* Fix lint.

* More lint fixes, maybe this time?

* More lint.... really?

* Update NOTICE.txt.

* [v2] Merge July 27th main into v2 feature branch (#789)

* [Automation] Update elastic stack version to 8.4.0-40cff009 for testing (#557)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-5e6770b1 for testing (#564)

Co-authored-by: apmmachine <[email protected]>

* Fix regression and use comma separated values (#560)

Fix regression from https://github.com/elastic/elastic-agent/pull/509

* Change in Jenkinsfile will trigger k8s run (#568)

* [Automation] Update elastic stack version to 8.4.0-da5a1c6d for testing (#573)

Co-authored-by: apmmachine <[email protected]>

* Add `@metadata.input_id` and `@metadata.stream_id` when injecting streams (#527)

These two values are going to be used in the shipper to identify where an
event came from in order to apply processors accordingly.

Also, added test cases for the processor to verify the change and updated test cases with the new processor.

* Add filemod times to contents of diagnostics collect command (#570)

* Add filemod times to contents of diagnostics collect command

Add filemod times to the files and directories in the zip archive.
Log files (and sub dirs) will use the modtime returned by the fileinfo
for the source. Others will use the timestamp from when the zip is
created.

* Fix linter

* [Automation] Update elastic stack version to 8.4.0-b13123ee for testing (#581)

Co-authored-by: apmmachine <[email protected]>

* Fix Agent upgrade 8.2->8.3 (#578)

* Fix Agent upgrade 8.2->8.3
* Improve the upgrade encryption handling. Add .yml files cleanup.
* Rollback ActionUpgrade to action_id, add MarkerActionUpgrade adapter struct for marker serialization compatibility

* Update containerd (#577)

* [Automation] Update elastic stack version to 8.4.0-4fe26f2a for testing (#591)

Co-authored-by: apmmachine <[email protected]>

* Set explicit ExitTimeOut for MacOS agent launchd plist (#594)

* Set explicit ExitTimeOut for MacOS agent launchd plist

* [Automation] Update elastic stack version to 8.4.0-2e32a640 for testing (#599)

Co-authored-by: apmmachine <[email protected]>

* ci: enable build notifications as GitHub issues (#595)

* status identifies failing component, fleet gateway may report degraded, liveness endpoint added (#569)

* Add liveness endpoint

Add /liveness route to metrics server. This route will report the status
from pkg/core/status. fleet-gateway will now report a degraded state if
a checkin fails. This may not propagate to fleet-server as a failed
checkin means communications between the agent and the server are not
working. It may also lead to the server reporting degraded for up to 30s
(fleet-server polling time) when the agent is able to successfully
connect.

* linter fix

* add nolint directive

* Linter fix

* Review feedback, add doc strings

* Rename noop controller file to _test file

* [Automation] Update elastic stack version to 8.4.0-722a7d79 for testing (#607)

Co-authored-by: apmmachine <[email protected]>

* ci: enable flaky test detector (#605)

* [Automation] Update elastic stack version to 8.4.0-210dd487 for testing (#620)

Co-authored-by: apmmachine <[email protected]>

* mergify: remove backport automation for non active branches (#615)

* chore: use elastic-agent profile to run the E2E tests (#610)

* [Automation] Update elastic stack version to 8.4.0-a6aa9f3b for testing (#631)

Co-authored-by: apmmachine <[email protected]>

* add macros pointing to new agent's repo and fix old macro calls (#458)

* Add mount of /etc/machine-id for managed Agent in k8s (#530)

* Set hostPID=true for managed agent in k8s (#528)

* Set hostPID=true for managed agent in k8s

* Add comment on hostPID.

* [Automation] Update elastic stack version to 8.4.0-86cc80f3 for testing (#648)

Co-authored-by: apmmachine <[email protected]>

* Update elastic-agent-libs version: includes restriction on default VerificationMode to `full` (#521)

* update version

* mage fmt update

* update dependency

* update changelog

* redact sensitive information in diagnostics collect command (#566)

* Support Cloudbeat regex input type  (#638)

* support input type with regex

* Update supported.go

* Changing the regex to be backward compatible

* Disable flaky test download test (#641)

* [Automation] Update elastic stack version to 8.4.0-3d206b5d for testing (#656)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-3ad82aa8 for testing (#661)

Co-authored-by: apmmachine <[email protected]>

* jjbb: exclude allowed branches, tags and PRs (#658)

cosmetic change in the description and boolean based

* Update elastic-agent-project-board.yml (#649)

* ci: fix labels that clashes with the Orka workers (#659)

* [Automation] Update elastic stack version to 8.4.0-03bd6f3f for testing (#668)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-533f1e30 for testing (#675)

Co-authored-by: apmmachine <[email protected]>

* Osquerybeat: Fix osquerybeat is not running with logstash output (#674)

* [Automation] Update elastic stack version to 8.4.0-d0a4da44 for testing (#684)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-dd98ded4 for testing (#703)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-164d9a10 for testing (#705)

Co-authored-by: apmmachine <[email protected]>

* Add missing license headers (#711)

* [Automation] Update elastic stack version to 8.4.0-00048b66 for testing (#713)

Co-authored-by: apmmachine <[email protected]>

* Allow - in eql variable names (#710)

* fix to allow dashes in variable names in EQL expressions

Extend eql to allow the '-' char to appear in variable names, e.g.,
${data.some-var}, and add test cases to eql, the transpiler, and
the k8s provider to verify this works. Note that the bug was caused by
the EQL limitation; the other test cases were added when attempting to
find it.
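
To illustrate what accepting '-' in a variable name means, here is a small regular-expression sketch. It is not the agent's ANTLR grammar or transpiler code: the `varRef` pattern and `findVars` helper are hypothetical and only model `${...}` references made of dot-separated segments.

```go
package main

import "regexp"

// varRef is a hypothetical pattern for ${name.name} references in which
// each segment may contain letters, digits, '_' and '-'.
var varRef = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_-]*(?:\.[A-Za-z_][A-Za-z0-9_-]*)*)\}`)

// findVars returns the variable names referenced in s, without the ${ }.
func findVars(s string) []string {
	var names []string
	for _, m := range varRef.FindAllStringSubmatch(s, -1) {
		names = append(names, m[1])
	}
	return names
}
```

Dropping the trailing `-` from the character classes would make `${data.some-var}` stop matching, which is roughly the kind of limitation the commit describes.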

* Regenerate grammar with antlr 4.7.1, add CHANGELOG

* Fix linter issue

* Fix typo

* Fix transpiler to allow : in dynamic variables. (#680)

Fix transpiler regex to allow ':' characters in dynamic variables so
that users can input "${dynamic.lookup|'fallback.here'}".

Co-authored-by: Aleksandr Maus <[email protected]>

* Fix for the filebeat spec file picking up packetbeat inputs (#700)

* Reproduce filebeat picking up packetbeat inputs

* Filebeat: filter inputs as first input transform.

Move input filtering to be the first input transformation that occurs in
the filebeat spec file. Fixes
https://github.com/elastic/elastic-agent/issues/427.

* Update changelog.

* [Automation] Update elastic stack version to 8.4.0-3cd57abb for testing (#724)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-a324b98b for testing (#727)

Co-authored-by: apmmachine <[email protected]>

* ci: run on MacOS12 (#696)

* [Automation] Update elastic stack version to 8.4.0-31315ca3 for testing (#732)

Co-authored-by: apmmachine <[email protected]>

* fix typo on package command (#734)

This commit fixes the typo in the package command on the README.md.

* Allow / to be used in variable names (#718)

* Allow the / character to be used in variable names.

Allow / to be used in variable names from dynamic providers and eql
expressions. Ensure that k8s providers can provide variables with
slashes in their names.

* run antlr4

* Fix tests

* Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases (#701)

* Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases

* Migrates vault directory on linux and windows to the top directory of the
  agent, so it can be shared without needing the upgrade handler call,
  like for example with side-by-side install/upgrade from .rpm/.deb
* Extended vault to allow read-only open, useful when the vault at particular location needs to be only read not created.

* Correct the typo in the log messages

* Update lint flagged function comment with 'unused', was flagged with 'deadcode' on the previous run

* Address code review feedback

* Add missing import for linux utz

* Change vault path from Top() to Config(), this is a better location, next to fleet.enc based on the install/upgrade testing with .rpm/.deb installs

* Fix the missing state migration for .rpm/.deb upgrade. The post install script now performs the migration and creates the symlink after that.

* Fix typo in the postinstall script

* Update the vault migration code, add the agent configuration match check with the agent secret

* [Automation] Update elastic stack version to 8.4.0-31269fd2 for testing (#746)

Co-authored-by: apmmachine <[email protected]>

* wrap errors and fix some docs typo and convention (#743)

* automate the ironbank docker context generation (#679)

* Update README.md

Adding M1 variable to export to be able to build AMD images

* fix flaky (#730)

* Add filestream ID on standalone kubernetes manifest (#742)

This commit adds unique IDs for the filestream inputs used by the
Kubernetes integration in the Elastic-Agent standalone
Kubernetes configuration/manifest file.

* Alter github action to run on different OSs (#769)

Alter the linter action to run on different OSs instead of on linux with
the $GOOS env var.

* [Automation] Update elastic stack version to 8.4.0-d058e92f for testing (#771)

Co-authored-by: apmmachine <[email protected]>

* elastic-agent manifests: add comments; add cloudnative team as a codeowner for the k8s manifests (#708)

* managed elastic-agent: add comments; add cloudnative team as a codeowner for the k8s manifests

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add comments to the standalone elastic-agent, similar to the documentation we have https://www.elastic.co/guide/en/fleet/current/running-on-kubernetes-standalone.html

Signed-off-by: Tetiana Kravchenko <[email protected]>

* Apply suggestions from code review

Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>

* remove comment for FLEET_ENROLLMENT_TOKEN; use Needed everywhere instead of Required

Signed-off-by: Tetiana Kravchenko <[email protected]>

* rephrase regarding accessing kube-state-metrics when used third party tools, like kube-rbac-proxy

Signed-off-by: Tetiana Kravchenko <[email protected]>

* run make check

Signed-off-by: Tetiana Kravchenko <[email protected]>

* keep manifests in sync to pass ci check

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add info on where to find FLEET_URL and FLEET_ENROLLMENT_TOKEN

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add links to elastic-agent documentation

Signed-off-by: Tetiana Kravchenko <[email protected]>

* update comment on FLEET_ENROLLMENT_TOKEN

Signed-off-by: Tetiana Kravchenko <[email protected]>

Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>

* [Elastic-Agent] Added source uri reloading (#686)

* Update will cleanup unneeded artifacts. (#752)

* Update will cleanup unneeded artifacts.

The update process will clean up unneeded artifacts. When an update
starts, all artifacts that do not have the current version number in
their names will be removed. If artifact retrieval fails, downloaded
artifacts are removed. On a successful upgrade, all contents of the
downloads dir will be removed.

* Clean up linter warnings

* Wrap errors

* cleanup tests

* Fix passed version

* Use os.RemoveAll

* ci: propagate e2e-testing errors (#695)

* [Release] add-backport-next (#784)

* Update go.sum.

* Fix upgrade.

* Fix the upgrade artifact reload.

* Fix lint in coordinator.

Co-authored-by: apmmachine <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Pier-Hugues Pellerin <[email protected]>
Co-authored-by: Denis Rechkunov <[email protected]>
Co-authored-by: Michel Laterman <[email protected]>
Co-authored-by: Aleksandr Maus <[email protected]>
Co-authored-by: Victor Martinez <[email protected]>
Co-authored-by: Manuel de la Peña <[email protected]>
Co-authored-by: Anderson Queiroz <[email protected]>
Co-authored-by: Daniel Araujo Almeida <[email protected]>
Co-authored-by: Mariana Dima <[email protected]>
Co-authored-by: ofiriro3 <[email protected]>
Co-authored-by: Julien Lind <[email protected]>
Co-authored-by: Craig MacKenzie <[email protected]>
Co-authored-by: Tiago Queiroz <[email protected]>
Co-authored-by: Pierre HILBERT <[email protected]>
Co-authored-by: Tetiana Kravchenko <[email protected]>
Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>
Co-authored-by: Michal Pristas <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>

* [v2] Fix inspect command (#805)

* Write the inspect command for v2.

* Fix lint.

* Fix code review. Load inputs from inputs.d for inspect.

* Fix lint.

* Refactor to use errgroup.

* Remove unused struct.

* Expand check-in payload for V2 (#916)

* Expand check-in payload for V2

* Make linter happy

* [v2] Update protocol to use new UnitExpectedConfig. (#850)

* Update v2 protocol to use new UnitExpectedConfig.

* Cleanup.

* Update NOTICE.txt. Lint dupl.

* Fix code review. Ensure type is set to real type and not alias.

* Fix action dispatching that was using ActionType instead of InputType as before (#973)

* Fix bootstrapping a Fleet Server with v2. (#1010)

* Fix bootstrapping a Fleet Server with v2.

* Fix lint.

* Fix tests.

* Query just related files on build (#1045)

* Update main to 8.5.0 (#793) (#1050)

(cherry picked from commit 317e03116aa919d69be97242207ad11a28c826aa)

Co-authored-by: Pier-Hugues Pellerin <[email protected]>

* Create archive directory if it doesn't exist. (#1058)

On an M1 Mac rename seems to fail if the containing directories do not
already exist.

* fixed docker build (#1105)

* V2 command work dir (#1061)

* Fix v2 work directory for command. Add permission check for execution. Add determining root into runtime prevention.

* Add writeable by group and other in check.

* Fix restart and stopping issues in command runtime for failing binaries.

* Fix issue in endpoint spec. Allow an input to not require an ID, but that ID must be unique.

* Remove unused transpiler rules and steps.

* Fix test.

* Fix workDir for windows.

* Reset to checkin period.

* Fix test and code review issues.

* Add extra log message in unit test.

* More fixes from code review.

* Fix test.

* [v2] Move queue management to dispatcher (#1109)

* Move queue management to dispatcher

Move queue management actions to the dispatcher from the fleet-server
in order to help with future work to add a retry mechanism. Add a
PersistedQueue type which wraps the ActionQueue to make persisting the
queue simpler for the consumer.

* Refactor ActionQueue

Refactor ActionQueue to only export methods that are used by consumers.
The priority queue implementation has been changed to an unexported
type. Persistency has been added and the persistedqueue type has been
removed.

* Rename persistedQueue interface to priorityQueue

* Review feedback

* failing to save queue will log message

* Change gateway to use copy

* Fix [V2]: Elastic Agent Install is broken. (#1331)

* Fix agent shutdown on SIGINT (#1258)

* Fix agent shutdown on SIGINT

* Update runtime_comm expected check-in handling to eliminate the lock in failure cases

* Remove some buffered channels that are no longer blocking shutdown after the runtime comms fix commit

* Fix the recursive lock on itself in the runtime loop, refactored code to make it cleaner

* Fix the comment typo

* Fixed managed_mode coordination with fleet gateway. Now the gateway errors reading loop waits until gateway exits. Otherwise the gateway shutdown out of sequence blocks on errCh

* Fix linter

* Fix make check-ci

* Fix runner Err() possible race

* Update the runner DoneWithTimeout implementation

* Address code review comments

* [v2] Re-enable diagnostics for Elastic Agent and all components (#1140)

* Add diagnostics back to v2.

* Update pkg/component/runtime/manager.go

Co-authored-by: Anderson Queiroz <[email protected]>

Co-authored-by: Anderson Queiroz <[email protected]>

* Check and create downloads dir before using (#1410)

* [v2] Add upgrade action retry (#1219)

* Add upgrade action retry

Add the ability for the agent to schedule and retry upgrade actions.

The fleetapi actions now define ScheduledAction and RetryableAction interfaces to eliminate the need for stub methods on all different action types. The action queue has been changed to operate on scheduled actions. Serialization tests now ensure that the retry attribute needed by retryable actions works.

Decouple the dispatcher from the gateway: the dispatcher has an errors channel that will return an error for the list of actions that is sent. The gateway has an Actions method that can be used to get the list of actions from the gateway. The managed_mode config manager will link these two components.

If a handler returns an error and the action is a RetryableAction, the dispatcher will attempt to schedule a retry. The dispatcher will also ack the action to fleet-server and indicate if it will be retried or has failed (or has been received normally).
For the acker, if a RetryableAction has an error and an attempt count that is greater than 0 it will be acked as retried. If it has an error and an attempt count less than 1 it will be acked as failed.
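
The acknowledgement rule in the last two paragraphs can be written as a small decision function. The `retryable` struct, the `ackStatus` values, and `statusFor` are hypothetical names chosen for this sketch and do not mirror the fleetapi types.

```go
package main

// ackStatus is a hypothetical label for how an action is acknowledged.
type ackStatus string

const (
	ackReceived ackStatus = "received"
	ackRetried  ackStatus = "retried"
	ackFailed   ackStatus = "failed"
)

// retryable is a hypothetical stand-in for an action that can be retried.
type retryable struct {
	Err     string // empty when the handler succeeded
	Attempt int    // retry attempt count; less than 1 means no retry is scheduled
}

// statusFor applies the rule described above: no error is a normal ack,
// an error with an attempt count above 0 is acked as retried, and an
// error with an attempt count below 1 is acked as failed.
func statusFor(a retryable) ackStatus {
	switch {
	case a.Err == "":
		return ackReceived
	case a.Attempt > 0:
		return ackRetried
	default:
		return ackFailed
	}
}
```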

Co-authored-by: Blake Rouse <[email protected]>

* V1 metrics monitoring for V2 (#1487)

* [v2] Merge main on Oct. 18 (#1557)

* [Automation] Update elastic stack version to 8.4.0-40cff009 for testing (#557)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-5e6770b1 for testing (#564)

Co-authored-by: apmmachine <[email protected]>

* Fix regression and use comma separated values (#560)

Fix regression from https://github.com/elastic/elastic-agent/pull/509

* Change in Jenkinsfile will trigger k8s run (#568)

* [Automation] Update elastic stack version to 8.4.0-da5a1c6d for testing (#573)

Co-authored-by: apmmachine <[email protected]>

* Add `@metadata.input_id` and `@metadata.stream_id` when injecting streams (#527)

These two values are going to be used in the shipper to identify where an
event came from in order to apply processors accordingly.

Also, added test cases for the processor to verify the change and updated test cases with the new processor.

* Add filemod times to contents of diagnostics collect command (#570)

* Add filemod times to contents of diagnostics collect command

Add filemod times to the files and directories in the zip archive.
Log files (and sub dirs) will use the modtime returned by the fileinfo
for the source. Others will use the timestamp from when the zip is
created.

* Fix linter

* [Automation] Update elastic stack version to 8.4.0-b13123ee for testing (#581)

Co-authored-by: apmmachine <[email protected]>

* Fix Agent upgrade 8.2->8.3 (#578)

* Fix Agent upgrade 8.2->8.3
* Improve the upgrade encryption handling. Add .yml files cleanup.
* Rollback ActionUpgrade to action_id, add MarkerActionUpgrade adapter struct for marker serialization compatibility

* Update containerd (#577)

* [Automation] Update elastic stack version to 8.4.0-4fe26f2a for testing (#591)

Co-authored-by: apmmachine <[email protected]>

* Set explicit ExitTimeOut for MacOS agent launchd plist (#594)

* Set explicit ExitTimeOut for MacOS agent launchd plist

* [Automation] Update elastic stack version to 8.4.0-2e32a640 for testing (#599)

Co-authored-by: apmmachine <[email protected]>

* ci: enable build notifications as GitHub issues (#595)

* status identifies failing component, fleet gateway may report degraded, liveness endpoint added (#569)

* Add liveness endpoint

Add /liveness route to metrics server. This route will report the status
from pkg/core/status. fleet-gateway will now report a degraded state if
a checkin fails. This may not propagate to fleet-server as a failed
checkin means communications between the agent and the server are not
working. It may also lead to the server reporting degraded for up to 30s
(fleet-server polling time) when the agent is able to successfully
connect.

* linter fix

* add nolint directive

* Linter fix

* Review feedback, add doc strings

* Rename noop controller file to _test file

* [Automation] Update elastic stack version to 8.4.0-722a7d79 for testing (#607)

Co-authored-by: apmmachine <[email protected]>

* ci: enable flaky test detector (#605)

* [Automation] Update elastic stack version to 8.4.0-210dd487 for testing (#620)

Co-authored-by: apmmachine <[email protected]>

* mergify: remove backport automation for non active branches (#615)

* chore: use elastic-agent profile to run the E2E tests (#610)

* [Automation] Update elastic stack version to 8.4.0-a6aa9f3b for testing (#631)

Co-authored-by: apmmachine <[email protected]>

* add macros pointing to new agent's repo and fix old macro calls (#458)

* Add mount of /etc/machine-id for managed Agent in k8s (#530)

* Set hostPID=true for managed agent in k8s (#528)

* Set hostPID=true for managed agent in k8s

* Add comment on hostPID.

* [Automation] Update elastic stack version to 8.4.0-86cc80f3 for testing (#648)

Co-authored-by: apmmachine <[email protected]>

* Update elastic-agent-libs version: includes restriction on default VerificationMode to `full` (#521)

* update version

* mage fmt update

* update dependency

* update changelog

* redact sensitive information in diagnostics collect command (#566)

* Support Cloudbeat regex input type  (#638)

* support input type with regex

* Update supported.go

* Changing the regex to support backward compatible

* Disable flaky test download test (#641)

* [Automation] Update elastic stack version to 8.4.0-3d206b5d for testing (#656)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-3ad82aa8 for testing (#661)

Co-authored-by: apmmachine <[email protected]>

* jjbb: exclude allowed branches, tags and PRs (#658)

cosmetic change in the description and boolean based

* Update elastic-agent-project-board.yml (#649)

* ci: fix labels that clashes with the Orka workers (#659)

* [Automation] Update elastic stack version to 8.4.0-03bd6f3f for testing (#668)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-533f1e30 for testing (#675)

Co-authored-by: apmmachine <[email protected]>

* Osquerybeat: Fix osquerybeat is not running with logstash output (#674)

* [Automation] Update elastic stack version to 8.4.0-d0a4da44 for testing (#684)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-dd98ded4 for testing (#703)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-164d9a10 for testing (#705)

Co-authored-by: apmmachine <[email protected]>

* Add missing license headers (#711)

* [Automation] Update elastic stack version to 8.4.0-00048b66 for testing (#713)

Co-authored-by: apmmachine <[email protected]>

* Allow - in eql variable names (#710)

* fix to allow dashes in variable names in EQL expressions

Extend eql to allow the '-' char to appear in variable names, e.g.,
${data.some-var}, and add test cases to eql, the transpiler, and
the k8s provider to verify this works. Note that the bug was caused by
the EQL limitation; the other test cases were added when attempting to
find it.

* Regenerate grammar with antlr 4.7.1, add CHANGELOG

* Fix linter issue

* Fix typo

* Fix transpiler to allow : in dynamic variables. (#680)

Fix transpiler regex to allow ':' characters in dynamic variables so
that users can input "${dynamic.lookup|'fallback.here'}".

Co-authored-by: Aleksandr Maus <[email protected]>

* Fix for the filebeat spec file picking up packetbeat inputs (#700)

* Reproduce filebeat picking up packetbeat inputs

* Filebeat: filter inputs as first input transform.

Move input filtering to be the first input transformation that occurs in
the filebeat spec file. Fixes
https://github.com/elastic/elastic-agent/issues/427.

* Update changelog.

* [Automation] Update elastic stack version to 8.4.0-3cd57abb for testing (#724)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-a324b98b for testing (#727)

Co-authored-by: apmmachine <[email protected]>

* ci: run on MacOS12 (#696)

* [Automation] Update elastic stack version to 8.4.0-31315ca3 for testing (#732)

Co-authored-by: apmmachine <[email protected]>

* fix typo on package command (#734)

This commit fixes the typo in the package command on the README.md.

* Allow / to be used in variable names (#718)

* Allow the / character to be used in variable names.

Allow / to be used in variable names from dynamic providers and eql
expressions. Ensure that k8s providers can provide variables with
slashes in their names.

* run antlr4

* Fix tests
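
To illustrate what allowing `-` and `/` in variable names means in practice, here is a minimal sketch that extracts `${...}` references with a regular expression. The pattern and the `findVars` helper are assumptions made for this example; the agent itself uses an ANTLR grammar and its own transpiler regex, which are not reproduced here.

```go
package main

import "regexp"

// varRef is a hypothetical pattern for ${...} references. The character class
// admits letters, digits, '_', '.', '-' and '/'; it is not the agent's grammar.
var varRef = regexp.MustCompile(`\$\{([A-Za-z0-9_./-]+)\}`)

// findVars returns the variable names referenced in s, in order of appearance.
func findVars(s string) []string {
	var names []string
	for _, m := range varRef.FindAllStringSubmatch(s, -1) {
		names = append(names, m[1])
	}
	return names
}
```

With this pattern, a reference such as `${data.some-var}` or a name containing a slash is recognised, while a reference containing a space is not.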

* Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases (#701)

* Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases

* Migrates vault directory on linux and windows to the top directory of the
  agent, so it can be shared without needing the upgrade handler call,
  like for example with side-by-side install/upgrade from .rpm/.deb
* Extended vault to allow read-only open, useful when the vault at particular location needs to be only read not created.

* Correct the typo in the log messages

* Update lint flagged function comment with 'unused', was flagged with 'deadcode' on the previous run

* Address code review feedback

* Add missing import for linux utz

* Change vault path from Top() to Config(), this is a better location, next to fleet.enc based on the install/upgrade testing with .rpm/.deb installs

* Fix the missing state migration for .rpm/.deb upgrade. The post install script now performs the migration and creates the symlink after that.

* Fix typo in the postinstall script

* Update the vault migration code, add the agent configuration match check with the agent secret

* [Automation] Update elastic stack version to 8.4.0-31269fd2 for testing (#746)

Co-authored-by: apmmachine <[email protected]>

* wrap errors and fix some docs typo and convention (#743)

* automate the ironbank docker context generation (#679)

* Update README.md

Adding M1 variable to export to be able to build AMD images

* fix flaky (#730)

* Add filestream ID on standalone kubernetes manifest (#742)

This commit adds unique IDs for the filestream inputs used by the
Kubernetes integration in the Elastic-Agent standalone
Kubernetes configuration/manifest file.

* Alter github action to run on different OSs (#769)

Alter the linter action to run on different OSs instead of on linux with
the $GOOS env var.

* [Automation] Update elastic stack version to 8.4.0-d058e92f for testing (#771)

Co-authored-by: apmmachine <[email protected]>

* elastic-agent manifests: add comments; add cloudnative team as a codeowner for the k8s manifests (#708)

* managed elastic-agent: add comments; add cloudnative team as a codeowner for the k8s manifests

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add comments to the standalone elastic-agent, similar to the documentation we have https://www.elastic.co/guide/en/fleet/current/running-on-kubernetes-standalone.html

Signed-off-by: Tetiana Kravchenko <[email protected]>

* Apply suggestions from code review

Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>

* remove comment for FLEET_ENROLLMENT_TOKEN; use Needed everywhere instead of Required

Signed-off-by: Tetiana Kravchenko <[email protected]>

* rephrase regarding accessing kube-state-metrics when used third party tools, like kube-rbac-proxy

Signed-off-by: Tetiana Kravchenko <[email protected]>

* run make check

Signed-off-by: Tetiana Kravchenko <[email protected]>

* keep manifests in sync to pass ci check

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add info on where to find FLEET_URL and FLEET_ENROLLMENT_TOKEN

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add links to elastic-agent documentation

Signed-off-by: Tetiana Kravchenko <[email protected]>

* update comment on FLEET_ENROLLMENT_TOKEN

Signed-off-by: Tetiana Kravchenko <[email protected]>

Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>

* [Elastic-Agent] Added source uri reloading (#686)

* Update will cleanup unneeded artifacts. (#752)

* Update will cleanup unneeded artifacts.

The update process will clean up unneeded artifacts. When an update
starts, all artifacts that do not have the current version number in
their names will be removed. If artifact retrieval fails, downloaded
artifacts are removed. On a successful upgrade, all contents of the
downloads dir will be removed.

* Clean up linter warnings

* Wrap errors

* cleanup tests

* Fix passed version

* Use os.RemoveAll
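
The first cleanup rule described above (remove every artifact whose name lacks the current version) can be sketched as follows. The helper names, the directory layout and the plain substring match are assumptions for illustration; only the use of `os.RemoveAll` is taken from the commit list.

```go
package main

import (
	"os"
	"path/filepath"
	"strings"
)

// staleArtifacts returns the names that do not contain the current version.
func staleArtifacts(names []string, version string) []string {
	var stale []string
	for _, n := range names {
		if !strings.Contains(n, version) {
			stale = append(stale, n)
		}
	}
	return stale
}

// cleanDownloads removes stale artifacts from dir, a hypothetical downloads
// directory. It stops at the first removal error.
func cleanDownloads(dir, version string) error {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return err
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		names = append(names, e.Name())
	}
	for _, n := range staleArtifacts(names, version) {
		if err := os.RemoveAll(filepath.Join(dir, n)); err != nil {
			return err
		}
	}
	return nil
}
```

Keeping the selection logic in a pure function makes it easy to test without touching the filesystem.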

* ci: propagate e2e-testing errors (#695)

* [Release] add-backport-next (#784)

* Update main to 8.5.0 (#793)

* [Automation] Update go release version to 1.17.12 (#726)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-60171339 for testing (#799)

Co-authored-by: apmmachine <[email protected]>

* update dependency elastic/go-structform from v0.0.9 to v0.0.10 (#802)

Signed-off-by: Florian Lehner <[email protected]>

* Fix unpacking of artifact config (#776)

* [Automation] Update elastic stack version to 8.5.0-c54c3404 for testing (#826)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-7dbc10f8 for testing (#833)

Co-authored-by: apmmachine <[email protected]>

* Fix RPM/DEB clean install (#816)

* Fix RPM/DEB clean install

* Improve the post install script

* Do not try to copy the state files if the agent directory is the same,
  this causes the error.
* Check the existence of symlink instead of the file it is pointing to
  for the state file migration.

* Update check for symlink existence for the cases where the symlink points to non-existent file

* fix path for auto generated spec file (#859)

Signed-off-by: Florian Lehner <[email protected]>

* Reload downloader client on config change (#848)

* Bundle elastic-agent.app for MacOS, needed to be able to enable the  … (#714)

* Bundle elastic-agent.app for MacOS, needed to be able to enable the  Full Disk Access

* Calm down the linter

* Fix pathing for windows unit test

* crossbuild: add fix to set ulimit for debian images (#856)

Signed-off-by: Florian Lehner <[email protected]>

* [Heartbeat] Cleanup docker install / always add playwright deps (#764)

This is the agent counterpart to elastic/beats#32122

Refactors Dockerfile handling of synthetics deps to rely on playwright install-deps rather than us manually keeping up to date with those. This should fix issues with newer playwrights needing additional deps.

This also cleans up the Dockerfile a good amount, and fixes indentation. Finally, this removes the unused Dockerfile.elastic-agent.tmpl file since agent is now its own repo. It also cleans up some other metadata that no longer does anything.

No changelog is specified because no user facing changes are present.

* [Automation] Update elastic stack version to 8.5.0-41aadc32 for testing (#889)

Co-authored-by: apmmachine <[email protected]>

* Fix/panic with composable renderer (#823)

* Fix a panic with wg passed to the composable object

In the code that retrieves the variables from the configuration files we
need to pass an execution callback, which is called in a goroutine. This
callback can be executed multiple times until the composable renderer is
stopped. A problem in the code caused the callback to be called multiple
times, which drove the waitgroup's internal counter to a negative value.

This commit changes the behavior: it starts the composable renderer and
gives it a callback; when the callback receives the variables, it stops
the composable's Run method using the context.

This ensures that the callback is called a single time and that the
variables are correctly retrieved.

Fixes: #806
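
The pattern described in this fix (take the first set of variables, then stop the renderer through its context) might look roughly like the sketch below. The `vars`, `runFunc`, `fetchOnce` and `chattyRun` names are hypothetical, and a buffered channel stands in for whatever synchronisation the real code uses.

```go
package main

import (
	"context"
	"errors"
	"strconv"
)

// vars is a hypothetical stand-in for the variables reported by the renderer.
type vars map[string]string

// runFunc mimics a long-running Run method that reports variables via cb
// until its context is cancelled.
type runFunc func(ctx context.Context, cb func(vars)) error

// fetchOnce runs the renderer, keeps the first delivery, and cancels the
// context so that Run returns. Later deliveries are dropped.
func fetchOnce(run runFunc) (vars, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	first := make(chan vars, 1) // room for exactly one delivery
	err := run(ctx, func(v vars) {
		select {
		case first <- v:
			cancel() // stop Run as soon as the variables arrive
		default:
			// A value was already captured; extra callbacks are ignored.
		}
	})

	select {
	case v := <-first:
		return v, nil
	default:
	}
	if err == nil || errors.Is(err, context.Canceled) {
		return nil, errors.New("renderer stopped before reporting variables")
	}
	return nil, err
}

// chattyRun is a stand-in renderer that keeps reporting until cancelled.
func chattyRun(ctx context.Context, cb func(vars)) error {
	for i := 0; ; i++ {
		select {
		case <-ctx.Done():
			return ctx.Err()
		default:
			cb(vars{"seq": strconv.Itoa(i)})
		}
	}
}
```

Because nothing is counted, repeated callback invocations cannot push a counter below zero; they simply fall through the `default` branch.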

* [Automation] Update go release version to 1.18.5 (#832)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-60a4c029 for testing (#899)

Co-authored-by: apmmachine <[email protected]>

* Add control-plane toleration to Agent K8S manifests. (#864)

* Add toleration to elastic-agent Kubernetes manifests.

The toleration with key node-role.kubernetes.io/control-plane is set to replace
the deprecated toleration with key node-role.kubernetes.io/master which will be
removed by Kubernetes v1.25

* Remove outdated "master" node terminology.

* install mage with go install (#936)

* Cloudnative ci automation (#837)

This commit provides the relevant Jenkins CI automation to open pull requests to the kibana GitHub repository in order to keep the Cloud-Native team's manifests in sync with the manifests that are used in the Fleet UI.

For full information check #706

Updated .ci/Jenkins file that is triggered upon PR requests of /elastic-agent/deploy/kubernetes/* changes
Updated Makefile to add functionality needed to create the extra files for the new prs to kibana remote repository

* Reduce memory footprint by reordering struct elements (#804)

* Reduce memory footprint by reordering struct elements

* rename struct element for linter

Signed-off-by: Florian Lehner <[email protected]>

Signed-off-by: Florian Lehner <[email protected]>
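
Reordering struct fields saves memory because the compiler pads fields to their alignment. The two types below are invented for illustration and are not structs from the agent; they only show the general effect.

```go
package main

import "unsafe"

// padded interleaves small and large fields, so the compiler inserts padding
// to keep the int64 aligned.
type padded struct {
	enabled bool
	count   int64
	dirty   bool
}

// packed holds the same fields ordered from largest to smallest.
type packed struct {
	count   int64
	enabled bool
	dirty   bool
}

// structSizes reports the in-memory size of both layouts in bytes.
func structSizes() (paddedSize, packedSize uintptr) {
	return unsafe.Sizeof(padded{}), unsafe.Sizeof(packed{})
}
```

On common 64-bit platforms `structSizes` returns 24 and 16, so the reordered layout is a third smaller for the same data.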

* [Automation] Update elastic stack version to 8.5.0-6b9f92c0 for testing (#948)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-0616acda for testing (#963)

Co-authored-by: apmmachine <[email protected]>

* Clarify that this repo is not only docs (#969)

* Add Filebeat lumberjack input to spec (#959)

Make the lumberjack input available from Agent.

Relates: https://github.com/elastic/beats/pull/32175

* [Automation] Update elastic stack version to 8.5.0-dd6f2bb0 for testing (#978)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-feb644de for testing (#988)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-7783a03c for testing (#1004)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-17b8a62d for testing (#1014)

Co-authored-by: apmmachine <[email protected]>

* update ironbank image product name (#1009)

This is required to automate the creation of the ironbank merge requests as the ubireleaser is using this field to compute the elastic-agent artifact url.

For example it is now trying to retrieve https://artifacts.elastic.co/downloads/beats/elastic-agent-8.4.0-linux-x86_64.tar.gz instead of https://artifacts.elastic.co/downloads/beats/elastic-agent/elastic-agent-8.4.0-linux-x86_64.tar.gz

* ci: add extended support for windows (#683)

* [Automation] Update elastic stack version to 8.5.0-9aed3b11 for testing (#1030)

Co-authored-by: apmmachine <[email protected]>

* Cloudnative ci automation (#1035)

* Updating Jenkinsfile and Makefile to open PR

* Adding needed token-id

* [Automation] Update elastic stack version to 8.5.0-fedc3e60 for testing (#1054)

Co-authored-by: apmmachine <[email protected]>

* Testing PR creation for 706 (#1049)

* Fix lookup issues with inputs.d fragment yml (#840)

* Fix lookup issues with inputs.d fragment yml

The Elastic Agent was looking next to the binary for the `inputs.d`
folder; instead it should look in the `Home` folder where
the Elastic Agent symlink is located.

Fixes: #663

* Changelog

* Fix input.d path, tie to the agent Config() directory

* Update CHANGELOG to reflect that the agent configuration directory is used to locate the inputs.d directory

Co-authored-by: Aleksandr Maus <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-b5001a6d for testing (#1064)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-1bd77fc1 for testing (#1082)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-167dfc80 for testing (#1091)

Co-authored-by: apmmachine <[email protected]>

* Adding support for v1.25.0 k8s (#1044)

* Adding support for v1.25.0 k8s

* [Automation] Update elastic stack version to 8.5.0-6b7dda2d for testing (#1101)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-4140365c for testing (#1114)

Co-authored-by: apmmachine <[email protected]>

* Remove experimental warning log in upgrade command (#1106)

* Update go.mod to Go 1.18, update notice. (#1120)

* Remove the fleet reporter (#1130)

* Remove the fleet reporter

Remove the fleet-reporter so that checkins no longer deliver the event
list.

* add CHANGELOG fix tests

* [Automation] Update elastic stack version to 8.5.0-589a4a10 for testing (#1147)

Co-authored-by: apmmachine <[email protected]>

* Avoid reporting `Unhealthy` on fleet connectivity issues (#1152)

* ci: enable MacOS M1 stages (#1123)

* [Automation] Update go release version to 1.18.6 (#1143)

* [Automation] Update elastic stack version to 8.5.0-37418cf3 for testing (#1165)

Co-authored-by: apmmachine <[email protected]>

* Remove mage notice in favour of make notice (#1108)

The current implementation of mage notice is not working because it
was never finalised. The fact that both it and `make notice` exist only
generates confusion.

This commit removes the `mage notice` and documents that `make notice`
should be used instead for the time being.

In the long run we want to use the implementation on
`elastic-agent-libs`, however it is not working at the moment.

Closes #1107

Co-authored-by: Craig MacKenzie <[email protected]>

* ci: run e2e-testing at the end (#1169)

* ci: move macos to github actions (#1175)

* [Automation] Update elastic stack version to 8.5.0-fcf3d4c2 for testing (#1183)

Co-authored-by: apmmachine <[email protected]>

* Add support for hints' based autodiscovery in kubernetes provider (#698)

* ci: increase timeout (#1190)

* Fixing condition for PR creation (#1188)

* Fix leftover log level (#1194)

* [automation] Publish kubernetes templates for elastic-agent (#1192)

Co-authored-by: apmmachine <[email protected]>

* ci: force GO_VERSION (#1204)

* Fix whitespaces in vault_darwin.c (#1206)

* Update kubernetes templates for elastic-agent [templates.d] (#1231)

* Use at least warning level for all status logs (#1218)

* Update k8s manifests to leverage hints (#1202)

* Add Go 1.18 upgrade to breaking changes section. (#1216)

* Add Go 1.18 upgrade to breaking changes section.

* Fix the PR number in the changelog.

* [Release] add-backport-next (#1254)

* Bump version to 8.6.0. (#1259)

* [Automation] Update elastic stack version to 8.5.0-7dc445a0 for testing (#1248)

Co-authored-by: apmmachine <[email protected]>

* Fix: Endpoint collision between monitoring and regular beats  (#1034)

* internal/pkg/agent/cmd: don't format error message with nil errors (#1240)

The failure conditions allow nil errors to result in an error being formatted.
When formatting due to a non-accepted HTTP status code and a nil error, omit the
error.

Co-authored-by: Craig MacKenzie <[email protected]>
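
The nil-error rule from #1240 can be sketched as a small helper: when the status code is not accepted and there is no underlying error, the message is built without one. The `statusError` name and the message wording are assumptions for this example, not the command's actual output.

```go
package main

import (
	"fmt"
	"net/http"
)

// statusError builds the error for a response with a non-accepted status
// code. The underlying error is appended only when it is non-nil.
func statusError(status int, err error) error {
	if err != nil {
		return fmt.Errorf("unexpected status %d (%s): %w", status, http.StatusText(status), err)
	}
	return fmt.Errorf("unexpected status %d (%s)", status, http.StatusText(status))
}
```

Without the nil check, the formatted message would end with a stray nil placeholder that tells the reader nothing.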

* [Automation] Update elastic stack version to 8.6.0-21651da3 for testing (#1290)

Co-authored-by: apmmachine <[email protected]>

* Fixed: source uri reload for download/verify components (#1252)

* Expand status reporter/controller interfaces to allow local reporters (#1285)

* Expand status reporter/controller interfaces to allow local reporters

Add a local reporter map to the status controller. These reporters are
not used when updating status with fleet-server, they are only used to
gather local state information - specifically if the agent is degraded
because checkin with fleet-server has failed. This bypasses the bug that
was introduced with the liveness endpoint where the agent could checkin
(to fleet-server) with a degraded status because a previous checkin
failed. Local reporters are used to generate a separate status. This
status is used in the liveness endpoint.

* fix linter
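
The split between regular and local reporters described in #1285 can be pictured with the sketch below. All names are hypothetical, and it assumes that the local status aggregates both groups while the checkin status ignores local-only reporters; the real status controller may differ.

```go
package main

import "sync"

// state is a simplified health value; larger means worse.
type state int

const (
	healthy state = iota
	degraded
	failed
)

// controller keeps two reporter maps, as described in the commit message.
type controller struct {
	mu     sync.Mutex
	remote map[string]state // included in checkins with fleet-server
	local  map[string]state // only used for local state (e.g. liveness)
}

func newController() *controller {
	return &controller{remote: map[string]state{}, local: map[string]state{}}
}

// update records the state of a named reporter in the matching map.
func (c *controller) update(name string, s state, localOnly bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if localOnly {
		c.local[name] = s
		return
	}
	c.remote[name] = s
}

// worst returns the most severe state found in the given maps.
func worst(groups ...map[string]state) state {
	w := healthy
	for _, g := range groups {
		for _, s := range g {
			if s > w {
				w = s
			}
		}
	}
	return w
}

// checkinStatus is what would be sent to fleet-server.
func (c *controller) checkinStatus() state {
	c.mu.Lock()
	defer c.mu.Unlock()
	return worst(c.remote)
}

// localStatus is what a liveness endpoint would report (assumed aggregation).
func (c *controller) localStatus() state {
	c.mu.Lock()
	defer c.mu.Unlock()
	return worst(c.remote, c.local)
}
```

A failed checkin recorded as local-only then degrades the liveness result without being echoed back to fleet-server on the next successful checkin.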

* Improve logging for agent upgrades. (#1287)

* [Automation] Update elastic stack version to 8.6.0-326f84b0 for testing (#1318)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-df00693f for testing (#1334)

Co-authored-by: apmmachine <[email protected]>

* Add success log message after previous checkin failures (#1327)

* Fix status reporter initialization (#1341)

* [Automation] Update elastic stack version to 8.6.0-a2f4f140 for testing (#1362)

Co-authored-by: apmmachine <[email protected]>

* Added status message to CheckinRequest (#1369)

* Added status message to CheckinRequest

* added changelog

* updated test

* added omitempty

* Fix failures when using npipe monitoring endpoints (#1371)

* [Automation] Update elastic stack version to 8.6.0-158a13db for testing (#1379)

Co-authored-by: apmmachine <[email protected]>

* Mount /etc directory in Kubernetes DaemonSet manifests. (#1382)

Changes made to files like `/etc/passwd` using Linux tools like
`useradd` are not reflected in the mounted file on the Agent,
because the tool replaces the file instead of changing it
in-place.

Mounting the parent directory solves this problem.

* [Automation] Update elastic stack version to 8.6.0-aea1c645 for testing (#1405)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-0fca2953 for testing (#1412)

Co-authored-by: apmmachine <[email protected]>

* ci: 7.17 is not available for the daily run (#1417)

* [Automation] Update elastic stack version to 8.6.0-e4c15f15 for testing (#1425)

Co-authored-by: apmmachine <[email protected]>

* [backport main] Fix: Agent failed to upgrade from 8.4.2 to 8.5.0 BC1 for MAC 12 agent using agent binary. (#1401)

* Fix docker provider add_fields processors (#1420)

The Docker provider was using a wrong key when defining the
`add_fields` processor, which caused Filebeat not to start the input
and to stay in an unhealthy state.

This commit fixes it.

Fixes https://github.com/elastic/beats/issues/29030

* [Automation] Update elastic stack version to 8.6.0-d939cfde for testing (#1436)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-7c9f25a9 for testing (#1446)

Co-authored-by: apmmachine <[email protected]>

* Enable integration only when datastreams are not defined (#1456)

* Add not dedoted k8s pod labels in autodiscover provider to be used for templating, exactly like annotations (#1398)

* [Automation] Update elastic stack version to 8.6.0-c49fac70 for testing (#1464)

Co-authored-by: apmmachine <[email protected]>

* Add storageclass permissions in agent clusterrole (#1470)

* Add storageclass permissions in agent clusterrole

* Remote QA-labels automation (#1455)

* [Automation] Update go release version to 1.18.7 (#1444)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-5a8d757d for testing (#1480)

Co-authored-by: apmmachine <[email protected]>

* Improve logging around agent checkins. (#1477)

Improve logging around agent checkins.

- Log transient checkin errors at Info.
- Upgrade to an Error log after 2 repeated failures.
- Log the wait time for the next retry.
- Only update local state after repeated failures.
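
A minimal sketch of the escalation rule listed above, assuming that the first two consecutive failures are logged at Info and later ones at Error, and that local state is only updated once the Error level is reached. The `checkinTracker` and `decision` types are invented for this example, and logging of the retry wait time is left out.

```go
package main

// checkinTracker counts consecutive failed checkins.
type checkinTracker struct {
	threshold int // failures tolerated before escalating (assumed to be 2)
	failures  int
}

// decision describes how one failed checkin should be handled.
type decision struct {
	level       string // "info" or "error"
	updateState bool   // whether local state should be marked as failing
}

// onFailure records a failed checkin and returns how to report it.
func (t *checkinTracker) onFailure() decision {
	t.failures++
	if t.failures > t.threshold {
		return decision{level: "error", updateState: true}
	}
	return decision{level: "info"}
}

// onSuccess resets the counter after a successful checkin.
func (t *checkinTracker) onSuccess() {
	t.failures = 0
}
```

Resetting the counter on success keeps a single transient error from ever reaching the Error level.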

* [Automation] Update elastic stack version to 8.6.0-40086bc7 for testing (#1496)

Co-authored-by: apmmachine <[email protected]>

* Fixing makefile check (#1490)

* Fixing makefile check

* action: validate changelog fragment (#1488)

* Align managed with standalone role (#1500)

* Fix k8s template link versioning (#1504)

* Allighningmanifests (#1507)

* Align managed with standalone role

* Fixing missing Label

* [Automation] Update elastic stack version to 8.6.0-233dc5d4 for testing (#1515)

Co-authored-by: apmmachine <[email protected]>

* Convert CHANGELOG.next to fragments (#1244)

* [Automation] Update elastic stack version to 8.6.0-54a302f0 for testing (#1531)

Co-authored-by: apmmachine <[email protected]>

* Update the linter configuration. (#1478)

Sync the configuration with the one used in Beats, which has disabled
the majority of the least useful linters already.

* Elastic agent counterpart of https://github.com/elastic/beats/pull/33362 (#1528)

Always use the stack_release label for npm i

No changelog necessary since there are no user-visible changes

This lets us ensure we've carefully reviewed and labeled the version of the @elastic/synthetics NPM library that's bundled in docker images

* [Automation] Update elastic stack version to 8.6.0-cae815eb for testing (#1545)

Co-authored-by: apmmachine <[email protected]>

* Fix admin permission check on localized windows (#1552)

* Fixes from merge of main.

* Update heartbeat specification to only support elasticsearch.

* Fix bad merge in dockerfile.

Signed-off-by: Florian Lehner <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Pier-Hugues Pellerin <[email protected]>
Co-authored-by: Denis Rechkunov <[email protected]>
Co-authored-by: Michel Laterman <[email protected]>
Co-authored-by: Aleksandr Maus <[email protected]>
Co-authored-by: Victor Martinez <[email protected]>
Co-authored-by: Manuel de la Peña <[email protected]>
Co-authored-by: Anderson Queiroz <[email protected]>
Co-authored-by: Daniel Araujo Almeida <[email protected]>
Co-authored-by: Mariana Dima <[email protected]>
Co-authored-by: ofiriro3 <[email protected]>
Co-authored-by: Julien Lind <[email protected]>
Co-authored-by: Craig MacKenzie <[email protected]>
Co-authored-by: Tiago Queiroz <[email protected]>
Co-authored-by: Pierre HILBERT <[email protected]>
Co-authored-by: Tetiana Kravchenko <[email protected]>
Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>
Co-authored-by: Michal Pristas <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
Co-authored-by: Florian Lehner <[email protected]>
Co-authored-by: Andrew Cholakian <[email protected]>
Co-authored-by: Yash Tewari <[email protected]>
Co-authored-by: Quentin Pradet <[email protected]>
Co-authored-by: Andrew Kroh <[email protected]>
Co-authored-by: Julien Mailleret <[email protected]>
Co-authored-by: Josh Dover <[email protected]>
Co-authored-by: Chris Mark <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Dan Kortschak <[email protected]>
Co-authored-by: Julia Bardi <[email protected]>
Co-authored-by: Edoardo Tenani <[email protected]>

* Add input name alias for cloudbeat integrations (#1596)

* add name alias for cloudbeat

* add anchors for yaml fields

* add EKS input

* Change the stater to include a local flag. (#1308)

* Change the stater to include a local flag.

Change the state reporter to use a local flag that determines if local
errors are included in the resulting state. Assume that configMgr errors
are all local - this mainly affects the fleet_gateway. Allow the gateway
to report an error if a checkin fails. When a checkin fails the local
state reported through the status command and liveness endpoint will
include the error, but checkins to fleet-server will not.

* Add ActionsError() method to config manager

Add a new ActionsError() method to the config managers. For the
non-managed instances it will return a nil channel. For the managed
instances it will return the dispatcher error queue directly. Have the
coordinator gather from this channel as it does for the others and
treat any errors as non-local.

* Fix linter

* Service runtime V2 (#1529)

* Service V2 runtime

* Implements service runtime component for V2.
* Extends endpoint spec with some additional attributes for service start/stop/status checks and creds discovery. The creds discovery logic is taken from V1, cleaned up and extracted into its own file, added utz.
* Implements service uninstall
* Refactors pkg/core/process/process.go and adds additional options that are needed for the service_command implementation.
* Changes ComponentsModifier to access raw config, needed for the EndpointComponentModifier
* Injects host.id into configuration, needed for Endpoint
* Injects fleet and policy.revision configuration into the Endpoint input configuration
* Bumps the version to 8.6.0 to make it consistent with current beats V2 branch
* Addresses linter complaints on affected files

* Remove the service watcher, all the start/stopping logic

* Add changelog

* Fix typo

* Send STOPPING only upon teardown

* Wait for check-in with timeout before sending stopping on teardown

* Fix the service loop routine blocking on channel after stopped

* Addressed code review comments

* Make linter happy

* Try to fix make check-ci

* Spellcheck runtime README.md

* Remove .Stop timeout from the spec as it is no longer used

* Addressed code review feedback

* Sync components with state during container start (#1653)

* Sync components with state during container start

* path approach

* Subprocess reader start.

* Implement io.Writer to handle reading stdout/stderr for spawned components.

* Don't inject logging args to beats components. Always have beats log to stderr.

* Update to v0.2.15 of elastic-agent-libs.

* [V2] Enable support for shippers (#1527)

* Work on adding shipper support.

* Fix fmt.

* Fix reference to spec. Allow shipper to be null but still enabled if key exists.

* Move supported shippers into its own key in the input specification.

* Fix issue in merge.

* Implement fake shipper and add fake shipper output to the fake component.

* Add protoc to the test target.

* Don't generate fake shipper protocol in test.

* Commit fake GRPC into code.

* Add unit test for running with shipper, with sending event between running component and running shipper.

* Add docstring for shipper test.

* Add changelog fragment.

* Adjust paths for shipper to work on windows and better on unix.

* Update changelog/fragments/1667571017-Add-support-for-running-the-elastic-agent-shipper.yaml

Co-authored-by: Craig MacKenzie <[email protected]>

* Fix fake/component to connect over npipe on windows.

Co-authored-by: Craig MacKenzie <[email protected]>

* More work on the logging.

* More fixes.

* Change back to streams.

* Fix go.mod.

* Fix import.

* Fix issues with merge of main.

* remove log helper.

* Add NewWithoutConfig.

* Fix the spawned filestream to ingest logs into elasticsearch for monitoring.

* Add changelog entry.

* Remove debug print.

* Update 1669236059-Capture-stdout-stderr-of-all-spawned-components-to-simplify-logging.yaml

Signed-off-by: Florian Lehner <[email protected]>
Co-authored-by: Michal Pristas <[email protected]>
Co-authored-by: Aleksandr Maus <[email protected]>
Co-authored-by: Michel Laterman <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Pier-Hugues Pellerin <[email protected]>
Co-authored-by: Denis Rechkunov <[email protected]>
Co-authored-by: Victor Martinez <[email protected]>
Co-authored-by: Manuel de la Peña <[email protected]>
Co-authored-by: Anderson Queiroz <[email protected]>
Co-authored-by: Daniel Araujo Almeida <[email protected]>
Co-authored-by: Mariana Dima <[email protected]>
Co-authored-by: ofiriro3 <[email protected]>
Co-authored-by: Julien Lind <[email protected]>
Co-authored-by: Craig MacKenzie <[email protected]>
Co-authored-by: Tiago Queiroz <[email protected]>
Co-authored-by: Pierre HILBERT <[email protected]>
Co-authored-by: Tetiana Kravchenko <[email protected]>
Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Anderson Queiroz <[email protected]>
Co-authored-by: Florian Lehner <[email protected]>
Co-authored-by: Andrew Cholakian <[email protected]>
Co-authored-by: Yash Tewari <[email protected]>
Co-authored-by: Quentin Pradet <[email protected]>
Co-authored-by: Andrew Kroh <[email protected]>
Co-authored-by: Julien Mailleret <[email protected]>
Co-authored-by: Josh Dover <[email protected]>
Co-authored-by: Chris Mark <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Dan Kortschak <[email protected]>
Co-authored-by: Julia Bardi <[email protected]>
Co-authored-by: Edoardo Tenani <[email protected]>
Co-authored-by: Alex K <[email protected]>
(cherry picked from commit 7a748fa0fdf4ab786583c7a38169d099b58a7c02)
blakerouse added a commit that referenced this pull request Nov 29, 2022
* [v2] Add v2 component specification and validation. (#502)

* Add v2 component specification and validation.

* Remove i386 and ppc64el. Update spec for osquerybeat.

* Remove windows/arm64.

* Add component spec command to validate component specifications. (#510)

* [v2] Calculate the expected runtime components from policy (#550)

* Upgrade elastic-agent-client.

* Calculate the expected running components and units from the v2 specification and the current policy.

* Update NOTICE.txt.

* Fix lint from servicable main.go.

* Update GRPC for the agent CLI control protocol. Fix name collision issue.

* Run go mod tidy.

* Fix more lint issues.

* Fix fmt.

* Update logic to always compute model, with err set on each component. Check runtime preventions at model generation time.

* Fix items from code review, and issue on windows test runner.

* Try to cleanup duplication in tests.

* Try 2 of fixing duplicate lint failure, that is not really a duplicate.

* Re-run mage fmt.

* Lint fixes for linux, why different?

* Fix nolint comment.

* Add comment.

* Initial Flat Structure (#544)

Flattening the structure and removing download/install steps for programs.

Co-authored-by: Aleksandr Maus <[email protected]>

* Generate checksum file for components (#604)

* generating checksum?

* yaml output

* Update dev-tools/mage/common.go

Co-authored-by: Michel Laterman <[email protected]>

* review

* ioutil removal from magefile

Co-authored-by: Michel Laterman <[email protected]>

* V2 Runtime Component Manager (#645)

* Add runtime for command v2 components.

* Fix imports.

* Add tests for watching checkins.

* Fix lint and move checkin period to a configurable timeout.

* Fix tests now that checkin timeout needs to be defined.

* Fix code review and lint.

* [v2] Use the v2 components runtime as the core of the Elastic Agent (#753)

* Add runtime for command v2 components.

* Fix imports.

* Add tests for watching checkins.

* Fix lint and move checkin period to a configurable timeout.

* Fix tests now that checkin timeout needs to be defined.

* Fix code review and lint.

* Work on actually running the v2 runtime.

* Work on switching to the v2 runtime.

* More work on switching to v2 runtime.

* Cleanup some imports.

* More import cleanups.

* Add TODO to FleetServerComponentModifier.

* Remove outdated managed_mode_test.go.

* Fixes from code review and lint.

* [v2] Delete unused code from refactor (#777)

* Add runtime for command v2 components.

* Fix imports.

* Add tests for watching checkins.

* Fix lint and move checkin period to a configurable timeout.

* Fix tests now that checkin timeout needs to be defined.

* Fix code review and lint.

* Work on actually running the v2 runtime.

* Work on switching to the v2 runtime.

* More work on switching to v2 runtime.

* Cleanup some imports.

* More import cleanups.

* Add TODO to FleetServerComponentModifier.

* More cleanup and removals.

* Remove more.

* Delete more unused code.

* Clean up step_download from refactor.

* Remove outdated managed_mode_test.go.

* Fixes from code review and lint.

* Fix lint and missing errcheck.

* [v2] Delete more unused code from v2 transition (#790)

* Remove more unused code that was including already deleted code.

* Fix all unit tests.

* Fix lint.

* More lint fixes, maybe this time?

* More lint.... really?

* Update NOTICE.txt.

* [v2] Merge July 27th main into v2 feature branch (#789)

* [Automation] Update elastic stack version to 8.4.0-40cff009 for testing (#557)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-5e6770b1 for testing (#564)

Co-authored-by: apmmachine <[email protected]>

* Fix regression and use comma separated values (#560)

Fix regression from https://github.com/elastic/elastic-agent/pull/509

* Change in Jenkinsfile will trigger k8s run (#568)

* [Automation] Update elastic stack version to 8.4.0-da5a1c6d for testing (#573)

Co-authored-by: apmmachine <[email protected]>

* Add `@metadata.input_id` and `@metadata.stream_id` when injecting streams (#527)

These 2 values are going to be used in the shipper to identify where an
event came from in order to apply processors accordingly.

Also, added test cases for the processor to verify the change and updated test cases with the new processor.

* Add filemod times to contents of diagnostics collect command (#570)

* Add filemod times to contents of diagnostics collect command

Add filemod times to the files and directories in the zip archive.
Log files (and sub dirs) will use the modtime returned by the fileinfo
for the source. Others will use the timestamp from when the zip is
created.

* Fix linter

* [Automation] Update elastic stack version to 8.4.0-b13123ee for testing (#581)

Co-authored-by: apmmachine <[email protected]>

* Fix Agent upgrade 8.2->8.3 (#578)

* Fix Agent upgrade 8.2->8.3
* Improve the upgrade encryption handling. Add .yml files cleanup.
* Rollback ActionUpgrade to action_id, add MarkerActionUpgrade adapter struct for marker serialization compatibility

* Update containerd (#577)

* [Automation] Update elastic stack version to 8.4.0-4fe26f2a for testing (#591)

Co-authored-by: apmmachine <[email protected]>

* Set explicit ExitTimeOut for MacOS agent launchd plist (#594)

* Set explicit ExitTimeOut for MacOS agent launchd plist

* [Automation] Update elastic stack version to 8.4.0-2e32a640 for testing (#599)

Co-authored-by: apmmachine <[email protected]>

* ci: enable build notifications as GitHub issues (#595)

* status identifies failing component, fleet gateway may report degraded, liveness endpoint added (#569)

* Add liveness endpoint

Add /liveness route to metrics server. This route will report the status
from pkg/core/status. fleet-gateway will now report a degraded state if
a checkin fails. This may not propagate to fleet-server as a failed
checkin means communications between the agent and the server are not
working. It may also lead to the server reporting degraded for up to 30s
(fleet-server polling time) when the agent is able to successfully
connect.

* linter fix

* add nolint directive

* Linter fix

* Review feedback, add doc strings

* Rename noop controller file to _test file

* [Automation] Update elastic stack version to 8.4.0-722a7d79 for testing (#607)

Co-authored-by: apmmachine <[email protected]>

* ci: enable flaky test detector (#605)

* [Automation] Update elastic stack version to 8.4.0-210dd487 for testing (#620)

Co-authored-by: apmmachine <[email protected]>

* mergify: remove backport automation for non active branches (#615)

* chore: use elastic-agent profile to run the E2E tests (#610)

* [Automation] Update elastic stack version to 8.4.0-a6aa9f3b for testing (#631)

Co-authored-by: apmmachine <[email protected]>

* add macros pointing to new agent's repo and fix old macro calls (#458)

* Add mount of /etc/machine-id for managed Agent in k8s (#530)

* Set hostPID=true for managed agent in k8s (#528)

* Set hostPID=true for managed agent in k8s

* Add comment on hostPID.

* [Automation] Update elastic stack version to 8.4.0-86cc80f3 for testing (#648)

Co-authored-by: apmmachine <[email protected]>

* Update elastic-agent-libs version: includes restriction on default VerificationMode to `full` (#521)

* update version

* mage fmt update

* update dependency

* update changelog

* redact sensitive information in diagnostics collect command (#566)

* Support Cloudbeat regex input type  (#638)

* support input type with regex

* Update supported.go

* Change the regex to remain backward compatible

* Disable flaky test download test (#641)

* [Automation] Update elastic stack version to 8.4.0-3d206b5d for testing (#656)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-3ad82aa8 for testing (#661)

Co-authored-by: apmmachine <[email protected]>

* jjbb: exclude allowed branches, tags and PRs (#658)

cosmetic change in the description and boolean based

* Update elastic-agent-project-board.yml (#649)

* ci: fix labels that clash with the Orka workers (#659)

* [Automation] Update elastic stack version to 8.4.0-03bd6f3f for testing (#668)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-533f1e30 for testing (#675)

Co-authored-by: apmmachine <[email protected]>

* Osquerybeat: Fix osquerybeat is not running with logstash output (#674)

* [Automation] Update elastic stack version to 8.4.0-d0a4da44 for testing (#684)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-dd98ded4 for testing (#703)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-164d9a10 for testing (#705)

Co-authored-by: apmmachine <[email protected]>

* Add missing license headers (#711)

* [Automation] Update elastic stack version to 8.4.0-00048b66 for testing (#713)

Co-authored-by: apmmachine <[email protected]>

* Allow - in eql variable names (#710)

* fix to allow dashes in variable names in EQL expressions

extend eql to allow the '-' char to appear in variable names, i.e.,
${data.some-var} and additional test cases to eql, the transpiler, and
the k8s provider to verify this works. Note that the bug was caused by
the EQL limitation, the other test cases were added when attempting to
find it.
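
The real change lives in an ANTLR grammar, so the following is only a regex-based stand-in that shows the effect: a `${...}` reference whose name may contain dashes. The pattern and the `variableNames` helper are assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// varRef matches ${name} where name starts with a letter or underscore and
// may continue with letters, digits, underscores, dots, and dashes.
var varRef = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_.-]*)\}`)

// variableNames returns the names of all variable references found in s.
func variableNames(s string) []string {
	var names []string
	for _, m := range varRef.FindAllStringSubmatch(s, -1) {
		names = append(names, m[1])
	}
	return names
}

func main() {
	fmt.Println(variableNames("host: ${data.some-var}, ns: ${kubernetes.namespace}"))
}
```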

* Regenerate grammar with antlr 4.7.1, add CHANGELOG

* Fix linter issue

* Fix typo

* Fix transpiler to allow : in dynamic variables. (#680)

Fix transpiler regex to allow ':' characters in dynamic variables so
that users can input "${dynamic.lookup|'fallback.here'}".

Co-authored-by: Aleksandr Maus <[email protected]>

* Fix for the filebeat spec file picking up packetbeat inputs (#700)

* Reproduce filebeat picking up packetbeat inputs

* Filebeat: filter inputs as first input transform.

Move input filtering to be the first input transformation that occurs in
the filebeat spec file. Fixes
https://github.com/elastic/elastic-agent/issues/427.

* Update changelog.

* [Automation] Update elastic stack version to 8.4.0-3cd57abb for testing (#724)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-a324b98b for testing (#727)

Co-authored-by: apmmachine <[email protected]>

* ci: run on MacOS12 (#696)

* [Automation] Update elastic stack version to 8.4.0-31315ca3 for testing (#732)

Co-authored-by: apmmachine <[email protected]>

* fix typo on package command (#734)

This commit fixes the typo in the package command on the README.md.

* Allow / to be used in variable names (#718)

* Allow the / character to be used in variable names.

Allow / to be used in variable names from dynamic providers and eql
expressions. Ensure that k8s providers can provide variables with
slashes in their names.

* run antlr4

* Fix tests

* Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases (#701)

* Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases

* Migrates vault directory on linux and windows to the top directory of the
  agent, so it can be shared without needing the upgrade handler call,
  like for example with side-by-side install/upgrade from .rpm/.deb
* Extended vault to allow read-only open, useful when the vault at particular location needs to be only read not created.

* Correct the typo in the log messages

* Update lint flagged function comment with 'unused', was flagged with 'deadcode' on the previous run

* Address code review feedback

* Add missing import for linux utz

* Change vault path from Top() to Config(), this is a better location, next to fleet.enc based on the install/upgrade testing with .rpm/.deb installs

* Fix the missing state migration for .rpm/.deb upgrade. The post install script now performs the migration and creates the symlink after that.

* Fix typo in the postinstall script

* Update the vault migration code, add the agent configuration match check with the agent secret

* [Automation] Update elastic stack version to 8.4.0-31269fd2 for testing (#746)

Co-authored-by: apmmachine <[email protected]>

* wrap errors and fix some docs typo and convention (#743)

* automate the ironbank docker context generation (#679)

* Update README.md

Adding M1 variable to export to be able to build AMD images

* fix flaky (#730)

* Add filestream ID on standalone kubernetes manifest (#742)

This commit adds unique IDs for the filestream inputs used by the
Kubernetes integration in the Elastic-Agent standalone
Kubernetes configuration/manifest file.

* Alter github action to run on different OSs (#769)

Alter the linter action to run on different OSs instead of on linux with
the $GOOS env var.

* [Automation] Update elastic stack version to 8.4.0-d058e92f for testing (#771)

Co-authored-by: apmmachine <[email protected]>

* elastic-agent manifests: add comments; add cloudnative team as a codeowner for the k8s manifests (#708)

* managed elastic-agent: add comments; add cloudnative team as a codeowner for the k8s manifests

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add comments to the standalone elastic-agent, similar to the documentation we have https://www.elastic.co/guide/en/fleet/current/running-on-kubernetes-standalone.html

Signed-off-by: Tetiana Kravchenko <[email protected]>

* Apply suggestions from code review

Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>

* remove comment for FLEET_ENROLLMENT_TOKEN; use Needed everywhere instead of Required

Signed-off-by: Tetiana Kravchenko <[email protected]>

* rephrase regarding accessing kube-state-metrics when using third-party tools, like kube-rbac-proxy

Signed-off-by: Tetiana Kravchenko <[email protected]>

* run make check

Signed-off-by: Tetiana Kravchenko <[email protected]>

* keep manifests in sync to pass ci check

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add info on where to find FLEET_URL and FLEET_ENROLLMENT_TOKEN

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add links to elastic-agent documentation

Signed-off-by: Tetiana Kravchenko <[email protected]>

* update comment on FLEET_ENROLLMENT_TOKEN

Signed-off-by: Tetiana Kravchenko <[email protected]>

Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>

* [Elastic-Agent] Added source uri reloading (#686)

* Update will cleanup unneeded artifacts. (#752)

* Update will cleanup unneeded artifacts.

The update process will cleanup unneeded artifacts. When an update
starts all artifacts that do not have the current version number in its
name will be removed. If artifact retrieval fails, downloaded artifacts
are removed. On a successful upgrade, all contents of the downloads dir
will be removed.

* Clean up linter warnings

* Wrap errors

* cleanup tests

* Fix passed version

* Use os.RemoveAll

* ci: propagate e2e-testing errors (#695)

* [Release] add-backport-next (#784)

* Update go.sum.

* Fix upgrade.

* Fix the upgrade artifact reload.

* Fix lint in coordinator.

Co-authored-by: apmmachine <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Pier-Hugues Pellerin <[email protected]>
Co-authored-by: Denis Rechkunov <[email protected]>
Co-authored-by: Michel Laterman <[email protected]>
Co-authored-by: Aleksandr Maus <[email protected]>
Co-authored-by: Victor Martinez <[email protected]>
Co-authored-by: Manuel de la Peña <[email protected]>
Co-authored-by: Anderson Queiroz <[email protected]>
Co-authored-by: Daniel Araujo Almeida <[email protected]>
Co-authored-by: Mariana Dima <[email protected]>
Co-authored-by: ofiriro3 <[email protected]>
Co-authored-by: Julien Lind <[email protected]>
Co-authored-by: Craig MacKenzie <[email protected]>
Co-authored-by: Tiago Queiroz <[email protected]>
Co-authored-by: Pierre HILBERT <[email protected]>
Co-authored-by: Tetiana Kravchenko <[email protected]>
Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>
Co-authored-by: Michal Pristas <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>

* [v2] Fix inspect command (#805)

* Write the inspect command for v2.

* Fix lint.

* Fix code review. Load inputs from inputs.d for inspect.

* Fix lint.

* Refactor to use errgroup.

* Remove unused struct.

* Expand check-in payload for V2 (#916)

* Expand check-in payload for V2

* Make linter happy

* [v2] Update protocol to use new UnitExpectedConfig. (#850)

* Update v2 protocol to use new UnitExpectedConfig.

* Cleanup.

* Update NOTICE.txt. Lint dupl.

* Fix code review. Ensure type is set to real type and not alias.

* Fix action dispatching that was using ActionType instead of InputType as before (#973)

* Fix bootstrapping a Fleet Server with v2. (#1010)

* Fix bootstrapping a Fleet Server with v2.

* Fix lint.

* Fix tests.

* Query just related files on build (#1045)

* Update main to 8.5.0 (#793) (#1050)

(cherry picked from commit 317e03116aa919d69be97242207ad11a28c826aa)

Co-authored-by: Pier-Hugues Pellerin <[email protected]>

* Create archive directory if it doesn't exist. (#1058)

On an M1 Mac rename seems to fail if the containing directories do not
already exist.

* fixed docker build (#1105)

* V2 command work dir (#1061)

* Fix v2 work directory for command. Add permission check for execution. Add determining root into runtime prevention.

* Add writeable by group and other in check.

* Fix restart and stopping issues in command runtime for failing binaries.

* Fix issue in endpoint spec. Allow an input to not require an ID, but that ID must be unique.

* Remove unused transpiler rules and steps.

* Fix test.

* Fix workDir for windows.

* Reset to checkin period.

* Fix test and code review issues.

* Add extra log message in unit test.

* More fixes from code review.

* Fix test.

* [v2] Move queue management to dispatcher (#1109)

* Move queue management to dispatcher

Move queue management actions to the dispatcher from the fleet-server
in order to help with future work to add a retry mechanism. Add a
PersistedQueue type which wrap the ActionQueue to make persisting the
queue simpler for the consumer.

* Refactor ActionQueue

Refactor ActionQueue to only export methods that are used by consumers.
The priority queue implementation has been changed to an unexported
type. Persistency has been added and the persistedqueue type has been
removed.

* Rename persistedQueue interface to priorityQueue

* Review feedback

* failing to save queue will log message

* Change gateway to use copy

* Fix [V2]: Elastic Agent Install is broken. (#1331)

* Fix agent shutdown on SIGINT (#1258)

* Fix agent shutdown on SIGINT

* Update runtime_comm expected check-in handling to eliminate the lock in failure cases

* Remove some buffered channels that are no longer blocking shutdown after the runtime comms fix commit

* Fix the recursive lock on itself in the runtime loop, refactored code to make it cleaner

* Fix the comment typo

* Fixed managed_mode coordination with fleet gateway. Now the gateway errors reading loop waits until gateway exits. Otherwise the gateway shutdown out of sequence blocks on errCh

* Fix linter

* Fix make check-ci

* Fix runner Err() possible race

* Update the runner DoneWithTimeout implementation

* Address code review comments

* [v2] Re-enable diagnostics for Elastic Agent and all components (#1140)

* Add diagnostics back to v2.

* Update pkg/component/runtime/manager.go

Co-authored-by: Anderson Queiroz <[email protected]>

Co-authored-by: Anderson Queiroz <[email protected]>

* Check and create downloads dir before using (#1410)

* [v2] Add upgrade action retry (#1219)

* Add upgrade action retry

Add the ability for the agent to schedule and retry upgrade actions.

The fleetapi actions now define a ScheduledAction and a RetryableAction interface to eliminate the need for stub methods on all different action types. Action queue has been changed to function on scheduled actions. Serialization tests now ensure that the retry attribute needed by retryable actions works.

Decouple dispatcher from gateway: the dispatcher has an errors channel that will return an error for the list of actions that is sent. Gateway has an Actions method that can be used to get the list of actions from the gateway. The managed_mode config manager will link these two components.

If a handler returns an error and the action is a RetryableAction, the dispatcher will attempt to schedule a retry. The dispatcher will also ack the action to fleet-server and indicate if it will be retried or has failed (or has been received normally).
For the acker, if a RetryableAction has an error and an attempt count that is greater than 0 it will be acked as retried. If it has an error and an attempt count less than 1 it will be acked as failed.

Co-authored-by: Blake Rouse <[email protected]>

* V1 metrics monitoring for V2 (#1487)


* [v2] Merge main on Oct. 18 (#1557)

* [Automation] Update elastic stack version to 8.4.0-40cff009 for testing (#557)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-5e6770b1 for testing (#564)

Co-authored-by: apmmachine <[email protected]>

* Fix regression and use comma separated values (#560)

Fix regression from https://github.com/elastic/elastic-agent/pull/509

* Change in Jenkinsfile will trigger k8s run (#568)

* [Automation] Update elastic stack version to 8.4.0-da5a1c6d for testing (#573)

Co-authored-by: apmmachine <[email protected]>

* Add `@metadata.input_id` and `@metadata.stream_id` when injecting streams (#527)

These 2 values are going to be used in the shipper to identify where an
event came from in order to apply processors accordingly.

Also, added test cases for the processor to verify the change and updated test cases with the new processor.

* Add filemod times to contents of diagnostics collect command (#570)

* Add filemod times to contents of diagnostics collect command

Add filemod times to the files and directories in the zip archive.
Log files (and sub dirs) will use the modtime returned by the fileinfo
for the source. Others will use the timestamp from when the zip is
created.

* Fix linter

* [Automation] Update elastic stack version to 8.4.0-b13123ee for testing (#581)

Co-authored-by: apmmachine <[email protected]>

* Fix Agent upgrade 8.2->8.3 (#578)

* Fix Agent upgrade 8.2->8.3
* Improve the upgrade encryption handling. Add .yml files cleanup.
* Rollback ActionUpgrade to action_id, add MarkerActionUpgrade adapter struct for marker serialization compatibility

* Update containerd (#577)

* [Automation] Update elastic stack version to 8.4.0-4fe26f2a for testing (#591)

Co-authored-by: apmmachine <[email protected]>

* Set explicit ExitTimeOut for MacOS agent launchd plist (#594)

* Set explicit ExitTimeOut for MacOS agent launchd plist

* [Automation] Update elastic stack version to 8.4.0-2e32a640 for testing (#599)

Co-authored-by: apmmachine <[email protected]>

* ci: enable build notifications as GitHub issues (#595)

* status identifies failing component, fleet gateway may report degraded, liveness endpoint added (#569)

* Add liveness endpoint

Add /liveness route to metrics server. This route will report the status
from pkg/core/status. fleet-gateway will now report a degraded state if
a checkin fails. This may not propagate to fleet-server as a failed
checkin means communications between the agent and the server are not
working. It may also lead to the server reporting degraded for up to 30s
(fleet-server polling time) when the agent is able to successfully
connect.

* linter fix

* add nolint directive

* Linter fix

* Review feedback, add doc strings

* Rename noop controller file to _test file

* [Automation] Update elastic stack version to 8.4.0-722a7d79 for testing (#607)

Co-authored-by: apmmachine <[email protected]>

* ci: enable flaky test detector (#605)

* [Automation] Update elastic stack version to 8.4.0-210dd487 for testing (#620)

Co-authored-by: apmmachine <[email protected]>

* mergify: remove backport automation for non active branches (#615)

* chore: use elastic-agent profile to run the E2E tests (#610)

* [Automation] Update elastic stack version to 8.4.0-a6aa9f3b for testing (#631)

Co-authored-by: apmmachine <[email protected]>

* add macros pointing to new agent's repo and fix old macro calls (#458)

* Add mount of /etc/machine-id for managed Agent in k8s (#530)

* Set hostPID=true for managed agent in k8s (#528)

* Set hostPID=true for managed agent in k8s

* Add comment on hostPID.

* [Automation] Update elastic stack version to 8.4.0-86cc80f3 for testing (#648)

Co-authored-by: apmmachine <[email protected]>

* Update elastic-agent-libs version: includes restriction on default VerificationMode to `full` (#521)

* update version

* mage fmt update

* update dependency

* update changelog

* redact sensitive information in diagnostics collect command (#566)

* Support Cloudbeat regex input type  (#638)

* support input type with regex

* Update supported.go

* Change the regex to remain backward compatible

* Disable flaky test download test (#641)

* [Automation] Update elastic stack version to 8.4.0-3d206b5d for testing (#656)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-3ad82aa8 for testing (#661)

Co-authored-by: apmmachine <[email protected]>

* jjbb: exclude allowed branches, tags and PRs (#658)

cosmetic change in the description and boolean based

* Update elastic-agent-project-board.yml (#649)

* ci: fix labels that clash with the Orka workers (#659)

* [Automation] Update elastic stack version to 8.4.0-03bd6f3f for testing (#668)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-533f1e30 for testing (#675)

Co-authored-by: apmmachine <[email protected]>

* Osquerybeat: Fix osquerybeat is not running with logstash output (#674)

* [Automation] Update elastic stack version to 8.4.0-d0a4da44 for testing (#684)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-dd98ded4 for testing (#703)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-164d9a10 for testing (#705)

Co-authored-by: apmmachine <[email protected]>

* Add missing license headers (#711)

* [Automation] Update elastic stack version to 8.4.0-00048b66 for testing (#713)

Co-authored-by: apmmachine <[email protected]>

* Allow - in eql variable names (#710)

* fix to allow dashes in variable names in EQL expressions

extend eql to allow the '-' char to appear in variable names, i.e.,
${data.some-var} and additional test cases to eql, the transpiler, and
the k8s provider to verify this works. Note that the bug was caused by
the EQL limitation, the other test cases were added when attempting to
find it.

* Regenerate grammar with antlr 4.7.1, add CHANGELOG

* Fix linter issue

* Fix typo

* Fix transpiler to allow : in dynamic variables. (#680)

Fix transpiler regex to allow ':' characters in dynamic variables so
that users can input "${dynamic.lookup|'fallback.here'}".

Co-authored-by: Aleksandr Maus <[email protected]>

* Fix for the filebeat spec file picking up packetbeat inputs (#700)

* Reproduce filebeat picking up packetbeat inputs

* Filebeat: filter inputs as first input transform.

Move input filtering to be the first input transformation that occurs in
the filebeat spec file. Fixes
https://github.com/elastic/elastic-agent/issues/427.

* Update changelog.

* [Automation] Update elastic stack version to 8.4.0-3cd57abb for testing (#724)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-a324b98b for testing (#727)

Co-authored-by: apmmachine <[email protected]>

* ci: run on MacOS12 (#696)

* [Automation] Update elastic stack version to 8.4.0-31315ca3 for testing (#732)

Co-authored-by: apmmachine <[email protected]>

* fix typo on package command (#734)

This commit fixes the typo in the package command on the README.md.

* Allow / to be used in variable names (#718)

* Allow the / character to be used in variable names.

Allow / to be used in variable names from dynamic providers and eql
expressions. Ensure that k8s providers can provide variables with
slashes in their names.

* run antlr4

* Fix tests

* Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases (#701)

* Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases

* Migrates vault directory on linux and windows to the top directory of the
  agent, so it can be shared without needing the upgrade handler call,
  like for example with side-by-side install/upgrade from .rpm/.deb
* Extended vault to allow read-only open, useful when the vault at particular location needs to be only read not created.

* Correct the typo in the log messages

* Update lint flagged function comment with 'unused', was flagged with 'deadcode' on the previous run

* Address code review feedback

* Add missing import for linux utz

* Change vault path from Top() to Config(), this is a better location, next to fleet.enc based on the install/upgrade testing with .rpm/.deb installs

* Fix the missing state migration for .rpm/.deb upgrade. The post install script now performs the migration and creates the symlink after that.

* Fix typo in the postinstall script

* Update the vault migration code, add the agent configuration match check with the agent secret

* [Automation] Update elastic stack version to 8.4.0-31269fd2 for testing (#746)

Co-authored-by: apmmachine <[email protected]>

* wrap errors and fix some docs typo and convention (#743)

* automate the ironbank docker context generation (#679)

* Update README.md

Adding M1 variable to export to be able to build AMD images

* fix flaky (#730)

* Add filestream ID on standalone kubernetes manifest (#742)

This commit adds unique IDs for the filestream inputs used by the
Kubernetes integration in the Elastic-Agent standalone
Kubernetes configuration/manifest file.

* Alter github action to run on different OSs (#769)

Alter the linter action to run on different OSs instead of on linux with
the $GOOS env var.

* [Automation] Update elastic stack version to 8.4.0-d058e92f for testing (#771)

Co-authored-by: apmmachine <[email protected]>

* elastic-agent manifests: add comments; add cloudnative team as a codeowner for the k8s manifests (#708)

* managed elastic-agent: add comments; add cloudnative team as a codeowner for the k8s manifests

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add comments to the standalone elastic-agent, similar to the documentation we have https://www.elastic.co/guide/en/fleet/current/running-on-kubernetes-standalone.html

Signed-off-by: Tetiana Kravchenko <[email protected]>

* Apply suggestions from code review

Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>

* remove comment for FLEET_ENROLLMENT_TOKEN; use Needed everywhere instead of Required

Signed-off-by: Tetiana Kravchenko <[email protected]>

* rephrase regarding accessing kube-state-metrics when using third-party tools, like kube-rbac-proxy

Signed-off-by: Tetiana Kravchenko <[email protected]>

* run make check

Signed-off-by: Tetiana Kravchenko <[email protected]>

* keep manifests in sync to pass ci check

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add info on where to find FLEET_URL and FLEET_ENROLLMENT_TOKEN

Signed-off-by: Tetiana Kravchenko <[email protected]>

* add links to elastic-agent documentation

Signed-off-by: Tetiana Kravchenko <[email protected]>

* update comment on FLEET_ENROLLMENT_TOKEN

Signed-off-by: Tetiana Kravchenko <[email protected]>

Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>

* [Elastic-Agent] Added source uri reloading (#686)

* Update will cleanup unneeded artifacts. (#752)

* Update will cleanup unneeded artifacts.

The update process will clean up unneeded artifacts. When an update
starts, all artifacts that do not have the current version number in
their name will be removed. If artifact retrieval fails, downloaded
artifacts are removed. On a successful upgrade, all contents of the
downloads dir will be removed.

* Clean up linter warnings

* Wrap errors

* cleanup tests

* Fix passed version

* Use os.RemoveAll

* ci: propagate e2e-testing errors (#695)

* [Release] add-backport-next (#784)

* Update main to 8.5.0 (#793)

* [Automation] Update go release version to 1.17.12 (#726)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.4.0-60171339 for testing (#799)

Co-authored-by: apmmachine <[email protected]>

* update dependency elastic/go-structform from v0.0.9 to v0.0.10 (#802)

Signed-off-by: Florian Lehner <[email protected]>

* Fix unpacking of artifact config (#776)

Fix unpacking of artifact config (#776)

* [Automation] Update elastic stack version to 8.5.0-c54c3404 for testing (#826)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-7dbc10f8 for testing (#833)

Co-authored-by: apmmachine <[email protected]>

* Fix RPM/DEB clean install (#816)

* Fix RPM/DEB clean install

* Improve the post install script

* Do not try to copy the state files if the agent directory is the same,
  this causes the error.
* Check the existence of the symlink instead of the file it is pointing to
  for the state file migration.

* Update check for symlink existence for the cases where the symlink points to non-existent file

* fix path for auto generated spec file (#859)

Signed-off-by: Florian Lehner <[email protected]>

* Reload downloader client on config change (#848)

Reload downloader client on config change (#848)

* Bundle elastic-agent.app for MacOS, needed to be able to enable the  … (#714)

* Bundle elastic-agent.app for MacOS, needed to be able to enable the  Full Disk Access

* Calm down the linter

* Fix pathing for windows unit test

* crossbuild: add fix to set ulimit for debian images (#856)

Signed-off-by: Florian Lehner <[email protected]>

* [Heartbeat] Cleanup docker install / always add playwright deps (#764)

This is the agent counterpart to elastic/beats#32122

Refactors Dockerfile handling of synthetics deps to rely on playwright install-deps rather than us manually keeping up to date with those. This should fix issues with newer playwrights needing additional deps.

This also cleans up the Dockerfile a good amount, and fixes indentation. Finally, this removes the unused Dockerfile.elastic-agent.tmpl file since agent is now its own repo. It also cleans up some other metadata that no longer does anything.

No changelog is specified because no user facing changes are present.

* [Automation] Update elastic stack version to 8.5.0-41aadc32 for testing (#889)

Co-authored-by: apmmachine <[email protected]>

* Fix/panic with composable renderer (#823)

* Fix a panic with wg passed to the composable object

In the code that retrieves the variables from the configuration files we
need to pass an execution callback, which is called in a goroutine. The
callback can be executed multiple times until the composable renderer is
stopped. A problem in the code caused the callback to be called multiple
times, which drove the waitgroup's internal counter to a negative value.

This commit changes the behavior: it starts the composable renderer and
gives it a callback; when the callback receives the variables, it stops
the composable's Run method using the context.

This ensures that the callback is called a single time and that the
variables are correctly retrieved.

Fixes: #806

* [Automation] Update go release version to 1.18.5 (#832)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-60a4c029 for testing (#899)

Co-authored-by: apmmachine <[email protected]>

* Add control-plane toleration to Agent K8S manifests. (#864)

* Add toleration to elastic-agent Kubernetes manifests.

The toleration with key node-role.kubernetes.io/control-plane is set to replace
the deprecated toleration with key node-role.kubernetes.io/master which will be
removed by Kubernetes v1.25

* Remove outdated "master" node terminology.

* install mage with go install (#936)

* Cloudnative ci automation (#837)

This commit provides the relevant Jenkins CI automation to open pull requests against the kibana GitHub repository in order to keep the Cloud-Native team's manifests in sync with the manifests that are used in the Fleet UI.

For full information check #706

Updated .ci/Jenkins file that is triggered upon PR requests of /elastic-agent/deploy/kubernetes/* changes
Updated Makefile to add functionality needed to create the extra files for the new prs to kibana remote repository

* Reduce memory footprint by reordering struct elements (#804)

* Reduce memory footprint by reordering struct elements

* rename struct element for linter

Signed-off-by: Florian Lehner <[email protected]>

Signed-off-by: Florian Lehner <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-6b9f92c0 for testing (#948)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-0616acda for testing (#963)

Co-authored-by: apmmachine <[email protected]>

* Clarify that this repo is not only docs (#969)

* Add Filebeat lumberjack input to spec (#959)

Make the lumberjack input available from Agent.

Relates: https://github.com/elastic/beats/pull/32175

* [Automation] Update elastic stack version to 8.5.0-dd6f2bb0 for testing (#978)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-feb644de for testing (#988)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-7783a03c for testing (#1004)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-17b8a62d for testing (#1014)

Co-authored-by: apmmachine <[email protected]>

* update ironbank image product name (#1009)

This is required to automate the creation of the ironbank merge requests as the ubireleaser is using this field to compute the elastic-agent artifact url.

For example it is now trying to retrieve https://artifacts.elastic.co/downloads/beats/elastic-agent-8.4.0-linux-x86_64.tar.gz instead of https://artifacts.elastic.co/downloads/beats/elastic-agent/elastic-agent-8.4.0-linux-x86_64.tar.gz

* ci: add extended support for windows (#683)

* [Automation] Update elastic stack version to 8.5.0-9aed3b11 for testing (#1030)

Co-authored-by: apmmachine <[email protected]>

* Cloudnative ci automation (#1035)

* Updating Jenkinsfile and Makefile to open PR

* Adding needed token-id

* [Automation] Update elastic stack version to 8.5.0-fedc3e60 for testing (#1054)

Co-authored-by: apmmachine <[email protected]>

* Testing PR creation for 706 (#1049)

* Fix lookup issues with inputs.d fragment yml (#840)

* Fix lookup issues with inputs.d fragment yml

The Elastic Agent was looking next to the binary for the `inputs.d`
folder; instead, it should look in the `Home` folder where
the Elastic Agent symlink is located.

Fixes: #663

* Changelog

* Fix input.d path, tie to the agent Config() directory

* Update CHANGELOG to reflect that the agent configuration directory is used to locate the inputs.d directory

Co-authored-by: Aleksandr Maus <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-b5001a6d for testing (#1064)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-1bd77fc1 for testing (#1082)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-167dfc80 for testing (#1091)

Co-authored-by: apmmachine <[email protected]>

* Adding support for v1.25.0 k8s (#1044)

* Adding support for v1.25.0 k8s

* [Automation] Update elastic stack version to 8.5.0-6b7dda2d for testing (#1101)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.5.0-4140365c for testing (#1114)

Co-authored-by: apmmachine <[email protected]>

* Remove experimental warning log in upgrade command (#1106)

* Update go.mod to Go 1.18, update notice. (#1120)

* Remove the fleet reporter (#1130)

* Remove the fleet reporter

Remove the fleet-reporter so that checkins no longer deliver the event
list.

* add CHANGELOG fix tests

* [Automation] Update elastic stack version to 8.5.0-589a4a10 for testing (#1147)

Co-authored-by: apmmachine <[email protected]>

* Avoid reporting `Unhealthy` on fleet connectivity issues (#1152)

Avoid reporting `Unhealthy` on fleet connectivity issues (#1152)

* ci: enable MacOS M1 stages (#1123)

* [Automation] Update go release version to 1.18.6 (#1143)

* [Automation] Update elastic stack version to 8.5.0-37418cf3 for testing (#1165)

Co-authored-by: apmmachine <[email protected]>

* Remove mage notice in favour of make notice (#1108)

The current implementation of mage notice is not working because it
was never finalised, the fact that it and `make notice` exist only
generates confusion.

This commit removes the `mage notice` and documents that `make notice`
should be used instead for the time being.

In the long run we want to use the implementation on
`elastic-agent-libs`, however it is not working at the moment.

Closes #1107

Co-authored-by: Craig MacKenzie <[email protected]>

* ci: run e2e-testing at the end (#1169)

* ci: move macos to github actions (#1175)

* [Automation] Update elastic stack version to 8.5.0-fcf3d4c2 for testing (#1183)

Co-authored-by: apmmachine <[email protected]>

* Add support for hints' based autodiscovery in kubernetes provider (#698)

* ci: increase timeout (#1190)

* Fixing condition for PR creation (#1188)

* Fix leftover log level (#1194)

* [automation] Publish kubernetes templates for elastic-agent (#1192)

Co-authored-by: apmmachine <[email protected]>

* ci: force GO_VERSION (#1204)

* Fix whitespaces in vault_darwin.c (#1206)

* Update kubernetes templates for elastic-agent [templates.d] (#1231)

* Use at least warning level for all status logs (#1218)

* Update k8s manifests to leverage hints (#1202)

* Add Go 1.18 upgrade to breaking changes section. (#1216)

* Add Go 1.18 upgrade to breaking changes section.

* Fix the PR number in the changelog.

* [Release] add-backport-next (#1254)

* Bump version to 8.6.0. (#1259)

* [Automation] Update elastic stack version to 8.5.0-7dc445a0 for testing (#1248)

Co-authored-by: apmmachine <[email protected]>

* Fix: Endpoint collision between monitoring and regular beats  (#1034)

Fix: Endpoint collision between monitoring and regular beats  (#1034)

* internal/pkg/agent/cmd: don't format error message with nil errors (#1240)

The failure conditions allow nil errors to result in an error being
formatted. When formatting due to a non-accepted HTTP status code and a
nil error, omit the error.

Co-authored-by: Craig MacKenzie <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-21651da3 for testing (#1290)

Co-authored-by: apmmachine <[email protected]>

* Fixed: source uri reload for download/verify components (#1252)

Fixed: source uri reload for download/verify components (#1252)

* Expand status reporter/controller interfaces to allow local reporters (#1285)

* Expand status reporter/controller interfaces to allow local reporters

Add a local reporter map to the status controller. These reporters are
not used when updating status with fleet-server, they are only used to
gather local state information - specifically if the agent is degraded
because checkin with fleet-server has failed. This bypasses the bug that
was introduced with the liveness endpoint where the agent could checkin
(to fleet-server) with a degraded status because a previous checkin
failed. Local reporters are used to generate a separate status. This
status is used in the liveness endpoint.

* fix linter

* Improve logging for agent upgrades. (#1287)

* [Automation] Update elastic stack version to 8.6.0-326f84b0 for testing (#1318)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-df00693f for testing (#1334)

Co-authored-by: apmmachine <[email protected]>

* Add success log message after previous checkin failures (#1327)

* Fix status reporter initialization (#1341)

* [Automation] Update elastic stack version to 8.6.0-a2f4f140 for testing (#1362)

Co-authored-by: apmmachine <[email protected]>

* Added status message to CheckinRequest (#1369)

* Added status message to CheckinRequest

* added changelog

* updated test

* added omitempty

* Fix failures when using npipe monitoring endpoints (#1371)

* [Automation] Update elastic stack version to 8.6.0-158a13db for testing (#1379)

Co-authored-by: apmmachine <[email protected]>

* Mount /etc directory in Kubernetes DaemonSet manifests. (#1382)

Changes made to files like `/etc/passwd` using Linux tools like
`useradd` are not reflected in the mounted file on the Agent,
because the tool replaces the file instead of changing it
in-place.

Mounting the parent directory solves this problem.

* [Automation] Update elastic stack version to 8.6.0-aea1c645 for testing (#1405)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-0fca2953 for testing (#1412)

Co-authored-by: apmmachine <[email protected]>

* ci: 7.17 is not available for the daily run (#1417)

* [Automation] Update elastic stack version to 8.6.0-e4c15f15 for testing (#1425)

Co-authored-by: apmmachine <[email protected]>

* [backport main] Fix: Agent failed to upgrade from 8.4.2 to 8.5.0 BC1 for MAC 12 agent using agent binary. (#1401)

[backport main] Fix: Agent failed to upgrade from 8.4.2 to 8.5.0 BC1 for MAC 12 agent using agent binary. (#1401)

* Fix docker provider add_fields processors (#1420)

The Docker provider was using the wrong key when defining the
`add_fields` processor, which caused Filebeat not to start the input
and to stay in an unhealthy state.

This commit fixes it.

Fixes https://github.com/elastic/beats/issues/29030

* [Automation] Update elastic stack version to 8.6.0-d939cfde for testing (#1436)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-7c9f25a9 for testing (#1446)

Co-authored-by: apmmachine <[email protected]>

* Enable integration only when datastreams are not defined (#1456)

* Add not dedoted k8s pod labels in autodiscover provider to be used for templating, exactly like annotations (#1398)

* [Automation] Update elastic stack version to 8.6.0-c49fac70 for testing (#1464)

Co-authored-by: apmmachine <[email protected]>

* Add storageclass permissions in agent clusterrole (#1470)

* Add storageclass permissions in agent clusterrole

* Remote QA-labels automation (#1455)

* [Automation] Update go release version to 1.18.7 (#1444)

Co-authored-by: apmmachine <[email protected]>

* [Automation] Update elastic stack version to 8.6.0-5a8d757d for testing (#1480)

Co-authored-by: apmmachine <[email protected]>

* Improve logging around agent checkins. (#1477)

Improve logging around agent checkins.

- Log transient checkin errors at Info.
- Upgrade to an Error log after 2 repeated failures.
- Log the wait time for the next retry.
- Only update local state after repeated failures.
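
The escalation rule in the list above can be sketched as follows. The names are hypothetical, and the reading of "after 2 repeated failures" as "from the third consecutive failure onward" is an assumption; the source does not pin down the exact threshold semantics.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

type level string

const (
	levelInfo  level = "info"
	levelError level = "error"

	// errorAfter is the number of consecutive failures tolerated at Info
	// level. Treating it as "strictly more than 2" is an assumption.
	errorAfter = 2
)

// checkinLogLevel picks the log level for the n-th consecutive failure.
func checkinLogLevel(consecutiveFailures int) level {
	if consecutiveFailures > errorAfter {
		return levelError
	}
	return levelInfo
}

// formatCheckinFailure renders the log line, including the wait time before
// the next retry.
func formatCheckinFailure(n int, err error, wait time.Duration) string {
	return fmt.Sprintf("[%s] checkin failed (consecutive failure %d): %v; retrying in %s",
		checkinLogLevel(n), n, err, wait)
}

func main() {
	err := errors.New("connection refused")
	for n := 1; n <= 3; n++ {
		fmt.Println(formatCheckinFailure(n, err, time.Duration(n)*time.Second))
	}
}
```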

* [Automation] Update elastic stack version to 8.6.0-40086bc7 for testing (#1496)

Co-authored-by: apmmachine <[email protected]>

* Fixing makefile check (#1490)

* Fixing makefile check

* action: validate changelog fragment (#1488)

* Align managed with standalone role (#1500)

* Fix k8s template link versioning (#1504)

* Aligning manifests (#1507)

* Align managed with standalone role

* Fixing missing Label

* [Automation] Update elastic stack version to 8.6.0-233dc5d4 for testing (#1515)

Co-authored-by: apmmachine <[email protected]>

* Convert CHANGELOG.next to fragments (#1244)

* [Automation] Update elastic stack version to 8.6.0-54a302f0 for testing (#1531)

Co-authored-by: apmmachine <[email protected]>

* Update the linter configuration. (#1478)

Sync the configuration with the one used in Beats, which has disabled
the majority of the least useful linters already.

* Elastic agent counterpart of https://github.com/elastic/beats/pull/33362 (#1528)

Always use the stack_release label for npm i

No changelog necessary since there are no user-visible changes

This lets us ensure we've carefully reviewed and labeled the version of the @elastic/synthetics NPM library that's bundled in docker images

* [Automation] Update elastic stack version to 8.6.0-cae815eb for testing (#1545)

Co-authored-by: apmmachine <[email protected]>

* Fix admin permission check on localized windows (#1552)

Fix admin permission check on localized windows (#1552)

* Fixes from merge of main.

* Update heartbeat specification to only support elasticsearch.

* Fix bad merge in dockerfile.

Signed-off-by: Florian Lehner <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Pier-Hugues Pellerin <[email protected]>
Co-authored-by: Denis Rechkunov <[email protected]>
Co-authored-by: Michel Laterman <[email protected]>
Co-authored-by: Aleksandr Maus <[email protected]>
Co-authored-by: Victor Martinez <[email protected]>
Co-authored-by: Manuel de la Peña <[email protected]>
Co-authored-by: Anderson Queiroz <[email protected]>
Co-authored-by: Daniel Araujo Almeida <[email protected]>
Co-authored-by: Mariana Dima <[email protected]>
Co-authored-by: ofiriro3 <[email protected]>
Co-authored-by: Julien Lind <[email protected]>
Co-authored-by: Craig MacKenzie <[email protected]>
Co-authored-by: Tiago Queiroz <[email protected]>
Co-authored-by: Pierre HILBERT <[email protected]>
Co-authored-by: Tetiana Kravchenko <[email protected]>
Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>
Co-authored-by: Michal Pristas <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
Co-authored-by: Florian Lehner <[email protected]>
Co-authored-by: Andrew Cholakian <[email protected]>
Co-authored-by: Yash Tewari <[email protected]>
Co-authored-by: Quentin Pradet <[email protected]>
Co-authored-by: Andrew Kroh <[email protected]>
Co-authored-by: Julien Mailleret <[email protected]>
Co-authored-by: Josh Dover <[email protected]>
Co-authored-by: Chris Mark <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Dan Kortschak <[email protected]>
Co-authored-by: Julia Bardi <[email protected]>
Co-authored-by: Edoardo Tenani <[email protected]>

* Add input name alias for cloudbeat integrations (#1596)

* add name alias for cloudbeat

* add anchors for yaml fields

* add EKS input

* Change the stater to include a local flag. (#1308)

* Change the stater to include a local flag.

Change the state reporter to use a local flag that determines if local
errors are included in the resulting state. Assume that configMgr errors
are all local - this mainly affects the fleet_gateway. Allow the gateway
to report an error if a checkin fails. When a checkin fails the local
state reported through the status command and liveness endpoint will
include the error, but checkins to fleet-server will not.

* Add ActionsError() method to config manager

Add a new ActionsError() method to the config managers. For the
non-managed instances it will return a nil channel. For the managed
instances it will return the dispatcher error queue directly. Have the
coordinator gather from this channel as it does for the others and
treat any errors as non-local.

* Fix linter

* Service runtime V2 (#1529)

* Service V2 runtime

* Implements service runtime component for V2.
* Extends endpoint spec with some additional attributes for service start/stop/status checks and creds discovery. The creds discovery logic is taken from V1, cleaned up and extracted into its own file, added utz.
* Implements service uninstall
* Refactors pkg/core/process/process.go adds additional options that are needed for the service_command implementation.
* Changes ComponentsModifier to access raw config, needed for the EndpointComponentModifier
* Injects host.id into configuration, needed for Endpoint
* Injects fleet and policy.revision configuration into the Endpoint input configuration
* Bumps the version to 8.6.0 to make it consistent with current beats V2 branch
* Addresses linter complaints on affected files

* Remove the service watcher, all the start/stopping logic

* Add changelog

* Fix typo

* Send STOPPING only upon teardown

* Wait for check-in with timeout before sending stopping on teardown

* Fix the service loop routine blocking on channel after stopped

* Addressed code review comments

* Make linter happy

* Try to fix make check-ci

* Spellcheck runtime README.md

* Remove .Stop timeout from the spec as it is no longer used

* Addressed code review feedback

* Sync components with state during container start (#1653)

* Sync components with state during container start

* path approach

* Subprocess reader start.

* Implement io.Writer to handle reading stdout/stderr for spawned components.

* Don't inject logging args to beats components. Always have beats log to stderr.

* Update to v0.2.15 of elastic-agent-libs.

* [V2] Enable support for shippers (#1527)

* Work on adding shipper support.

* Fix fmt.

* Fix reference to spec. Allow shipper to be null but still enabled if key exists.

* Move supported shippers into its own key in the input specification.

* Fix issue in merge.

* Implement fake shipper and add fake shipper output to the fake component.

* Add protoc to the test target.

* Don't generate fake shipper protocol in test.

* Commit fake GRPC into code.

* Add unit test for running with shipper, with sending event between running component and running shipper.

* Add docstring for shipper test.

* Add changelog fragment.

* Adjust paths for shipper to work on windows and better on unix.

* Update changelog/fragments/1667571017-Add-support-for-running-the-elastic-agent-shipper.yaml

Co-authored-by: Craig MacKenzie <[email protected]>

* Fix fake/component to connect over npipe on windows.

Co-authored-by: Craig MacKenzie <[email protected]>

* More work on the logging.

* More fixes.

* Change back to streams.

* Fix go.mod.

* Fix import.

* Fix issues with merge of main.

* remove log helper.

* Add NewWithoutConfig.

* Fix the spawned filestream to ingest logs into elasticsearch for monitoring.

* Add changelog entry.

* Remove debug print.

* Update 1669236059-Capture-stdout-stderr-of-all-spawned-components-to-simplify-logging.yaml

Signed-off-by: Florian Lehner <[email protected]>
Co-authored-by: Michal Pristas <[email protected]>
Co-authored-by: Aleksandr Maus <[email protected]>
Co-authored-by: Michel Laterman <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Pier-Hugues Pellerin <[email protected]>
Co-authored-by: Denis Rechkunov <[email protected]>
Co-authored-by: Victor Martinez <[email protected]>
Co-authored-by: Manuel de la Peña <[email protected]>
Co-authored-by: Anderson Queiroz <[email protected]>
Co-authored-by: Daniel Araujo Almeida <[email protected]>
Co-authored-by: Mariana Dima <[email protected]>
Co-authored-by: ofiriro3 <[email protected]>
Co-authored-by: Julien Lind <[email protected]>
Co-authored-by: Craig MacKenzie <[email protected]>
Co-authored-by: Tiago Queiroz <[email protected]>
Co-authored-by: Pierre HILBERT <[email protected]>
Co-authored-by: Tetiana Kravchenko <[email protected]>
Co-authored-by: Michael Katsoulis <[email protected]>
Co-authored-by: Andrew Gizas <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Anderson Queiroz <[email protected]>
Co-authored-by: Florian Lehner <[email protected]>
Co-authored-by: Andrew Cholakian <[email protected]>
Co-authored-by: Yash Tewari <[email protected]>
Co-authored-by: Quentin Pradet <[email protected]>
Co-authored-by: Andrew Kroh <[email protected]>
Co-authored-by: Julien Mailleret <[email protected]>
Co-authored-by: Josh Dover <[email protected]>
Co-authored-by: Chris Mark <[email protected]>
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Dan Kortschak <[email protected]>
Co-authored-by: Julia Bardi <[email protected]>
Co-authored-by: Edoardo Tenani <[email protected]>
Co-authored-by: Alex K <[email protected]>
(cherry picked from commit 7a748fa0fdf4ab786583c7a38169d099b58a7c02)

Co-authored-by: Blake Rouse <[email protected]>