Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load Filebeat modules pipelines on -setup #3394

Merged
merged 2 commits into from
Jan 20, 2017

Conversation

tsg
Copy link
Contributor

@tsg tsg commented Jan 17, 2017

This adds the -setup CLI flag, which, for now, makes Filebeat load the pipelines at startup. In case Elasticsearch is not available when Filebeat is started with -setup, Filebeat will exit with an error.

This is the first version that passes the modules tests without the help of filebeat.py 🎉 .

This also exposes an Elasticsearch client from the output.

A few more things to note/discuss:

  • For now I added -setup to Filebeat only. As we add support to loading the dashboards as well, we'll probably want to move the flag in libbeat and make it available for all Beats.
  • The current behavior is that -setup exits on errors but continues with a normal Filebeat run in case of success. The idea is that you would use -setup to make the initial setup but also to test it. After that is successful, you would typically stop Filebeat and start it in the background via the init scripts (so without -setup).
  • The above means that if Elastisearch is not available when calling filebeat -setup, the user gets an error directly, it doesn't "wait" for ES.
  • I tried separating the client but resulted in a too heavy change so I bailed out. I added a function to get a client with the same configuration. This works fine but it's a bit hacky.
  • the elasticsearch output Connection.request is made public as Connection.Request. This goes in the direction of separating the client. I only needed it in tests, though, so I can take a different approach if needed.
  • If more ES hosts are defined, -setup will iterate through them and use the first one that responds to an HTTP ping.

Part of #3159.

@tsg tsg added the review label Jan 17, 2017
@tsg tsg mentioned this pull request Jan 17, 2017
22 tasks
@tsg
Copy link
Contributor Author

tsg commented Jan 17, 2017

See also the discussion here: #3159 (comment)

@tsg tsg added the in progress Pull request is currently in progress. label Jan 17, 2017
@tsg tsg force-pushed the load_pipeline_on_setup branch from 3cd9e60 to bb51843 Compare January 17, 2017 14:24
@tsg tsg removed the in progress Pull request is currently in progress. label Jan 17, 2017
Copy link
Contributor

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some thoughts on -setup flag:

Running filebeat with -setup is probably a manual process and is rarely automated. This would lead me more to the conclusion it should not continue after having the setup completed.

On the other hand, for a getting started experience it would be great if it continues. Also assuming it is automated and there is one server that always starts with -setup, it would mean updating to most recent dashboards or others could be done and this server could still harvest directly log files, without needing a restart.

A downside of having it running directly is that when stopping filebeat after -setup it could lead to a few duplicated events potentially, when no shutdown_timeout is used.

So kind of both options are not optimal. What if we would have an additional flag? We could reuse -once here that in case it is set, it will only run the setup. Or have a new flag with a better name that indicates it should only do the setup (-setup-only)?

@@ -19,6 +20,7 @@ import (
)

var once = flag.Bool("once", false, "Run filebeat only once until all harvesters reach EOF")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

group together under var( ?

@tsg tsg force-pushed the load_pipeline_on_setup branch from bb51843 to eae83d1 Compare January 17, 2017 22:20
@@ -63,11 +67,33 @@ func New(b *beat.Beat, rawConfig *common.Config) (beat.Beater, error) {
return fb, nil
}

// Setup is called on user request (the -setup flag) to do the initial Beat setup.
func (fb *Filebeat) Setup(b *beat.Beat) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't we have a Setup step in the 1.0 beats interface? Old days :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah :). I didn't add it yet into the interface but I have a feeling that's where it will eventually get there.

]
# print(" ".join(cmd) + "\n")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be removed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replaced the print with outputting to the log. Having the command already built in the logs is fantastic for troubleshooting :)

@monicasarbu
Copy link
Contributor

The main reason to add the -setup option was to be able to reduce the number of configuration steps to one, in order to avoid running the import_dashboards in addition. For now, I would say it's ok if we need to run a Beat with the -setup option to import the dashboards, pipelines before starting to harvest log files, but later it would be nice if any of the Beat could do the "setup" if the dashboards & pipelines are not imported. +1 of -setup option to do the setup and run Filebeat.

@@ -240,3 +241,29 @@ func (reg *ModuleRegistry) GetProspectorConfigs() ([]*common.Config, error) {
}
return result, nil
}

func (reg *ModuleRegistry) Setup(esClient *elasticsearch.Client) error {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need the full elasticsearch.Client here? If we define an interface with methods required, it get's us an idea, well, what's really required for ModuleRegistry + we can easily mock the client in tests.

Tudor Golubenco added 2 commits January 20, 2017 13:41
This adds the `-setup` CLI flag, which, for now, makes Filebeat load the
pipelines at startup. In case Elasticsearch is not available when
Filebeat is started with `-setup`, Filebeat will exit with an error.

This also exposes an Elasticsearch client from the output.
@tsg tsg force-pushed the load_pipeline_on_setup branch from 1a83b00 to 920388a Compare January 20, 2017 12:43
@andrewkroh andrewkroh merged commit c6c5e5c into elastic:master Jan 20, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants