Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor input.Event similar to outputs.Data #3823

Merged
merged 5 commits into from
Mar 30, 2017

Conversation

vjsamuel
Copy link
Contributor

@vjsamuel vjsamuel commented Mar 28, 2017

Kindly review this change as this is a feature that we're looking for when we use filebeat :) If the code looks good, I can also fix the test cases.

#3785

cc: @ruflin

@vjsamuel vjsamuel force-pushed the prospector_level_filters branch from 9f0a1dd to f30822b Compare March 28, 2017 08:22
@elasticmachine
Copy link
Collaborator

Jenkins standing by to test this. If you aren't a maintainer, you can ignore this comment. Someone with commit access, please review this and clear it for Jenkins to run.

1 similar comment
@elasticmachine
Copy link
Collaborator

Jenkins standing by to test this. If you aren't a maintainer, you can ignore this comment. Someone with commit access, please review this and clear it for Jenkins to run.

filebeat/f.yml Outdated
@@ -0,0 +1,143 @@
###################### Filebeat Configuration Example #########################
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that was probably commited by accident. You can use filebeat.dev.yml and it will be ignored automatically by gitignore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed it. thanks for the heads up. i ll use that.

@vjsamuel vjsamuel force-pushed the prospector_level_filters branch from f30822b to d63d93b Compare March 28, 2017 08:25
Copy link
Contributor

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I like the change and I think it goes in the right direction.

One thing I worry is that we introduce now fields in the common.MapStr which could conflict with fields that are already in there.

The other part I have to check in more detail is if all the state updates still work as expected. Based on the code I get the impression that states are not always updated.

@@ -43,6 +44,7 @@ type prospectorConfig struct {
Pipeline string `config:"pipeline"`
Module string `config:"_module_name"` // hidden option to set the module name
Fileset string `config:"_fileset_name"` // hidden option to set the fileset name
Filters processors.PluginConfig `config:"filters"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably we should call it processors, as all of them will be available. It's already causing confusion on the metricbeat side ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ha ha! let me do that.


//run the filters before sending to
eventMap = p.filters.Run(eventMap)
if eventMap != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming the event was "removed" by the filter, this would probably mean no state update is sent anymore? Or when exactly is eventMap empty?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eventMap would be empty in case of drop_event. But i did see that the file was traversed till the end:

2017/03/28 08:17:17.442705 log.go:114: INFO Reader was closed: /var/log/alf.log. Closing.
2017/03/28 08:17:17.443151 log.go:306: DBG  Stopping harvester for file: /var/log/alf.log
2017/03/28 08:17:17.443163 log.go:315: DBG  Closing file: /var/log/alf.log
2017/03/28 08:17:17.443169 log.go:183: DBG  Update state: /var/log/alf.log, offset: 0
2017/03/28 08:17:17.443130 log.go:183: DBG  Update state: /var/log/displaypolicyd.stdout.log, offset: 1155
2017/03/28 08:17:17.443030 log.go:315: DBG  Closing file: /var/log/jamf.log
2017/03/28 08:17:17.443198 log.go:183: DBG  Update state: /var/log/jamf.log, offset: 1190762

But since these events do goto registrar, the registry file wouldn't be updated. But this is a scenario where I configure my prospector with:

filters:
  drop_event:

In this case, no event would pass upto registrar.

@ruflin ruflin added Filebeat Filebeat in progress Pull request is currently in progress. labels Mar 28, 2017
@vjsamuel vjsamuel changed the title Prospector level filters Prospector level processors Mar 29, 2017
@vjsamuel vjsamuel force-pushed the prospector_level_filters branch from d98b53f to c78db55 Compare March 29, 2017 06:43
@ruflin
Copy link
Contributor

ruflin commented Mar 29, 2017

jenkins, test it

@ruflin ruflin added review and removed in progress Pull request is currently in progress. labels Mar 29, 2017
Copy link
Contributor

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the contribution. I had again a closer look and left some minor comments.

For the PR I suggest to split it up in 2 parts:

  • PR 1: All changes except the few lines needed to add the processor. This is pure refactoring / setting the foundation for the processor
  • PR 2: Add processors with changelog entry and system tests. We should add the tests directly with the same PR to make sure the new feature we introduce works as expected. This also makes backporting it easier.

We can add the docs for it in a third step.

@urso Would be nice to get your feedback on this one too.

State file.State
}

type EventHolder struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rename EventHolder to Data to be consistent with https://github.com/elastic/beats/blob/master/libbeat/outputs/outputs.go#L20 ? I'm ok with keeping Metadata as here it is at the moment only about Metadata

Pipeline string
Fileset string
Module string
InputType string
DocumentType string
common.EventMetadata
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets move the common.EventMetadata to the the first line of EventMeta

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

JSONConfig *reader.JSONConfig
State file.State
Data common.MapStr // Use in readers to add data to the event
EventMeta
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So currently we have EventMeta twice, means two copies of the data? What would be the affect if we not have it in the Event? I think we will need in the future processors to have access to the meta data too, but do we need this in the first step?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Event will be used in harvesters and when it reaches the processor, we do a event.GetEventHolder(), which would (you are right), copy the metadata from Event to EventHolder and generate the MapStr. But after that since Event object goes out of scope, it would be garbage collected if I am not wrong. Would that be an issue? Today when we generate the common.MapStr, we copy all the log related fields over. Now with this change, we additionally copy over the meta fields as well. Please correct me if I'm wrong.

var ok bool
if eventHolder.Event != nil {
//processor might decide to drop the event
ok = p.outlet.OnEvent(&eventHolder)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets move the ok = p.outlet.OnEvent(&eventHolder) after the if / else and only have if eventHolder.Event == nil { Bytes = 0} . This should be the same logic. Then we also don't need the definition of ok.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will add this back in the next PR. removed it for now.

@vjsamuel vjsamuel force-pushed the prospector_level_filters branch from ca55dc4 to 3a620e6 Compare March 29, 2017 14:44
@vjsamuel
Copy link
Contributor Author

@ruflin, i have removed the processor code and renamed the object from EventHolder to Data.

Also do I submit the PR for processor code immediately or do I wait for this to be approved and merged?

@vjsamuel vjsamuel changed the title Prospector level processors Refactor input.Event similar to outputs.Data Mar 29, 2017
@ruflin
Copy link
Contributor

ruflin commented Mar 30, 2017

jenkins, test it

@ruflin ruflin merged commit 1c2d335 into elastic:master Mar 30, 2017
@vjsamuel
Copy link
Contributor Author

thank you! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants