Fix missing support for setting document id in decoder_json pr… #15859
Conversation
Update processors, output, and json parser to store the document ID in `@metadata._id`. This ensures better compatibility with Logstash inputs/filters setting `@metadata._id`. Also add the missing `document_id` option to the decode_json_fields processor, giving users the chance to set the document ID if the JSON document was embedded in another JSON document.
@@ -51,7 +51,7 @@ func (e *Event) SetID(id string) {

```go
	if e.Meta == nil {
		e.Meta = common.MapStr{}
	}
	e.Meta["id"] = id
```
Given the special nature of this field name and the desire to keep it consistent in multiple places, do you think we should make it an exported const?
We have more than one field that is special to meta. Let's clean these up (the other fields as well) in a follow up PR.
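To illustrate the reviewer's suggestion, an exported constant for the special metadata key might look like the sketch below. `FieldMetaID` is a hypothetical name and `MapStr`/`Event` are pared-down stand-ins for libbeat's `common.MapStr` and `beat.Event`; the actual cleanup was deferred to a follow-up PR.

```go
package main

import "fmt"

// MapStr stands in for libbeat's common.MapStr in this sketch.
type MapStr map[string]interface{}

// FieldMetaID is a hypothetical exported constant for the special
// @metadata key, as suggested in the review; the real codebase may
// name it differently.
const FieldMetaID = "_id"

// Event is a pared-down stand-in for beat.Event.
type Event struct {
	Meta MapStr
}

// SetID stores the document ID under @metadata._id, so processors
// and outputs can share one constant instead of repeating "_id".
func (e *Event) SetID(id string) {
	if e.Meta == nil {
		e.Meta = MapStr{}
	}
	e.Meta[FieldMetaID] = id
}

func main() {
	e := &Event{}
	e.SetID("id1")
	fmt.Println(e.Meta[FieldMetaID]) // prints: id1
}
```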
Should we add a CHANGELOG entry, maybe especially since it's technically a breaking change?
Oops, added changelog.
LGTM.
beats-ci failure due to timeouts downloading dependencies. All related tests passed on Travis. Merging.
…tic#15859) * Change to metadata._id Update processors, output, and json parser to store the document ID in `@metadata._id`. This ensures better compatibility with Logstash inputs/filters setting `@metadata._id`. Also add missing `document_id` to decode_json_fields processor, giving users the chance to set the document ID if the JSON document was embedded in another JSON document. (cherry picked from commit d60b04a)
Testing turned up an oversight in this PR:
In elastic#15859 the Elasticsearch output was changed to read from the @metadata._id field when it had been using @metadata.id. The s3 and googlepubsub inputs had both been setting @metadata.id, but were not updated with that change. This updates the s3 and googlepubsub inputs to use `beat.Event#SetID()` rather than creating the metadata object themselves.
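The follow-up fix replaces hand-built metadata with `beat.Event#SetID()`. A before/after sketch of that pattern, using pared-down stand-ins for `common.MapStr` and `beat.Event` (not the actual s3/googlepubsub input code):

```go
package main

import "fmt"

// MapStr stands in for libbeat's common.MapStr in this sketch.
type MapStr map[string]interface{}

// Event is a pared-down stand-in for beat.Event.
type Event struct {
	Meta MapStr
}

// SetID stores the document ID under @metadata._id.
func (e *Event) SetID(id string) {
	if e.Meta == nil {
		e.Meta = MapStr{}
	}
	e.Meta["_id"] = id
}

func main() {
	// Before: the inputs built the metadata map themselves using the
	// old "id" key, which the Elasticsearch output no longer reads.
	before := &Event{Meta: MapStr{"id": "objectKey-0001"}}

	// After: going through SetID keeps the key ("_id") in one place.
	after := &Event{}
	after.SetID("objectKey-0001")

	fmt.Println(before.Meta["_id"]) // <nil> — the old key is ignored
	fmt.Println(after.Meta["_id"])  // objectKey-0001
}
```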
What does this PR do?
Update processors, output, and json parser to store the document ID in `@metadata._id`. Also add the missing `document_id` option to the decode_json_fields processor, giving users the chance to set the document ID if the JSON document was embedded in another JSON document.
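As a hedged sketch of the new option, a decode_json_fields configuration using `document_id` might look like the following; the `message` and `myid` field names are illustrative assumptions, not values from the PR:

```yaml
processors:
  - decode_json_fields:
      fields: ["message"]   # field(s) containing the embedded JSON string
      target: ""
      # document_id (added in this PR) names a key inside the decoded
      # JSON; its value is stored as the event ID in @metadata._id and
      # removed from the event body.
      document_id: "myid"
```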
Why is it important?
This ensures better compatibility with Logstash inputs/filters setting `@metadata._id`.

About the breaking change: the `document_id` setting on the JSON decoder was introduced in 7.5, but the overall effort on supporting event duplication was only finalized in 7.6. This means that the change to `@metadata._id` is a breaking change. But the feature wasn't much documented, while actual documentation on how to configure Beats + ES for data duplication is planned for 7.6.

Checklist
Author's Checklist
How to test this PR locally
- Use a JSON document like `{"myid": "id1", "log": "..."}` as input.
- Run with `-d 'publish'` and check that `@metadata._id` is set when inspecting events to be published in the logs. The `myid` field should be removed from the event.
- Verify that `_id` matches the original contents of `myid`.

Related issues