Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

decode_json_field: process objects and arrays only #11312

Merged
merged 12 commits into from
Mar 26, 2019
3 changes: 2 additions & 1 deletion CHANGELOG.next.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ https://github.com/elastic/beats/compare/v7.0.0-alpha2...master[Check the HEAD d
info from the cloud.machine.type and cloud.availability_zone. {issue}10968[10968]
- Empty `meta.json` file will be treated as a missing meta file. {issue}8558[8558]
- Rename `migration.enabled` config to `migration.6_to_7.enabled`. {pull}11284[11284]
- decode_json_field: process objects and arrays only {pull}11312[11312]
- Beats Xpack now checks for Basic license on connect. {pull}11296[11296]

*Auditbeat*
Expand Down Expand Up @@ -174,7 +175,7 @@ https://github.com/elastic/beats/compare/v7.0.0-alpha2...master[Check the HEAD d
- Add on_failure handler for Zeek ingest pipelines. Fix one field name error for notice and add an additional test case. {issue}11004[11004] {pull}11105[11105]
- Fix goroutine leak happening when harvesters are dynamically stopped. {pull}11263[11263]
- Fix issue preventing docker container events to be stored if the container has a network interface without ip address. {issue}11225[11225] {pull}11247[11247]
- Add on_failure handler for Zeek ingest pipelines. Fix one field name error for notice and add an additional test
- Add on_failure handler for Zeek ingest pipelines. Fix one field name error for notice and add an additional test
case. {issue}11004[11004] {pull}11105[11105]
- Change URLPATH grok pattern to support brackets. {issue}11135[11135] {pull}11252[11252]
- Add support for iis log with different address format. {issue}11255[11255] {pull}11256[11256]
Expand Down
4 changes: 4 additions & 0 deletions libbeat/processors/actions/decode_json_fields.go
Original file line number Diff line number Diff line change
Expand Up @@ -146,6 +146,10 @@ func unmarshal(maxDepth int, text string, fields *interface{}, processArray bool
return v, false
}

if !strings.HasPrefix(str, "[") && !strings.HasPrefix(str, "{") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You check here for [ but couldn't find an example of it the tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's a bug with arrays solved in PR #11318 which also includes tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it, the mentioned bug is about processing when it should not process, so it should be fine to test it with processing enabled

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a quite simple heuristic, but should be fine. It still can fail, though. But decodeJSON should have failed in the first place. For example given the string "2016-20-10" the lexer should find that - is an invalid character. The dec.More() call only returns false (as there is no more valid data in the stream), but More also returns false if there was an error. We can improve decodeJSON by checking the dec.Token() error code after dec.More().

For the heuristics (we can still guard decodeJSON using some heuristics), we should remove whitespace when testing as well:

func isStructured(s string) bool {
  s = strings.TrimSpace(s)
  end = len(s) - 1
  return end > 0 && (
    (s[0] == '[' && s[end] == ']') ||
    (s[0] == '{' && s[end] == '}')
  )
}

return str, false
}

var tmp interface{}
err := unmarshal(maxDepth, str, &tmp, processArray)
if err != nil {
Expand Down
63 changes: 63 additions & 0 deletions libbeat/processors/actions/decode_json_fields_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
package actions

import (
"fmt"
"testing"

"github.com/stretchr/testify/assert"
Expand Down Expand Up @@ -199,6 +200,68 @@ func TestTargetRootOption(t *testing.T) {
assert.Equal(t, expected.String(), actual.String())
}

func TestNotJsonObjectOrArray(t *testing.T) {

var cases = []struct {
MaxDepth int
Expected common.MapStr
}{
{
MaxDepth: 1,
Expected: common.MapStr{
"msg": common.MapStr{
"someDate": "2016-09-28T01:40:26.760+0000",
"someNumber": 1475026826760,
"someNumberAsString": "1475026826760",
"someString": "foobar",
"someString2": "2017 is awesome",
"someMap": "{\"a\":\"b\"}",
"someArray": "[1,2,3]",
},
},
},
{
MaxDepth: 10,
Expected: common.MapStr{
"msg": common.MapStr{
"someDate": "2016-09-28T01:40:26.760+0000",
"someNumber": 1475026826760,
"someNumberAsString": "1475026826760",
"someString": "foobar",
"someString2": "2017 is awesome",
"someMap": common.MapStr{"a": "b"},
"someArray": []int{1, 2, 3},
},
},
},
}

for _, testCase := range cases {
t.Run(fmt.Sprintf("TestNotJsonObjectOrArrayDepth-%v", testCase.MaxDepth), func(t *testing.T) {
input := common.MapStr{
"msg": `{
"someDate": "2016-09-28T01:40:26.760+0000",
"someNumberAsString": "1475026826760",
"someNumber": 1475026826760,
"someString": "foobar",
"someString2": "2017 is awesome",
"someMap": "{\"a\":\"b\"}",
"someArray": "[1,2,3]"
}`,
}

testConfig, _ = common.NewConfigFrom(map[string]interface{}{
"fields": fields,
"process_array": true,
"max_depth": testCase.MaxDepth,
})

actual := getActualValue(t, testConfig, input)
assert.Equal(t, testCase.Expected.String(), actual.String())
})
}
}

func getActualValue(t *testing.T, config *common.Config, input common.MapStr) common.MapStr {
logp.TestingSetup()

Expand Down