Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

decode_json_field: process objects and arrays only #11312

Merged
merged 12 commits into from
Mar 26, 2019
1 change: 1 addition & 0 deletions CHANGELOG.next.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -146,6 +146,7 @@ https://github.com/elastic/beats/compare/v7.0.0-alpha2...master[Check the HEAD d
- Include ip and boolean type when generating index pattern. {pull}10995[10995]
- Cancelling enrollment of a beat will not enroll the beat. {issue}10150[10150]
- Add missing fields and test cases for libbeat add_kubernetes_metadata processor. {issue}11133[11133], {pull}11134[11134]
- decode_json_field: process objects and arrays only {pull}11312[11312]
- Report faulting file when config reload fails. {pull}[11304]11304

*Auditbeat*
Expand Down
4 changes: 4 additions & 0 deletions libbeat/processors/actions/decode_json_fields.go
Original file line number Diff line number Diff line change
Expand Up @@ -147,6 +147,10 @@ func unmarshal(maxDepth int, text string, fields *interface{}, processArray bool
return v, false
}

if !strings.HasPrefix(str, "[") && !strings.HasPrefix(str, "{") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You check here for [ but couldn't find an example of it the tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's a bug with arrays solved in PR #11318 which also includes tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it, the mentioned bug is about processing when it should not process, so it should be fine to test it with processing enabled

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a quite simple heuristic, but should be fine. It still can fail, though. But decodeJSON should have failed in the first place. For example given the string "2016-20-10" the lexer should find that - is an invalid character. The dec.More() call only returns false (as there is no more valid data in the stream), but More also returns false if there was an error. We can improve decodeJSON by checking the dec.Token() error code after dec.More().

For the heuristics (we can still guard decodeJSON using some heuristics), we should remove whitespace when testing as well:

func isStructured(s string) bool {
  s = strings.TrimSpace(s)
  end = len(s) - 1
  return end > 0 && (
    (s[0] == '[' && s[end] == ']') ||
    (s[0] == '{' && s[end] == '}')
  )
}

return str, false
}

var tmp interface{}
err := unmarshal(maxDepth, str, &tmp, processArray)
if err != nil {
Expand Down
63 changes: 63 additions & 0 deletions libbeat/processors/actions/decode_json_fields_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
package actions

import (
"fmt"
"testing"

"github.com/stretchr/testify/assert"
Expand Down Expand Up @@ -199,6 +200,68 @@ func TestTargetRootOption(t *testing.T) {
assert.Equal(t, expected.String(), actual.String())
}

func TestNotJsonObjectOrArray(t *testing.T) {

var cases = []struct {
MaxDepth int
Expected common.MapStr
}{
{
MaxDepth: 1,
Expected: common.MapStr{
"msg": common.MapStr{
"someDate": "2016-09-28T01:40:26.760+0000",
"someNumber": 1475026826760,
"someNumberAsString": "1475026826760",
"someString": "foobar",
"someString2": "2017 is awesome",
"someMap": "{\"a\":\"b\"}",
"someArray": "[1,2,3]",
},
},
},
{
MaxDepth: 10,
Expected: common.MapStr{
"msg": common.MapStr{
"someDate": "2016-09-28T01:40:26.760+0000",
"someNumber": 1475026826760,
"someNumberAsString": "1475026826760",
"someString": "foobar",
"someString2": "2017 is awesome",
"someMap": common.MapStr{"a": "b"},
"someArray": []int{1, 2, 3},
},
},
},
}

for _, testCase := range cases {
t.Run(fmt.Sprintf("TestNotJsonObjectOrArrayDepth-%v", testCase.MaxDepth), func(t *testing.T) {
input := common.MapStr{
"msg": `{
"someDate": "2016-09-28T01:40:26.760+0000",
"someNumberAsString": "1475026826760",
"someNumber": 1475026826760,
"someString": "foobar",
"someString2": "2017 is awesome",
"someMap": "{\"a\":\"b\"}",
"someArray": "[1,2,3]"
}`,
}

testConfig, _ = common.NewConfigFrom(map[string]interface{}{
"fields": fields,
"process_array": true,
"max_depth": testCase.MaxDepth,
})

actual := getActualValue(t, testConfig, input)
assert.Equal(t, testCase.Expected.String(), actual.String())
})
}
}

func getActualValue(t *testing.T, config *common.Config, input common.MapStr) common.MapStr {
logp.TestingSetup()

Expand Down