Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initial key=value parser #426

Merged
merged 3 commits into from
Sep 20, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions cmd/stanza/init_common.go
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ import (

_ "github.com/observiq/stanza/operator/builtin/parser/csv"
_ "github.com/observiq/stanza/operator/builtin/parser/json"
_ "github.com/observiq/stanza/operator/builtin/parser/keyvalue"
_ "github.com/observiq/stanza/operator/builtin/parser/regex"
_ "github.com/observiq/stanza/operator/builtin/parser/severity"
_ "github.com/observiq/stanza/operator/builtin/parser/syslog"
Expand Down
101 changes: 101 additions & 0 deletions docs/operators/key_value_parser.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,101 @@
## `key_value_parser` operator

The `key_value_parser` operator parses the string-type field selected by `parse_from` into key value pairs. All values are of type string.

### Configuration Fields

| Field | Default | Description |
| --- | --- | --- |
| `id` | `key_value_parser` | A unique identifier for the operator |
| `output` | Next in pipeline | The connected operator(s) that will receive all outbound entries |
| `parse_from` | $ | A [field](/docs/types/field.md) that indicates the field to be parsed into key value pairs |
| `parse_to` | $ | A [field](/docs/types/field.md) that indicates the field to be parsed as into key value pairs |
| `preserve_to` | | Preserves the unparsed value at the specified [field](/docs/types/field.md) |
| `on_error` | `send` | The behavior of the operator if it encounters an error. See [on_error](/docs/types/on_error.md) |
| `if` | | An [expression](/docs/types/expression.md) that, when set, will be evaluated to determine whether this operator should be used for the given entry. This allows you to do easy conditional parsing without branching logic with routers. |
| `timestamp` | `nil` | An optional [timestamp](/docs/types/timestamp.md) block which will parse a timestamp field before passing the entry to the output operator |
| `severity` | `nil` | An optional [severity](/docs/types/severity.md) block which will parse a severity field before passing the entry to the output operator |


### Example Configurations


#### Parse the field `message` into key value pairs

Configuration:
```yaml
- type: key_value_parser
parse_from: message
```

<table>
<tr><td> Input record </td> <td> Output record </td></tr>
<tr>
<td>

```json
{
"timestamp": "",
"record": {
"message": "name=stanza"
}
}
```

</td>
<td>

```json
{
"timestamp": "",
"record": {
"name": "stanza"
}
}
```

</td>
</tr>
</table>

#### Parse the field `message` as key value pairs, and parse the timestamp

Configuration:
```yaml
- type: key_value_parser
parse_from: message
timestamp:
parse_from: seconds_since_epoch
layout_type: epoch
layout: s
```

<table>
<tr><td> Input record </td> <td> Output record </td></tr>
<tr>
<td>

```json
{
"timestamp": "",
"record": {
"message": "name=stanza seconds_since_epoch=1136214245"
}
}
```

</td>
<td>

```json
{
"timestamp": "2006-01-02T15:04:05-07:00",
"record": {
"name": "stanza"
}
}
```

</td>
</tr>
</table>
2 changes: 2 additions & 0 deletions go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ require (
github.com/cenkalti/backoff/v4 v4.1.1
github.com/elastic/go-elasticsearch/v7 v7.13.0
github.com/golang/protobuf v1.5.2
github.com/hashicorp/go-multierror v1.1.0
github.com/hashicorp/go-uuid v1.0.2
github.com/jpillora/backoff v1.0.0
github.com/json-iterator/go v1.1.11
Expand Down Expand Up @@ -124,6 +125,7 @@ require (
github.com/gookit/color v1.2.5 // indirect
github.com/gostaticanalysis/analysisutil v0.0.3 // indirect
github.com/grpc-ecosystem/grpc-gateway v1.16.0 // indirect
github.com/hashicorp/errwrap v1.0.0 // indirect
github.com/hashicorp/go-version v1.2.0 // indirect
github.com/hashicorp/hcl v1.0.0 // indirect
github.com/inconshreveable/mousetrap v1.0.0 // indirect
Expand Down
2 changes: 2 additions & 0 deletions go.sum
Original file line number Diff line number Diff line change
Expand Up @@ -785,6 +785,7 @@ github.com/hashicorp/consul/sdk v0.1.1/go.mod h1:VKf9jXwCTEY1QZP2MOLRhb5i/I/ssyN
github.com/hashicorp/consul/sdk v0.3.0/go.mod h1:VKf9jXwCTEY1QZP2MOLRhb5i/I/ssyNV1vwHyQBF0x8=
github.com/hashicorp/consul/sdk v0.6.0/go.mod h1:fY08Y9z5SvJqevyZNy6WWPXiG3KwBPAvlcdx16zZ0fM=
github.com/hashicorp/errwrap v0.0.0-20141028054710-7554cd9344ce/go.mod h1:YH+1FKiLXxHSkmPseP+kNlulaMuP3n2brvKWEqk/Jc4=
github.com/hashicorp/errwrap v1.0.0 h1:hLrqtEDnRye3+sgx6z4qVLNuviH3MR5aQ0ykNJa/UYA=
github.com/hashicorp/errwrap v1.0.0/go.mod h1:YH+1FKiLXxHSkmPseP+kNlulaMuP3n2brvKWEqk/Jc4=
github.com/hashicorp/go-cleanhttp v0.5.0/go.mod h1:JpRdi6/HCYpAwUzNwuwqhbovhLtngrth3wmdIIUrZ80=
github.com/hashicorp/go-cleanhttp v0.5.1/go.mod h1:JpRdi6/HCYpAwUzNwuwqhbovhLtngrth3wmdIIUrZ80=
Expand All @@ -798,6 +799,7 @@ github.com/hashicorp/go-msgpack v0.5.3/go.mod h1:ahLV/dePpqEmjfWmKiqvPkv/twdG7iP
github.com/hashicorp/go-msgpack v0.5.5/go.mod h1:ahLV/dePpqEmjfWmKiqvPkv/twdG7iPBM1vqhUKIvfM=
github.com/hashicorp/go-multierror v0.0.0-20161216184304-ed905158d874/go.mod h1:JMRHfdO9jKNzS/+BTlxCjKNQHg/jZAft8U7LloJvN7I=
github.com/hashicorp/go-multierror v1.0.0/go.mod h1:dHtQlpGsu+cZNNAkkCN/P3hoUDHhCYQXV3UM06sGGrk=
github.com/hashicorp/go-multierror v1.1.0 h1:B9UzwGQJehnUY1yNrnwREHc3fGbC2xefo8g4TbElacI=
github.com/hashicorp/go-multierror v1.1.0/go.mod h1:spPvp8C1qA32ftKqdAHm4hHTbPw+vmowP0z+KUhOZdA=
github.com/hashicorp/go-plugin v1.0.1/go.mod h1:++UyYGoz3o5w9ZzAdZxtQKrWWP+iqPBn3cQptSMzBuY=
github.com/hashicorp/go-retryablehttp v0.5.3/go.mod h1:9B5zBasrRhHXnJnui7y6sL7es7NDiJgTc6Er0maI1Xs=
Expand Down
4 changes: 4 additions & 0 deletions license.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,10 @@ exceptions:
# MPL is approved as long as the source is not modified
- path: "github.com/hashicorp/go-uuid"
licenses: ["MPL-2.0"]
- path: "github.com/hashicorp/errwrap"
licenses: ["MPL-2.0"]
- path: "github.com/hashicorp/go-multierror"
licenses: ["MPL-2.0"]

# ISC
- path: "github.com/davecgh/go-spew"
Expand Down
123 changes: 123 additions & 0 deletions operator/builtin/parser/keyvalue/keyvalue.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,123 @@
package keyvalue

import (
"context"
"fmt"
"strings"

"github.com/observiq/stanza/entry"
"github.com/observiq/stanza/operator"
"github.com/observiq/stanza/operator/helper"

"github.com/hashicorp/go-multierror"
)

func init() {
operator.Register("key_value_parser", func() operator.Builder { return NewKVParserConfig("") })
}

// NewKVParserConfig creates a new key value parser config with default values
func NewKVParserConfig(operatorID string) *KVParserConfig {
return &KVParserConfig{
ParserConfig: helper.NewParserConfig(operatorID, "key_value_parser"),
Delimiter: "=",
}
}

// KVParserConfig is the configuration of a key value parser operator.
type KVParserConfig struct {
helper.ParserConfig `yaml:",inline"`

Delimiter string `json:"delimiter" yaml:"delimiter"`
}

// Build will build a key value parser operator.
func (c KVParserConfig) Build(context operator.BuildContext) ([]operator.Operator, error) {
parserOperator, err := c.ParserConfig.Build(context)
if err != nil {
return nil, err
}

if len(c.Delimiter) == 0 {
return nil, fmt.Errorf("delimiter is a required parameter")
}

kvParser := &KVParser{
ParserOperator: parserOperator,
delimiter: c.Delimiter,
}

return []operator.Operator{kvParser}, nil
}

// KVParser is an operator that parses key value pairs.
type KVParser struct {
helper.ParserOperator
delimiter string
}

// Process will parse an entry for key value pairs.
func (kv *KVParser) Process(ctx context.Context, entry *entry.Entry) error {
return kv.ParserOperator.ProcessWith(ctx, entry, kv.parse)
}

// parse will parse a value as key values.
func (kv *KVParser) parse(value interface{}) (interface{}, error) {
switch m := value.(type) {
case string:
return kv.parser(m, kv.delimiter)
case []byte:
return kv.parser(string(m), kv.delimiter)
default:
return nil, fmt.Errorf("type %T cannot be parsed as key value pairs", value)
}
}

func (kv *KVParser) parser(input string, delimiter string) (map[string]interface{}, error) {
if len(input) == 0 {
return nil, fmt.Errorf("parse from field %s is empty", kv.ParseFrom.String())
}

parsed := make(map[string]interface{})

var err error
for _, raw := range splitStringByWhitespace(input) {
m := strings.Split(raw, delimiter)
if len(m) != 2 {
e := fmt.Errorf("expected '%s' to split by '%s' into two items, got %d", raw, delimiter, len(m))
err = multierror.Append(err, e)
continue
}

key := cleanString(m[0])
value := cleanString(m[1])

// TODO: Check if key already exists and fail if so?
parsed[key] = value
}

return parsed, err
}

// split on whitespace and preserve quoted text
func splitStringByWhitespace(input string) []string {
quoted := false
raw := strings.FieldsFunc(input, func(r rune) bool {
if r == '"' {
quoted = !quoted
}
return !quoted && r == ' '
})
return raw
}

// trim leading and trailing space
func cleanString(input string) string {
if len(input) > 0 && input[0] == '"' {
input = input[1:]
}
if len(input) > 0 && input[len(input)-1] == '"' {
input = input[:len(input)-1]
}
return strings.TrimSpace(input)
}
Loading