Add support to define routing rules in data streams #535

Merged
merged 32 commits into from
Jun 22, 2023
Commits
8b85491
Add other file spec for routing rules
mrodm Jun 12, 2023
513c6a7
Add changelog entry
mrodm Jun 12, 2023
8b01b29
Move routing_rules definition to datastream manifest
mrodm Jun 12, 2023
921b9e8
update test packages
mrodm Jun 13, 2023
a2dc265
Mark feature as technical preview
mrodm Jun 13, 2023
76f5a02
Add comment
mrodm Jun 13, 2023
4c8a7e2
Remove unnecessary file definition
mrodm Jun 13, 2023
5587723
Rename
mrodm Jun 14, 2023
88cb7cb
Add description
mrodm Jun 14, 2023
fae5b7a
Add string and array for dataset and namespace
mrodm Jun 14, 2023
4ee7587
Merge remote upstream/main into add_routing_rules
mrodm Jun 15, 2023
e1c83a5
Move routing_rules as a new file in datastream
mrodm Jun 15, 2023
8cb7bd7
Add license header
mrodm Jun 15, 2023
adcbc33
Merge remote upstream/main into add_routing_rules
mrodm Jun 19, 2023
989b8c1
Add descriptions
mrodm Jun 19, 2023
71ba320
Add comment for JSON patch
mrodm Jun 19, 2023
79fe3ad
Remove routing_rules definition from manifest.spec.yml
mrodm Jun 19, 2023
5165fb6
Merge remote upstream/main into add_routing_rules
mrodm Jun 20, 2023
a07d8ee
Add JSON patch to remove routing_rules
mrodm Jun 20, 2023
6fd616e
Add new test package
mrodm Jun 20, 2023
b5fd187
Change routing rules definition to be an array
mrodm Jun 20, 2023
56671f0
Update validation rule and test packages
mrodm Jun 20, 2023
45141b0
Add comment about technical preview
mrodm Jun 20, 2023
76d7cf8
Merge remote upstream/main into add_routing_rules
mrodm Jun 20, 2023
8eb0781
Update JSON path to read routing rules array
mrodm Jun 20, 2023
35951da
Merge remote upstream/main into add_routing_rules
mrodm Jun 21, 2023
2450b63
Rename dataset fields in routing_rules file
mrodm Jun 21, 2023
f8abd12
Add check if there is no routing rule file
mrodm Jun 21, 2023
6081bf8
Add technical preview annotations
mrodm Jun 21, 2023
420ccba
Apply suggestions from code review
mrodm Jun 21, 2023
4a36c59
Add comment why routing rules is an array
mrodm Jun 21, 2023
3277059
Remove minItems from routingRules
mrodm Jun 21, 2023
5 changes: 5 additions & 0 deletions code/go/internal/validator/semantic/types_test.go
@@ -79,6 +79,11 @@ func TestListFieldsFiles(t *testing.T) {
fullFilePath: "../../../../../test/packages/good_v2/data_stream/pe/fields/some_fields.yml",
dataStream: "pe",
},
fieldFileMetadata{
filePath: "data_stream/routing_rules/fields/base-fields.yml",
fullFilePath: "../../../../../test/packages/good_v2/data_stream/routing_rules/fields/base-fields.yml",
dataStream: "routing_rules",
},
fieldFileMetadata{
filePath: "data_stream/skipped_tests/fields/base-fields.yml",
fullFilePath: "../../../../../test/packages/good_v2/data_stream/skipped_tests/fields/base-fields.yml",
@@ -0,0 +1,90 @@
// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

package semantic

import (
	"fmt"
	"io/fs"
	"path"

	"gopkg.in/yaml.v3"

	ve "github.com/elastic/package-spec/v2/code/go/internal/errors"
	"github.com/elastic/package-spec/v2/code/go/internal/fspath"
	"github.com/elastic/package-spec/v2/code/go/internal/pkgpath"
)

// ValidateRoutingRulesAndDataset returns validation errors if routing rules are defined in any data stream
// but that data stream does not define the "dataset" field.
func ValidateRoutingRulesAndDataset(fsys fspath.FS) ve.ValidationErrors {
	dataStreams, err := listDataStreams(fsys)
	if err != nil {
		return ve.ValidationErrors{err}
	}

	var errs ve.ValidationErrors
	for _, dataStream := range dataStreams {
		anyRoutingRules, err := anyRoutingRulesInDataStream(fsys, dataStream)
		if err != nil {
			// Report the failure instead of silently discarding it.
			errs.Append(ve.ValidationErrors{fmt.Errorf("failed to check routing rules in data stream %q: %w", dataStream, err)})
			continue
		}
		if !anyRoutingRules {
			continue
		}
		err = validateDatasetInDataStream(fsys, dataStream)
		if err != nil {
			errs.Append(ve.ValidationErrors{fmt.Errorf("routing rules defined in data stream %q but dataset field is missing: %w", dataStream, err)})
		}
	}
	return errs
}

// validateDatasetInDataStream returns an error if the data stream manifest cannot be read or parsed,
// or if it does not define a non-empty "dataset" field.
func validateDatasetInDataStream(fsys fspath.FS, dataStream string) error {
	manifestPath := path.Join("data_stream", dataStream, "manifest.yml")
	d, err := fs.ReadFile(fsys, manifestPath)
	if err != nil {
		return fmt.Errorf("failed to read data stream manifest in %q: %w", fsys.Path(manifestPath), err)
	}

	var manifest struct {
		Dataset string `yaml:"dataset,omitempty"`
	}
	err = yaml.Unmarshal(d, &manifest)
	if err != nil {
		return fmt.Errorf("failed to parse data stream manifest in %q: %w", fsys.Path(manifestPath), err)
	}

	if manifest.Dataset == "" {
		return fmt.Errorf("dataset field is required in data stream %q", dataStream)
	}
	return nil
}

// anyRoutingRulesInDataStream reports whether the data stream has a routing_rules.yml file
// with at least one entry.
func anyRoutingRulesInDataStream(fsys fspath.FS, dataStream string) (bool, error) {
	routingRulesPath := path.Join("data_stream", dataStream, "routing_rules.yml")
	f, err := pkgpath.Files(fsys, routingRulesPath)
	if err != nil {
		// A lookup failure is treated as "no routing rules": the file is optional (required: false in the spec).
		return false, nil
	}

	if len(f) == 0 {
		return false, nil
	}

	if len(f) != 1 {
		return false, fmt.Errorf("single routing rules expected")
	}

	vals, err := f[0].Values("$[*]")
	if err != nil {
		return false, fmt.Errorf("can't read routing_rules: %w", err)
	}

	rules, ok := vals.([]interface{})
	if !ok {
		return false, fmt.Errorf("routing rules conversion error")
	}
	if len(rules) > 0 {
		return true, nil
	}
	return false, nil
}
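The rule enforced by this new file is small: a data stream that has routing rules must also declare a dataset. The following is a minimal sketch of that decision only, not the code from this PR. It assumes the manifest and the routing rules file have already been parsed into a hypothetical `dataStream` struct (not a package-spec type), so it needs neither a YAML library nor file system access.

```go
package main

import "fmt"

// dataStream is a hypothetical, already-parsed view of one data stream folder.
// It is an assumption made for this sketch, not a type from package-spec.
type dataStream struct {
	Name         string
	Dataset      string // "dataset" from manifest.yml; empty when the field is absent
	RoutingRules int    // number of entries in routing_rules.yml; 0 when the file is missing or empty
}

// checkRoutingRules reports one error per data stream that defines
// routing rules without also defining a dataset.
func checkRoutingRules(streams []dataStream) []error {
	var errs []error
	for _, ds := range streams {
		if ds.RoutingRules == 0 {
			continue // nothing to validate for this data stream
		}
		if ds.Dataset == "" {
			errs = append(errs, fmt.Errorf("routing rules defined in data stream %q but dataset field is missing", ds.Name))
		}
	}
	return errs
}

func main() {
	errs := checkRoutingRules([]dataStream{
		{Name: "access", Dataset: "nginx.access", RoutingRules: 2},
		{Name: "rules", RoutingRules: 1},
		{Name: "plain"},
	})
	for _, err := range errs {
		fmt.Println(err)
	}
}
```

Only the `rules` entry is reported here: it is the one data stream with routing rules and no dataset.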
1 change: 1 addition & 0 deletions code/go/internal/validator/spec.go
@@ -131,6 +131,7 @@ func (s Spec) rules(pkgType string, rootSpec spectypes.ItemSpec) validationRules
{fn: semantic.ValidateILMPolicyPresent, since: semver.MustParse("2.0.0"), types: []string{"integration"}},
{fn: semantic.ValidateProfilingNonGA, types: []string{"integration"}},
{fn: semantic.ValidateKibanaObjectIDs, types: []string{"integration"}},
{fn: semantic.ValidateRoutingRulesAndDataset, types: []string{"integration"}, since: semver.MustParse("2.9.0")},
Review comment (Contributor Author):

Added the since constraint here to avoid checking for routing rules before 2.9.0.

A JSON Patch has also been added to remove the routing_rules.yml file definition before 2.9.0.

}

var validationRules validationRules
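The `since` field gates a semantic rule on the spec version of the package being validated. As a rough illustration of that gating, here is a sketch under stated assumptions: the `rule` and `version` types and the `applicable` helper are hypothetical, and the hand-written parser only handles plain `major.minor.patch` strings, whereas the diff above uses `semver.MustParse`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// version is a simplified major.minor.patch triple; pre-release tags are not supported in this sketch.
type version [3]int

// parseVersion parses strings such as "2.9.0".
func parseVersion(s string) (version, error) {
	var v version
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return v, fmt.Errorf("invalid version %q", s)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return v, fmt.Errorf("invalid version %q: %w", s, err)
		}
		v[i] = n
	}
	return v, nil
}

// less reports whether v sorts before other.
func (v version) less(other version) bool {
	for i := range v {
		if v[i] != other[i] {
			return v[i] < other[i]
		}
	}
	return false
}

// rule is a hypothetical stand-in for one entry of the validation rules list.
type rule struct {
	name  string
	since *version // nil means the rule applies to every spec version
}

// applicable returns the names of the rules that should run for specVersion.
func applicable(rules []rule, specVersion version) []string {
	var names []string
	for _, r := range rules {
		if r.since != nil && specVersion.less(*r.since) {
			continue // rule was introduced in a later spec version
		}
		names = append(names, r.name)
	}
	return names
}

func main() {
	since, _ := parseVersion("2.9.0")
	rules := []rule{
		{name: "ValidateKibanaObjectIDs"},
		{name: "ValidateRoutingRulesAndDataset", since: &since},
	}
	for _, s := range []string{"2.8.1", "2.9.0"} {
		v, _ := parseVersion(s)
		fmt.Println(s, applicable(rules, v))
	}
}
```

With this shape, a package declaring spec version 2.8.1 skips the routing rules check, while one declaring 2.9.0 runs it.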
32 changes: 32 additions & 0 deletions code/go/pkg/validator/validator_test.go
@@ -578,6 +578,38 @@ func TestValidateExternalFieldsWithoutDevFolder(t *testing.T) {
}
})
}
}

func TestValidateRoutingRules(t *testing.T) {
	tests := map[string][]string{
		"good":    []string{},
		"good_v2": []string{},
		"bad_routing_rules": []string{
			`routing rules defined in data stream "rules" but dataset field is missing: dataset field is required in data stream "rules"`,
		},
		"bad_routing_rules_wrong_spec": []string{
			`item [routing_rules.yml] is not allowed in folder [../../../../test/packages/bad_routing_rules_wrong_spec/data_stream/rules]`,
		},
	}

	for pkgName, expectedErrorMessages := range tests {
		t.Run(pkgName, func(t *testing.T) {
			err := ValidateFromPath(path.Join("..", "..", "..", "..", "test", "packages", pkgName))
			if len(expectedErrorMessages) == 0 {
				assert.NoError(t, err)
				return
			}
			assert.Error(t, err)

			errs, ok := err.(errors.ValidationErrors)
			require.True(t, ok)
			assert.Len(t, errs, len(expectedErrorMessages))

			for _, foundError := range errs {
				require.Contains(t, expectedErrorMessages, foundError.Error())
			}
		})
	}

}

3 changes: 3 additions & 0 deletions spec/changelog.yml
@@ -25,6 +25,9 @@
- description: Add ability to specify secret vars
type: enhancement
link: https://github.com/elastic/package-spec/pull/339
- description: Add support to define routing rules in data streams (technical preview)
type: enhancement
link: https://github.com/elastic/package-spec/pull/535
- version: 2.8.1
changes:
- description: Add validation for data types of metrics
76 changes: 76 additions & 0 deletions spec/integration/data_stream/routing_rules.spec.yml
@@ -0,0 +1,76 @@
##
## Describes the specification for a routing rules yml file
##
spec:
# Everything under here follows JSON schema (https://json-schema.org/), written as YAML for readability
definitions:
routing_rule:
description: Routing rule definition (technical preview)
type: object
properties:
target_dataset:
description: >
Field references or a static value for the dataset part of the data stream name.
In addition to the criteria for index names, cannot contain - and must be no longer than 100 characters.
Example values are nginx.access and nginx.error.

Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces).
When resolving field references, the processor replaces invalid characters with _.
Uses the <dataset> part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.
anyOf:
- type: string
- type: array
items:
type: string
examples:
- nginx.error
- nginx
if:
description: Conditionally execute the processor
type: string
examples:
- "ctx?.file?.path?.contains('/var/log/nginx/error')"
- "ctx?.container?.image?.name == 'nginx'"
namespace:
description: >
Field references or a static value for the namespace part of the data stream name.
See the criteria for index names for allowed characters. Must be no longer than 100 characters.

Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces).
When resolving field references, the processor replaces invalid characters with _.
Uses the <namespace> part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.
anyOf:
- type: string
- type: array
items:
type: string
items:
type: string
examples:
- default
- "{{ labels.data_stream.namespace }}"
required:
- target_dataset
- if
- namespace
# This is an array rather than an object because using the source dataset as the key would require supporting keys with dots.
# Keys with dots are expanded here: https://github.com/elastic/package-spec/blob/66abf8992f3ab7e9dd0b833e4ab9b43fc8b16471/code/go/internal/yamlschema/loader.go#L92
type: array
description: Routing rules set.
items:
type: object
additionalProperties: false
properties:
source_dataset:
description: >
Source dataset to be used by this reroute processor.
If applicable, documents from this dataset will be routed according to the rules defined.
type: string
rules:
description: List of routing rules
type: array
Review comment (Member):

Nit: maybe add a comment here mentioning that this is not an object because using the source dataset as the key would require supporting keys with dots.

Reply (Contributor Author):

Added; better to keep it as a comment.

items:
$ref: "#/definitions/routing_rule"
mrodm marked this conversation as resolved.
required:
- source_dataset
- rules
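To make the shape of this schema concrete, the sketch below mirrors it with Go structs and checks the required fields plus the limits that the descriptions state in prose (no `-` in a static dataset, at most 100 characters). All type and function names are hypothetical and not part of package-spec. The schema itself does not enforce the length and character limits, and values containing `{{` are assumed to be field references that are resolved at ingest time and therefore skipped.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// routingRule mirrors the routing_rule definition. The schema allows
// target_dataset and namespace to be a string or a list of strings;
// this sketch normalizes both to slices.
type routingRule struct {
	TargetDataset []string
	If            string
	Namespace     []string
}

// ruleSet mirrors one item of the top-level array.
type ruleSet struct {
	SourceDataset string
	Rules         []routingRule
}

// isFieldReference reports whether s uses the mustache-like {{...}} syntax.
func isFieldReference(s string) bool {
	return strings.Contains(s, "{{")
}

// validate checks required fields and the limits stated in the descriptions.
func validate(sets []ruleSet) []error {
	var errs []error
	for i, set := range sets {
		if set.SourceDataset == "" {
			errs = append(errs, fmt.Errorf("item %d: source_dataset is required", i))
		}
		for j, r := range set.Rules {
			where := fmt.Sprintf("item %d, rule %d", i, j)
			if len(r.TargetDataset) == 0 {
				errs = append(errs, fmt.Errorf("%s: target_dataset is required", where))
			}
			if r.If == "" {
				errs = append(errs, fmt.Errorf("%s: if is required", where))
			}
			if len(r.Namespace) == 0 {
				errs = append(errs, fmt.Errorf("%s: namespace is required", where))
			}
			for _, d := range r.TargetDataset {
				if isFieldReference(d) {
					continue // assumed to be resolved at ingest time, not checked here
				}
				if strings.Contains(d, "-") || utf8.RuneCountInString(d) > 100 {
					errs = append(errs, fmt.Errorf("%s: invalid target_dataset %q", where, d))
				}
			}
			for _, n := range r.Namespace {
				if !isFieldReference(n) && utf8.RuneCountInString(n) > 100 {
					errs = append(errs, fmt.Errorf("%s: namespace %q is too long", where, n))
				}
			}
		}
	}
	return errs
}

func main() {
	sets := []ruleSet{{
		SourceDataset: "nginx",
		Rules: []routingRule{
			{TargetDataset: []string{"nginx.error"}, If: "ctx?.file?.path?.contains('/var/log/nginx/error')", Namespace: []string{"default"}},
			{TargetDataset: []string{"nginx-access"}, If: "ctx?.container?.image?.name == 'nginx'", Namespace: []string{"{{ labels.data_stream.namespace }}"}},
		},
	}}
	for _, err := range validate(sets) {
		fmt.Println(err)
	}
}
```

The second rule in `main` is rejected because its static dataset contains `-`. The sketch cannot distinguish a missing `rules` key from an empty list, so the schema's `required: rules` constraint is not modeled.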
13 changes: 13 additions & 0 deletions spec/integration/data_stream/spec.yml
@@ -78,3 +78,16 @@ spec:
required: false
visibility: private
$ref: "./_dev/spec.yml"
- description: File containing routing rules definitions (technical preview)
type: file
contentMediaType: "application/x-yaml"
name: "routing_rules.yml"
required: false
$ref: "./routing_rules.spec.yml"

# TODO add JSON patch to remove routing_rules.yml entry
versions:
- before: 2.9.0
patch:
- op: remove
path: "/contents/0/contents/7" # remove routing_rules file definition
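The `versions` block applies a JSON Patch `remove` operation, addressed by a JSON Pointer, so that spec versions before 2.9.0 do not know about `routing_rules.yml`. The sketch below shows how such a removal can work on a decoded document. It is an illustration only: `removeAt` and `removeTokens` are hypothetical helpers, not the patching code package-spec uses, the document is a much smaller stand-in for the real folder spec, and JSON Pointer escapes (`~0`, `~1`) are not handled.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// removeAt deletes the value addressed by a JSON Pointer such as
// "/contents/0/contents/7" and returns the updated document.
// doc is expected to consist of map[string]any and []any values,
// as produced by decoding JSON or YAML into any.
func removeAt(doc any, pointer string) (any, error) {
	if !strings.HasPrefix(pointer, "/") {
		return nil, fmt.Errorf("invalid pointer %q", pointer)
	}
	return removeTokens(doc, strings.Split(pointer[1:], "/"))
}

// removeTokens walks the document one pointer token at a time and
// removes the element named by the final token.
func removeTokens(node any, tokens []string) (any, error) {
	key, last := tokens[0], len(tokens) == 1
	switch n := node.(type) {
	case map[string]any:
		child, ok := n[key]
		if !ok {
			return nil, fmt.Errorf("key %q not found", key)
		}
		if last {
			delete(n, key)
			return n, nil
		}
		updated, err := removeTokens(child, tokens[1:])
		if err != nil {
			return nil, err
		}
		n[key] = updated
		return n, nil
	case []any:
		i, err := strconv.Atoi(key)
		if err != nil || i < 0 || i >= len(n) {
			return nil, fmt.Errorf("invalid array index %q", key)
		}
		if last {
			// Build a new slice so the original backing array is not shifted in place.
			out := make([]any, 0, len(n)-1)
			out = append(out, n[:i]...)
			return append(out, n[i+1:]...), nil
		}
		updated, err := removeTokens(n[i], tokens[1:])
		if err != nil {
			return nil, err
		}
		n[i] = updated
		return n, nil
	default:
		return nil, fmt.Errorf("cannot descend into %T at %q", node, key)
	}
}

func main() {
	// Hypothetical, much smaller folder spec than the real spec.yml.
	doc := map[string]any{
		"contents": []any{
			map[string]any{
				"name": "data_stream",
				"contents": []any{
					map[string]any{"name": "manifest.yml"},
					map[string]any{"name": "routing_rules.yml"},
				},
			},
		},
	}
	patched, err := removeAt(doc, "/contents/0/contents/1")
	if err != nil {
		panic(err)
	}
	fmt.Println(patched)
}
```

Because the pointer addresses array elements by index, a path such as `/contents/0/contents/7` depends on the order of entries in the spec file and has to be updated whenever entries before it are added, removed, or reordered.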
93 changes: 93 additions & 0 deletions test/packages/bad_routing_rules/LICENSE.txt
@@ -0,0 +1,93 @@
Elastic License 2.0

URL: https://www.elastic.co/licensing/elastic-license

## Acceptance

By using the software, you agree to all of the terms and conditions below.

## Copyright License

The licensor grants you a non-exclusive, royalty-free, worldwide,
non-sublicensable, non-transferable license to use, copy, distribute, make
available, and prepare derivative works of the software, in each case subject to
the limitations and conditions below.

## Limitations

You may not provide the software to third parties as a hosted or managed
service, where the service provides users with access to any substantial set of
the features or functionality of the software.

You may not move, change, disable, or circumvent the license key functionality
in the software, and you may not remove or obscure any functionality in the
software that is protected by the license key.

You may not alter, remove, or obscure any licensing, copyright, or other notices
of the licensor in the software. Any use of the licensor’s trademarks is subject
to applicable law.

## Patents

The licensor grants you a license, under any patent claims the licensor can
license, or becomes able to license, to make, have made, use, sell, offer for
sale, import and have imported the software, in each case subject to the
limitations and conditions in this license. This license does not cover any
patent claims that you cause to be infringed by modifications or additions to
the software. If you or your company make any written claim that the software
infringes or contributes to infringement of any patent, your patent license for
the software granted under these terms ends immediately. If your company makes
such a claim, your patent license ends immediately for work on behalf of your
company.

## Notices

You must ensure that anyone who gets a copy of any part of the software from you
also gets a copy of these terms.

If you modify the software, you must include in any modified copies of the
software prominent notices stating that you have modified the software.

## No Other Rights

These terms do not imply any licenses other than those expressly granted in
these terms.

## Termination

If you use the software in violation of these terms, such use is not licensed,
and your licenses will automatically terminate. If the licensor provides you
with a notice of your violation, and you cease all violation of this license no
later than 30 days after you receive that notice, your licenses will be
reinstated retroactively. However, if you violate these terms after such
reinstatement, any additional violation of these terms will cause your licenses
to terminate automatically and permanently.

## No Liability

*As far as the law allows, the software comes as is, without any warranty or
condition, and the licensor will not be liable to you for any damages arising
out of these terms or the use or nature of the software, under any kind of
legal claim.*

## Definitions

The **licensor** is the entity offering these terms, and the **software** is the
software the licensor makes available under these terms, including any portion
of it.

**you** refers to the individual or entity agreeing to these terms.

**your company** is any legal entity, sole proprietorship, or other kind of
organization that you work for, plus all organizations that have control over,
are under the control of, or are under common control with that
organization. **control** means ownership of substantially all the assets of an
entity, or the power to direct its management and policies by vote, contract, or
otherwise. Control can be direct or indirect.

**your licenses** are all the licenses granted to you for the software under
these terms.

**use** means anything you do with the software requiring one of your licenses.

**trademark** means trademarks, service marks, and similar rights.
6 changes: 6 additions & 0 deletions test/packages/bad_routing_rules/changelog.yml
@@ -0,0 +1,6 @@
# newer versions go on top
- version: "0.0.1"
changes:
- description: Initial draft of the package
type: enhancement
link: https://github.com/elastic/integrations/pull/1 # FIXME Replace with the real PR link
@@ -0,0 +1,7 @@
paths:
{{#each paths as |path i|}}
- {{path}}
{{/each}}
exclude_files: [".gz$"]
processors:
- add_locale: ~
@@ -0,0 +1,10 @@
---
description: Pipeline for processing sample logs
processors:
- set:
field: sample_field
value: "1"
on_failure:
- set:
field: error.message
value: '{{ _ingest.on_failure_message }}'
@@ -0,0 +1,12 @@
- name: data_stream.type
type: constant_keyword
description: Data stream type.
- name: data_stream.dataset
type: constant_keyword
description: Data stream dataset.
- name: data_stream.namespace
type: constant_keyword
description: Data stream namespace.
- name: '@timestamp'
type: date
description: Event timestamp.