Stage 2 changes for RFC 0009 - data_stream fields #1307

Merged (8 commits) on Apr 19, 2021
1 change: 1 addition & 0 deletions CHANGELOG.next.md
@@ -16,6 +16,7 @@ Thanks, you're awesome :-) -->

#### Added

* Add `data_stream` fieldset. #1307
* Add `orchestrator` fieldset as beta fields. #1326
* Extend `threat.*` experimental fields with proposed changes from RFC 0018. #1344, #1351

67 changes: 67 additions & 0 deletions code/go/ecs/data_stream.go

Some generated files are not rendered by default.

88 changes: 88 additions & 0 deletions docs/field-details.asciidoc
@@ -1017,6 +1017,94 @@ example: `docker`

|=====

[[ecs-data_stream]]
=== Data Stream Fields

The data_stream fields take part in defining the new data stream naming scheme.

In the new data stream naming scheme the value of the data stream fields combine to the name of the actual data stream in the following manner: `{data_stream.type}-{data_stream.dataset}-{data_stream.namespace}`. This means the fields can only contain characters that are valid as part of names of data streams. More details about this can be found in this https://www.elastic.co/blog/an-introduction-to-the-elastic-data-stream-naming-scheme[blog post].

An Elasticsearch data stream consists of one or more backing indices, and a data stream name forms part of the backing indices names. Due to this convention, data streams must also follow index naming restrictions. For example, data stream names cannot include `\`, `/`, `*`, `?`, `"`, `<`, `>`, `|`, ` ` (space character), `,`, or `#`. Please see the Elasticsearch reference for additional https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#indices-create-api-path-params[restrictions].

beta::[ These fields are in beta and are subject to change.]

[discrete]
==== Data Stream Field Details

[options="header"]
|=====
| Field | Description | Level

// ===============================================================

|
[[field-data-stream-dataset]]
<<field-data-stream-dataset, data_stream.dataset>>

| The field can contain anything that makes sense to signify the source of the data.

Examples include `nginx.access`, `prometheus`, `endpoint` etc. For data streams that otherwise fit, but that do not have dataset set we use the value "generic" for the dataset value. `event.dataset` should have the same value as `data_stream.dataset`.

Beyond the Elasticsearch data stream naming criteria noted above, the `dataset` value has additional restrictions:

* Must not contain `-`

* No longer than 100 characters

type: constant_keyword



example: `nginx.access`

| extended

// ===============================================================

|
[[field-data-stream-namespace]]
<<field-data-stream-namespace, data_stream.namespace>>

| A user defined namespace. Namespaces are useful to allow grouping of data.

Many users already organize their indices this way, and the data stream naming scheme now provides this best practice as a default. Many users will populate this field with `default`. If no value is used, it falls back to `default`.

Beyond the Elasticsearch index naming criteria noted above, `namespace` value has the additional restrictions:

* Must not contain `-`

* No longer than 100 characters

type: constant_keyword



example: `production`

| extended

// ===============================================================

|
[[field-data-stream-type]]
<<field-data-stream-type, data_stream.type>>

| An overarching type for the data stream.

Currently allowed values are "logs" and "metrics". We expect to also add "traces" and "synthetics" in the near future.

type: constant_keyword



example: `logs`

| extended

// ===============================================================

|=====

[[ecs-destination]]
=== Destination Fields
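
The naming rules in the Data Stream section above can be sketched as a small validator. This is a minimal illustration, not part of ECS or any Elastic library: the function and constant names are hypothetical. It checks only the restrictions quoted in the field descriptions (the listed forbidden characters, no `-` in `dataset` or `namespace`, the 100-character limit, and the two currently allowed `type` values). It does not cover the additional index naming restrictions in the Elasticsearch reference.

```python
# Hypothetical helper names; only the rules themselves come from the field descriptions.
FORBIDDEN_CHARS = set('\\/*?"<>| ,#')  # characters the description lists as not allowed
ALLOWED_TYPES = {"logs", "metrics"}    # "currently allowed" values for data_stream.type
MAX_LENGTH = 100                       # limit stated for dataset and namespace


def check_part(label, value):
    """Return a list of rule violations for a dataset or namespace value."""
    problems = []
    bad = sorted(FORBIDDEN_CHARS.intersection(value))
    if bad:
        problems.append(f"{label} contains forbidden characters: {bad}")
    if "-" in value:
        problems.append(f"{label} must not contain '-'")
    if len(value) > MAX_LENGTH:
        problems.append(f"{label} is longer than {MAX_LENGTH} characters")
    return problems


def data_stream_name(type_, dataset=None, namespace=None):
    """Build `{type}-{dataset}-{namespace}`, applying the documented fallbacks."""
    dataset = dataset or "generic"      # fallback described for data_stream.dataset
    namespace = namespace or "default"  # fallback described for data_stream.namespace
    problems = []
    if type_ not in ALLOWED_TYPES:
        problems.append(f"type must be one of {sorted(ALLOWED_TYPES)}")
    problems += check_part("dataset", dataset)
    problems += check_part("namespace", namespace)
    if problems:
        raise ValueError("; ".join(problems))
    return f"{type_}-{dataset}-{namespace}"
```

For example, `data_stream_name("logs", "nginx.access", "production")` yields `logs-nginx.access-production`, while a dataset such as `nginx-access` is rejected because of the hyphen.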

2 changes: 2 additions & 0 deletions docs/fields.asciidoc
@@ -32,6 +32,8 @@ all fields are defined.

| <<ecs-container,Container>> | Fields describing the container that generated this event.

| <<ecs-data_stream,Data Stream>> | The data_stream fields take part in defining the new data stream naming scheme.

| <<ecs-destination,Destination>> | Fields about the destination side of a network connection, used with source.

| <<ecs-dll,DLL>> | These fields contain information about code libraries dynamically loaded into processes.
6 changes: 3 additions & 3 deletions experimental/generated/beats/fields.ecs.yml
@@ -632,16 +632,16 @@
 naming scheme.

 In the new data stream naming scheme the value of the data stream fields combine
-to the name of the actual data stream in the following manner `{data_stream.type}-{data_stream.dataset}-{data_stream.namespace}`.
+to the name of the actual data stream in the following manner: `{data_stream.type}-{data_stream.dataset}-{data_stream.namespace}`.
 This means the fields can only contain characters that are valid as part of
 names of data streams. More details about this can be found in this https://www.elastic.co/blog/an-introduction-to-the-elastic-data-stream-naming-scheme[blog
 post].

 An Elasticsearch data stream consists of one or more backing indices, and a
 data stream name forms part of the backing indices names. Due to this convention,
 data streams must also follow index naming restrictions. For example, data stream
-names cannot include \, /, *, ?, ", <, >, |, ` `. Please see the Elasticsearch
-reference for additional https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#indices-create-api-path-params[restrictions].'
+names cannot include `\`, `/`, `*`, `?`, `"`, `<`, `>`, `|`, ` ` (space character),
+`,`, or `#`. Please see the Elasticsearch reference for additional https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#indices-create-api-path-params[restrictions].'
 type: group
 fields:
 - name: dataset
7 changes: 4 additions & 3 deletions experimental/generated/ecs/ecs_nested.yml
@@ -1057,20 +1057,21 @@ container:
 title: Container
 type: group
 data_stream:
+beta: These fields are in beta and are subject to change.
 description: 'The data_stream fields take part in defining the new data stream naming
 scheme.

 In the new data stream naming scheme the value of the data stream fields combine
-to the name of the actual data stream in the following manner `{data_stream.type}-{data_stream.dataset}-{data_stream.namespace}`.
+to the name of the actual data stream in the following manner: `{data_stream.type}-{data_stream.dataset}-{data_stream.namespace}`.
 This means the fields can only contain characters that are valid as part of names
 of data streams. More details about this can be found in this https://www.elastic.co/blog/an-introduction-to-the-elastic-data-stream-naming-scheme[blog
 post].

 An Elasticsearch data stream consists of one or more backing indices, and a data
 stream name forms part of the backing indices names. Due to this convention, data
 streams must also follow index naming restrictions. For example, data stream names
-cannot include \, /, *, ?, ", <, >, |, ` `. Please see the Elasticsearch reference
-for additional https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#indices-create-api-path-params[restrictions].'
+cannot include `\`, `/`, `*`, `?`, `"`, `<`, `>`, `|`, ` ` (space character),
+`,`, or `#`. Please see the Elasticsearch reference for additional https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#indices-create-api-path-params[restrictions].'
 fields:
 data_stream.dataset:
 dashed_name: data-stream-dataset
4 changes: 2 additions & 2 deletions experimental/generated/elasticsearch/template.json
@@ -9,6 +9,7 @@
 "ecs_2.0.0-dev-exp_client",
 "ecs_2.0.0-dev-exp_cloud",
 "ecs_2.0.0-dev-exp_container",
+"ecs_2.0.0-dev-exp_data_stream",
 "ecs_2.0.0-dev-exp_destination",
 "ecs_2.0.0-dev-exp_dll",
 "ecs_2.0.0-dev-exp_dns",
@@ -38,8 +39,7 @@
 "ecs_2.0.0-dev-exp_url",
 "ecs_2.0.0-dev-exp_user",
 "ecs_2.0.0-dev-exp_user_agent",
-"ecs_2.0.0-dev-exp_vulnerability",
-"ecs_2.0.0-dev-exp_data_stream"
+"ecs_2.0.0-dev-exp_vulnerability"
 ],
 "index_patterns": [
 "try-ecs-*"
52 changes: 52 additions & 0 deletions generated/beats/fields.ecs.yml
@@ -634,6 +634,58 @@
ignore_above: 1024
description: Runtime managing this container.
example: docker
- name: data_stream
title: Data Stream
group: 2
description: 'The data_stream fields take part in defining the new data stream
naming scheme.

In the new data stream naming scheme the value of the data stream fields combine
to the name of the actual data stream in the following manner: `{data_stream.type}-{data_stream.dataset}-{data_stream.namespace}`.
This means the fields can only contain characters that are valid as part of
names of data streams. More details about this can be found in this https://www.elastic.co/blog/an-introduction-to-the-elastic-data-stream-naming-scheme[blog
post].

An Elasticsearch data stream consists of one or more backing indices, and a
data stream name forms part of the backing indices names. Due to this convention,
data streams must also follow index naming restrictions. For example, data stream
names cannot include `\`, `/`, `*`, `?`, `"`, `<`, `>`, `|`, ` ` (space character),
`,`, or `#`. Please see the Elasticsearch reference for additional https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#indices-create-api-path-params[restrictions].'
type: group
fields:
- name: dataset
level: extended
type: constant_keyword
description: "The field can contain anything that makes sense to signify the\
\ source of the data.\nExamples include `nginx.access`, `prometheus`, `endpoint`\
\ etc. For data streams that otherwise fit, but that do not have dataset set\
\ we use the value \"generic\" for the dataset value. `event.dataset` should\
\ have the same value as `data_stream.dataset`.\nBeyond the Elasticsearch\
\ data stream naming criteria noted above, the `dataset` value has additional\
\ restrictions:\n * Must not contain `-`\n * No longer than 100 characters"
example: nginx.access
default_field: false
- name: namespace
level: extended
type: constant_keyword
description: "A user defined namespace. Namespaces are useful to allow grouping\
\ of data.\nMany users already organize their indices this way, and the data\
\ stream naming scheme now provides this best practice as a default. Many\
\ users will populate this field with `default`. If no value is used, it falls\
\ back to `default`.\nBeyond the Elasticsearch index naming criteria noted\
\ above, `namespace` value has the additional restrictions:\n * Must not\
\ contain `-`\n * No longer than 100 characters"
example: production
default_field: false
- name: type
level: extended
type: constant_keyword
description: 'An overarching type for the data stream.

Currently allowed values are "logs" and "metrics". We expect to also add "traces"
and "synthetics" in the near future.'
example: logs
default_field: false
- name: destination
title: Destination
group: 2
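
Because `dataset` and `namespace` must not contain `-`, and neither of the currently allowed `type` values contains one, a name built as `{data_stream.type}-{data_stream.dataset}-{data_stream.namespace}` splits back into exactly three parts. The sketch below illustrates that observation. The function name is hypothetical, and this is a reading of the rules quoted above rather than a guarantee documented in the field descriptions.

```python
def split_data_stream_name(name):
    """Split a data stream name into its three data_stream.* field values.

    Hypothetical helper: relies on the rule that dataset and namespace
    contain no '-', so '-' can only appear as a separator.
    """
    parts = name.split("-")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected '<type>-<dataset>-<namespace>', got {name!r}")
    type_, dataset, namespace = parts
    return {
        "data_stream.type": type_,
        "data_stream.dataset": dataset,
        "data_stream.namespace": namespace,
    }
```

A name with a hyphenated dataset, such as `logs-nginx-access-production`, produces four parts and is rejected instead of being split ambiguously.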
3 changes: 3 additions & 0 deletions generated/csv/fields.csv
@@ -64,6 +64,9 @@ ECS_Version,Indexed,Field_Set,Field,Type,Level,Normalization,Example,Description
2.0.0-dev,true,container,container.labels,object,extended,,,Image labels.
2.0.0-dev,true,container,container.name,keyword,extended,,,Container name.
2.0.0-dev,true,container,container.runtime,keyword,extended,,docker,Runtime managing this container.
2.0.0-dev,true,data_stream,data_stream.dataset,constant_keyword,extended,,nginx.access,The field can contain anything that makes sense to signify the source of the data.
2.0.0-dev,true,data_stream,data_stream.namespace,constant_keyword,extended,,production,A user defined namespace. Namespaces are useful to allow grouping of data.
2.0.0-dev,true,data_stream,data_stream.type,constant_keyword,extended,,logs,An overarching type for the data stream.
2.0.0-dev,true,destination,destination.address,keyword,extended,,,Destination network address.
2.0.0-dev,true,destination,destination.as.number,long,extended,,15169,Unique number allocated to the autonomous system.
2.0.0-dev,true,destination,destination.as.organization.name,keyword,extended,,Google LLC,Organization name.
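
As a quick way to inspect the new rows, the generated CSV can be filtered by its `Field_Set` column. The sketch below uses only Python's standard `csv` module. The header and rows are copied from the hunk above, and the helper name `fields_in_set` is hypothetical.

```python
import csv
import io

# Header and a few rows copied from generated/csv/fields.csv as shown in the diff.
CSV_TEXT = """\
ECS_Version,Indexed,Field_Set,Field,Type,Level,Normalization,Example,Description
2.0.0-dev,true,container,container.runtime,keyword,extended,,docker,Runtime managing this container.
2.0.0-dev,true,data_stream,data_stream.dataset,constant_keyword,extended,,nginx.access,The field can contain anything that makes sense to signify the source of the data.
2.0.0-dev,true,data_stream,data_stream.namespace,constant_keyword,extended,,production,A user defined namespace. Namespaces are useful to allow grouping of data.
2.0.0-dev,true,data_stream,data_stream.type,constant_keyword,extended,,logs,An overarching type for the data stream.
"""


def fields_in_set(csv_text, field_set):
    """Map field name to type for every row belonging to one field set."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["Field"]: row["Type"] for row in reader if row["Field_Set"] == field_set}


data_stream_fields = fields_in_set(CSV_TEXT, "data_stream")
```

With the rows above, `data_stream_fields` holds the three `data_stream.*` fields, each mapped to `constant_keyword`.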
46 changes: 46 additions & 0 deletions generated/ecs/ecs_flat.yml
@@ -756,6 +756,52 @@ container.runtime:
normalize: []
short: Runtime managing this container.
type: keyword
data_stream.dataset:
dashed_name: data-stream-dataset
description: "The field can contain anything that makes sense to signify the source\
\ of the data.\nExamples include `nginx.access`, `prometheus`, `endpoint` etc.\
\ For data streams that otherwise fit, but that do not have dataset set we use\
\ the value \"generic\" for the dataset value. `event.dataset` should have the\
\ same value as `data_stream.dataset`.\nBeyond the Elasticsearch data stream naming\
\ criteria noted above, the `dataset` value has additional restrictions:\n *\
\ Must not contain `-`\n * No longer than 100 characters"
example: nginx.access
flat_name: data_stream.dataset
level: extended
name: dataset
normalize: []
short: The field can contain anything that makes sense to signify the source of
the data.
type: constant_keyword
data_stream.namespace:
dashed_name: data-stream-namespace
description: "A user defined namespace. Namespaces are useful to allow grouping\
\ of data.\nMany users already organize their indices this way, and the data stream\
\ naming scheme now provides this best practice as a default. Many users will\
\ populate this field with `default`. If no value is used, it falls back to `default`.\n\
Beyond the Elasticsearch index naming criteria noted above, `namespace` value\
\ has the additional restrictions:\n * Must not contain `-`\n * No longer than\
\ 100 characters"
example: production
flat_name: data_stream.namespace
level: extended
name: namespace
normalize: []
short: A user defined namespace. Namespaces are useful to allow grouping of data.
type: constant_keyword
data_stream.type:
dashed_name: data-stream-type
description: 'An overarching type for the data stream.

Currently allowed values are "logs" and "metrics". We expect to also add "traces"
and "synthetics" in the near future.'
example: logs
flat_name: data_stream.type
level: extended
name: type
normalize: []
short: An overarching type for the data stream.
type: constant_keyword
destination.address:
dashed_name: destination-address
description: 'Some event destination addresses are defined ambiguously. The event
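
All three `data_stream.*` entries above are typed `constant_keyword`. As a rough illustration of how flat names such as `data_stream.dataset` nest inside an Elasticsearch mapping, the sketch below turns flat-name and type pairs into a nested `properties` object. The helper is hypothetical and is not the ECS generator's own code. The field names and types are copied from the definitions above.

```python
# Flat names and types copied from the ecs_flat.yml entries in this diff.
FLAT_FIELDS = {
    "data_stream.dataset": "constant_keyword",
    "data_stream.namespace": "constant_keyword",
    "data_stream.type": "constant_keyword",
}


def to_mapping_properties(flat_fields):
    """Nest dotted field names into an Elasticsearch-style `properties` tree."""
    properties = {}
    for flat_name, field_type in flat_fields.items():
        *parents, leaf = flat_name.split(".")
        node = properties
        for parent in parents:
            # Each parent object gets its own nested "properties" dict.
            node = node.setdefault(parent, {"properties": {}})["properties"]
        node[leaf] = {"type": field_type}
    return properties


mapping_properties = to_mapping_properties(FLAT_FIELDS)
```

The result has a single top-level `data_stream` object whose `properties` contain `dataset`, `namespace`, and `type`.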