elastic · ebeahan · Apr 19, 2021 · Mar 18, 2021
diff --git a/CHANGELOG.next.md b/CHANGELOG.next.md
@@ -16,6 +16,7 @@ Thanks, you're awesome :-) -->
 
 #### Added
 
+* Add `data_stream` fieldset. #1307
 * Add `orchestrator` fieldset as beta fields. #1326
 * Extend `threat.*` experimental fields with proposed changes from RFC 0018. #1344, #1351
 

diff --git a/code/go/ecs/data_stream.go b/code/go/ecs/data_stream.go
diff --git a/docs/field-details.asciidoc b/docs/field-details.asciidoc
@@ -1017,6 +1017,94 @@ example: `docker`
 
 |=====
 
+[[ecs-data_stream]]
+=== Data Stream Fields
+
+The data_stream fields take part in defining the new data stream naming scheme.
+
+In the new data stream naming scheme the value of the data stream fields combine to the name of the actual data stream in the following manner: `{data_stream.type}-{data_stream.dataset}-{data_stream.namespace}`. This means the fields can only contain characters that are valid as part of names of data streams. More details about this can be found in this https://www.elastic.co/blog/an-introduction-to-the-elastic-data-stream-naming-scheme[blog post].
+
+An Elasticsearch data stream consists of one or more backing indices, and a data stream name forms part of the backing indices names. Due to this convention, data streams must also follow index naming restrictions. For example, data stream names cannot include `\`, `/`, `*`, `?`, `"`, `<`, `>`, `|`, ` ` (space character), `,`, or `#`. Please see the Elasticsearch reference for additional https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#indices-create-api-path-params[restrictions].
+
+beta::[ These fields are in beta and are subject to change.]
+
+[discrete]
+==== Data Stream Field Details
+
+[options="header"]
+|=====
+| Field  | Description | Level
+
+// ===============================================================
+
+|
+[[field-data-stream-dataset]]
+<<field-data-stream-dataset, data_stream.dataset>>
+
+| The field can contain anything that makes sense to signify the source of the data.
+
+Examples include `nginx.access`, `prometheus`, and `endpoint`. For data streams that otherwise fit but do not have a dataset set, we use the value "generic" for the dataset. `event.dataset` should have the same value as `data_stream.dataset`.
+
+Beyond the Elasticsearch data stream naming criteria noted above, the `dataset` value has additional restrictions:
+
+  * Must not contain `-`
+
+  * No longer than 100 characters
+
+type: constant_keyword
+
+
+
+example: `nginx.access`
+
+| extended
+
+// ===============================================================
+
+|
+[[field-data-stream-namespace]]
+<<field-data-stream-namespace, data_stream.namespace>>
+
+| A user-defined namespace. Namespaces are useful to allow grouping of data.
+
+Many users already organize their indices this way, and the data stream naming scheme now provides this best practice as a default. Many users will populate this field with `default`. If no value is used, it falls back to `default`.
+
+Beyond the Elasticsearch index naming criteria noted above, the `namespace` value has additional restrictions:
+
+  * Must not contain `-`
+
+  * No longer than 100 characters
+
+type: constant_keyword
+
+
+
+example: `production`
+
+| extended
+
+// ===============================================================
+
+|
+[[field-data-stream-type]]
+<<field-data-stream-type, data_stream.type>>
+
+| An overarching type for the data stream.
+
+Currently allowed values are "logs" and "metrics". We expect to also add "traces" and "synthetics" in the near future.
+
+type: constant_keyword
+
+
+
+example: `logs`
+
+| extended
+
+// ===============================================================
+
+|=====
+
 [[ecs-destination]]
 === Destination Fields
 

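For reviewers who want to sanity-check the naming rules documented above, here is a minimal Python sketch of how the three fields might be combined and validated. It is not part of this change or of the ECS tooling: the helper names `validate_part` and `data_stream_name` are hypothetical, and the sketch covers only the restrictions spelled out in the field descriptions (the listed forbidden characters, no `-`, and the 100-character limit for `dataset` and `namespace`), not every Elasticsearch index naming rule.

```python
# Characters the fieldset description lists as invalid in data stream names:
# \ / * ? " < > | (space) , #
FORBIDDEN_CHARS = set('\\/*?"<>| ,#')

# Limit stated in the dataset and namespace field descriptions.
MAX_PART_LENGTH = 100


def validate_part(field, value):
    """Return a list of problems for a dataset or namespace value (hypothetical helper)."""
    problems = []
    bad = sorted(FORBIDDEN_CHARS.intersection(value))
    if bad:
        problems.append(f"{field} contains forbidden characters: {bad!r}")
    if "-" in value:
        # "-" is the separator between the three parts of the name.
        problems.append(f"{field} must not contain '-'")
    if len(value) > MAX_PART_LENGTH:
        problems.append(f"{field} is longer than {MAX_PART_LENGTH} characters")
    return problems


def data_stream_name(type_, dataset="generic", namespace="default"):
    """Build {type}-{dataset}-{namespace}, raising ValueError on invalid parts."""
    problems = validate_part("data_stream.dataset", dataset)
    problems += validate_part("data_stream.namespace", namespace)
    # The type is only checked for forbidden characters here; the set of
    # allowed types is expected to grow, so it is not hard-coded.
    if FORBIDDEN_CHARS.intersection(type_):
        problems.append("data_stream.type contains forbidden characters")
    if problems:
        raise ValueError("; ".join(problems))
    return f"{type_}-{dataset}-{namespace}"


if __name__ == "__main__":
    print(data_stream_name("logs", "nginx.access", "production"))
```

The defaults mirror the documented fallbacks: `generic` when no dataset is set and `default` when no namespace is given.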
diff --git a/docs/fields.asciidoc b/docs/fields.asciidoc
@@ -32,6 +32,8 @@ all fields are defined.
 
 | <<ecs-container,Container>> | Fields describing the container that generated this event.
 
+| <<ecs-data_stream,Data Stream>> | The data_stream fields take part in defining the new data stream naming scheme.
+
 | <<ecs-destination,Destination>> | Fields about the destination side of a network connection, used with source.
 
 | <<ecs-dll,DLL>> | These fields contain information about code libraries dynamically loaded into processes.

diff --git a/experimental/generated/beats/fields.ecs.yml b/experimental/generated/beats/fields.ecs.yml
@@ -632,16 +632,16 @@
       naming scheme.
 
       In the new data stream naming scheme the value of the data stream fields combine
-      to the name of the actual data stream in the following manner `{data_stream.type}-{data_stream.dataset}-{data_stream.namespace}`.
+      to the name of the actual data stream in the following manner: `{data_stream.type}-{data_stream.dataset}-{data_stream.namespace}`.
       This means the fields can only contain characters that are valid as part of
       names of data streams. More details about this can be found in this https://www.elastic.co/blog/an-introduction-to-the-elastic-data-stream-naming-scheme[blog
       post].
 
       An Elasticsearch data stream consists of one or more backing indices, and a
       data stream name forms part of the backing indices names. Due to this convention,
       data streams must also follow index naming restrictions. For example, data stream
-      names cannot include \, /, *, ?, ", <, >, |, ` `. Please see the Elasticsearch
-      reference for additional https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#indices-create-api-path-params[restrictions].'
+      names cannot include `\`, `/`, `*`, `?`, `"`, `<`, `>`, `|`, ` ` (space character),
+      `,`, or `#`. Please see the Elasticsearch reference for additional https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#indices-create-api-path-params[restrictions].'
     type: group
     fields:
     - name: dataset

diff --git a/experimental/generated/ecs/ecs_nested.yml b/experimental/generated/ecs/ecs_nested.yml
@@ -1057,20 +1057,21 @@ container:
   title: Container
   type: group
 data_stream:
+  beta: These fields are in beta and are subject to change.
   description: 'The data_stream fields take part in defining the new data stream naming
     scheme.
 
     In the new data stream naming scheme the value of the data stream fields combine
-    to the name of the actual data stream in the following manner `{data_stream.type}-{data_stream.dataset}-{data_stream.namespace}`.
+    to the name of the actual data stream in the following manner: `{data_stream.type}-{data_stream.dataset}-{data_stream.namespace}`.
     This means the fields can only contain characters that are valid as part of names
     of data streams. More details about this can be found in this https://www.elastic.co/blog/an-introduction-to-the-elastic-data-stream-naming-scheme[blog
     post].
 
     An Elasticsearch data stream consists of one or more backing indices, and a data
     stream name forms part of the backing indices names. Due to this convention, data
     streams must also follow index naming restrictions. For example, data stream names
-    cannot include \, /, *, ?, ", <, >, |, ` `. Please see the Elasticsearch reference
-    for additional https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#indices-create-api-path-params[restrictions].'
+    cannot include `\`, `/`, `*`, `?`, `"`, `<`, `>`, `|`, ` ` (space character),
+    `,`, or `#`. Please see the Elasticsearch reference for additional https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#indices-create-api-path-params[restrictions].'
   fields:
     data_stream.dataset:
       dashed_name: data-stream-dataset

diff --git a/experimental/generated/elasticsearch/template.json b/experimental/generated/elasticsearch/template.json
@@ -9,6 +9,7 @@
     "ecs_2.0.0-dev-exp_client",
     "ecs_2.0.0-dev-exp_cloud",
     "ecs_2.0.0-dev-exp_container",
+    "ecs_2.0.0-dev-exp_data_stream",
     "ecs_2.0.0-dev-exp_destination",
     "ecs_2.0.0-dev-exp_dll",
     "ecs_2.0.0-dev-exp_dns",
@@ -38,8 +39,7 @@
     "ecs_2.0.0-dev-exp_url",
     "ecs_2.0.0-dev-exp_user",
     "ecs_2.0.0-dev-exp_user_agent",
-    "ecs_2.0.0-dev-exp_vulnerability",
-    "ecs_2.0.0-dev-exp_data_stream"
+    "ecs_2.0.0-dev-exp_vulnerability"
   ],
   "index_patterns": [
     "try-ecs-*"

diff --git a/generated/beats/fields.ecs.yml b/generated/beats/fields.ecs.yml
@@ -634,6 +634,58 @@
       ignore_above: 1024
       description: Runtime managing this container.
       example: docker
+  - name: data_stream
+    title: Data Stream
+    group: 2
+    description: 'The data_stream fields take part in defining the new data stream
+      naming scheme.
+
+      In the new data stream naming scheme the value of the data stream fields combine
+      to the name of the actual data stream in the following manner: `{data_stream.type}-{data_stream.dataset}-{data_stream.namespace}`.
+      This means the fields can only contain characters that are valid as part of
+      names of data streams. More details about this can be found in this https://www.elastic.co/blog/an-introduction-to-the-elastic-data-stream-naming-scheme[blog
+      post].
+
+      An Elasticsearch data stream consists of one or more backing indices, and a
+      data stream name forms part of the backing indices names. Due to this convention,
+      data streams must also follow index naming restrictions. For example, data stream
+      names cannot include `\`, `/`, `*`, `?`, `"`, `<`, `>`, `|`, ` ` (space character),
+      `,`, or `#`. Please see the Elasticsearch reference for additional https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#indices-create-api-path-params[restrictions].'
+    type: group
+    fields:
+    - name: dataset
+      level: extended
+      type: constant_keyword
+      description: "The field can contain anything that makes sense to signify the\
+        \ source of the data.\nExamples include `nginx.access`, `prometheus`, and `endpoint`.\
+        \ For data streams that otherwise fit but do not have a dataset set,\
+        \ we use the value \"generic\" for the dataset. `event.dataset` should\
+        \ have the same value as `data_stream.dataset`.\nBeyond the Elasticsearch\
+        \ data stream naming criteria noted above, the `dataset` value has additional\
+        \ restrictions:\n  * Must not contain `-`\n  * No longer than 100 characters"
+      example: nginx.access
+      default_field: false
+    - name: namespace
+      level: extended
+      type: constant_keyword
+      description: "A user-defined namespace. Namespaces are useful to allow grouping\
+        \ of data.\nMany users already organize their indices this way, and the data\
+        \ stream naming scheme now provides this best practice as a default. Many\
+        \ users will populate this field with `default`. If no value is used, it falls\
+        \ back to `default`.\nBeyond the Elasticsearch index naming criteria noted\
+        \ above, the `namespace` value has additional restrictions:\n  * Must not\
+        \ contain `-`\n  * No longer than 100 characters"
+      example: production
+      default_field: false
+    - name: type
+      level: extended
+      type: constant_keyword
+      description: 'An overarching type for the data stream.
+
+        Currently allowed values are "logs" and "metrics". We expect to also add "traces"
+        and "synthetics" in the near future.'
+      example: logs
+      default_field: false
   - name: destination
     title: Destination
     group: 2

diff --git a/generated/csv/fields.csv b/generated/csv/fields.csv
@@ -64,6 +64,9 @@ ECS_Version,Indexed,Field_Set,Field,Type,Level,Normalization,Example,Description
 2.0.0-dev,true,container,container.labels,object,extended,,,Image labels.
 2.0.0-dev,true,container,container.name,keyword,extended,,,Container name.
 2.0.0-dev,true,container,container.runtime,keyword,extended,,docker,Runtime managing this container.
+2.0.0-dev,true,data_stream,data_stream.dataset,constant_keyword,extended,,nginx.access,The field can contain anything that makes sense to signify the source of the data.
+2.0.0-dev,true,data_stream,data_stream.namespace,constant_keyword,extended,,production,A user-defined namespace. Namespaces are useful to allow grouping of data.
+2.0.0-dev,true,data_stream,data_stream.type,constant_keyword,extended,,logs,An overarching type for the data stream.
 2.0.0-dev,true,destination,destination.address,keyword,extended,,,Destination network address.
 2.0.0-dev,true,destination,destination.as.number,long,extended,,15169,Unique number allocated to the autonomous system.
 2.0.0-dev,true,destination,destination.as.organization.name,keyword,extended,,Google LLC,Organization name.

diff --git a/generated/ecs/ecs_flat.yml b/generated/ecs/ecs_flat.yml
@@ -756,6 +756,52 @@ container.runtime:
   normalize: []
   short: Runtime managing this container.
   type: keyword
+data_stream.dataset:
+  dashed_name: data-stream-dataset
+  description: "The field can contain anything that makes sense to signify the source\
+    \ of the data.\nExamples include `nginx.access`, `prometheus`, and `endpoint`.\
+    \ For data streams that otherwise fit but do not have a dataset set, we use\
+    \ the value \"generic\" for the dataset. `event.dataset` should have the\
+    \ same value as `data_stream.dataset`.\nBeyond the Elasticsearch data stream naming\
+    \ criteria noted above, the `dataset` value has additional restrictions:\n  *\
+    \ Must not contain `-`\n  * No longer than 100 characters"
+  example: nginx.access
+  flat_name: data_stream.dataset
+  level: extended
+  name: dataset
+  normalize: []
+  short: The field can contain anything that makes sense to signify the source of
+    the data.
+  type: constant_keyword
+data_stream.namespace:
+  dashed_name: data-stream-namespace
+  description: "A user-defined namespace. Namespaces are useful to allow grouping\
+    \ of data.\nMany users already organize their indices this way, and the data stream\
+    \ naming scheme now provides this best practice as a default. Many users will\
+    \ populate this field with `default`. If no value is used, it falls back to `default`.\n\
+    Beyond the Elasticsearch index naming criteria noted above, the `namespace` value\
+    \ has additional restrictions:\n  * Must not contain `-`\n  * No longer than\
+    \ 100 characters"
+  example: production
+  flat_name: data_stream.namespace
+  level: extended
+  name: namespace
+  normalize: []
+  short: A user-defined namespace. Namespaces are useful to allow grouping of data.
+  type: constant_keyword
+data_stream.type:
+  dashed_name: data-stream-type
+  description: 'An overarching type for the data stream.
+
+    Currently allowed values are "logs" and "metrics". We expect to also add "traces"
+    and "synthetics" in the near future.'
+  example: logs
+  flat_name: data_stream.type
+  level: extended
+  name: type
+  normalize: []
+  short: An overarching type for the data stream.
+  type: constant_keyword
 destination.address:
   dashed_name: destination-address
   description: 'Some event destination addresses are defined ambiguously. The event