Phase 2 in conversion to asciidoc (elastic#334)

* Phase 2 in conversion to asciidoc
* Update docs/convert.asciidoc
  Trying the new suggest feature for changing v7.1 to v7.0.
  Co-Authored-By: karenzone <[email protected]>
* Start incorporating review comments
* Update faq.asciidoc
  Fix typo in anchor
webmat · Mar 5, 2019 · 1fbb02d
1 parent d40f1af
commit 1fbb02d
Showing 9 changed files with 253 additions and 77 deletions.
diff --git a/docs/conventions.asciidoc b/docs/conventions.asciidoc
@@ -1,7 +1,7 @@
 //[[ecs-conventions]]
 == {ecs} Conventions
 
-{ecs} is most effective when you understand and follow conventions.
+{ecs} is most effective when you understand and follow these guidelines and conventions.
 
 [float]
 === Multi-fields text indexing
@@ -10,15 +10,13 @@ Elasticsearch can index text using:
 
 * *Text.* Text indexing allows for full text search, or searching arbitrary words that
   are part of the field. 
-  See https://www.elastic.co/guide/en/elasticsearch/reference/current/text.html[Text datatype]
-  in the {es} Reference Guide.
+  See {ref}/text.html[Text datatype] in the {es} Reference Guide.
 * *Keywords.* Keyword indexing offers faster exact match filtering and prefix search,
-  and makes aggregations (for Kibana visualizations) possible.
+  and makes aggregations (for {kib} visualizations) possible.
   See the {es} Reference Guide for more information on 
-  https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html[exact match filtering],
-  https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-prefix-query.html[prefix search], or 
-  https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html[aggregations].
-
+  {ref}/query-dsl-term-query.html[exact match filtering],
+  {ref}/query-dsl-prefix-query.html[prefix search], or 
+  {ref}/search-aggregations.html[aggregations].
 
 [float]
 ==== Default Elasticsearch convention
@@ -40,7 +38,7 @@ For monitoring use cases, `keyword` indexing is needed almost exclusively, with
 full text search on very few fields. Given this premise, ECS defaults
 all text indexing to `keyword` at the top level (with very few exceptions).
 Any use case that requires full text search indexing on additional fields
-https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html[multi-field].
+can add a {ref}/multi-fields.html[multi-field]
 for full text search. Doing so does not conflict with ECS,
 as the canonical field name will remain `keyword` indexed.
 
@@ -63,7 +61,7 @@ follow the multi-field convention where `text` indexing is nested in the multi-f
 [float]
 === IDs and most codes are keywords, not integers
 
-Despite the fact that IDs and codes (e.g. error codes) are often integers,
+Despite the fact that IDs and codes (such as error codes) are often integers,
 this is not always the case.
 Since we want to make it possible to map as many systems and data sources
 to ECS as possible, we default to using the `keyword` type for IDs and codes.
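
Review sketch (not part of this commit): the multi-field convention described in the hunks above can be written out as an Elasticsearch mapping body. The Python snippet below is a minimal illustration under stated assumptions: the field names `process.name` and `error.code` and the sub-field name `text` are chosen for the example, and the dict is only built and inspected locally, not sent to a cluster.

```python
import json

# Illustrative mapping body: canonical fields are `keyword`; full text search
# is added as a multi-field so the canonical name stays keyword-indexed.
# All field names here are assumptions chosen for the example.
mapping = {
    "properties": {
        "process": {
            "properties": {
                "name": {
                    "type": "keyword",
                    # Optional multi-field for full text search.
                    "fields": {"text": {"type": "text"}},
                }
            }
        },
        "error": {
            "properties": {
                # Codes default to keyword even when they look numeric.
                "code": {"type": "keyword"}
            }
        },
    }
}


def field_type(mapping, dotted_name):
    """Return the type declared for a dotted field name in the mapping."""
    node = mapping
    for part in dotted_name.split("."):
        node = node["properties"][part]
    return node["type"]


print(json.dumps(mapping, indent=2))
```

With this layout, the canonical field keeps its `keyword` type and the text index lives only under the nested sub-field, which is why adding it does not conflict with the convention.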

diff --git a/docs/convert.asciidoc b/docs/convert.asciidoc
@@ -0,0 +1,52 @@
+[[convert-to-ecs]]
+== Converting an implementation to ECS
+
+A common schema helps you correlate and use data from various sources. 
+
+Fields for most Elastic modules and solutions (version 7.0 and later) are mapped
+to the Elastic Common Schema. You may want to map data from other
+implementations to ECS to help you correlate data across all of your products
+and solutions.
+
+Before you start a conversion, be sure that you understand the basics below.
+
+[float]
+[[core-or-ext]]
+=== Core and extended fields
+
+* *Core fields.* Fields that are most common across all use cases are called *core fields*. 
++
+These generalized fields are used by analysis content
+(searches, visualizations, dashboards, alerts, machine learning jobs, reports)
+across use cases. Analysis content designed to operate on these
+fields should work properly on data from any relevant source. 
++
+Focus on populating these fields first. 
+
+* *Extended fields.* Any field that is not a core field is called an *extended field*. 
+Extended fields may apply to more narrow use cases, or may be more open
+to interpretation depending on the use case. Extended fields are more likely to
+change over time.
+
+Each {ecs} <<ecs-fields,field>> is identified in its table as core or extended.
+
+[float]
+[[ecs-comv]]
+=== An approach to mapping an existing implementation
+
+Here's the recommended approach for converting an existing implementation to {ecs}.
+
+. Start with Core fields.
++
+Populate core fields first. Look at your set of event fields, and find
+the appropriate ECS field name for each one. 
+
+. Move on to Extended fields.
++
+Map fields that may be specific to various data sources using {ecs} extended
+fields. Look at {ecs} extended fields, and decide how to populate these fields
+with the data you have available. Even if you have already mapped a field to an
+{ecs} core field, you can still map it to an extended field. 
+
+Populating both core and extended fields helps ensure reusability of ECS analysis
+content. 
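
Review sketch (not part of this commit): the mapping approach in docs/convert.asciidoc can be pictured as a rename table applied to each event. The Python snippet below is a minimal illustration; the source names (`src_ip`, `hostname`, `msg`, `custom_score`) are invented, and the ECS target names are assumptions to be checked against the field tables.

```python
# Hypothetical source-to-ECS rename table. The left-hand names are invented
# for the example; the right-hand names are assumed ECS targets.
FIELD_MAP = {
    "src_ip": "source.ip",
    "hostname": "host.name",
    "msg": "message",
}


def to_ecs(event, field_map=FIELD_MAP):
    """Rename known fields to their ECS names; keep unknown fields unchanged."""
    return {field_map.get(key, key): value for key, value in event.items()}


legacy_event = {
    "src_ip": "10.0.0.5",
    "hostname": "web-1",
    "msg": "login ok",
    "custom_score": 7,
}
ecs_event = to_ecs(legacy_event)
print(ecs_event)
```

Fields without an entry in the table pass through untouched, so additional non-ECS fields can stay on the event. The result uses flat dotted keys; nesting them into objects is a separate step.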
diff --git a/docs/faq.asciidoc b/docs/faq.asciidoc
@@ -0,0 +1,104 @@
+[[ecs-faq]]
+== FAQ
+
+[float]
+[[ecs-benefits]]
+=== What are the benefits of using ECS?
+
+The benefits to a user adopting these fields and names in their clusters are:
+
+* **Data correlation.** Ability to easily correlate data from the same or different sources, including:
+** data from metrics, logs, and application performance management (APM) tools
+** data from the same machines/hosts
+** data from the same service
+* **Ease of recall.** Improved ability to remember commonly used field names (because there is a single set, not a set per data source)
+* **Ease of deduction.** Improved ability to deduce field names (because the field naming follows a small number of rules with few exceptions)
+* **Reuse.** Ability to re-use analysis content (searches, visualizations, dashboards, alerts, reports, and machine learning jobs) across multiple data sources
+* **Future proofing.** Ability to use any future Elastic-provided analysis content in your environment without modifications
+
+[float]
+[[conflict]]
+=== What if I have fields that conflict with ECS?
+
+The
+{ref}/rename-processor.html[rename
+processor] can help you resolve field conflicts. For example, imagine that you
+already have a field called "user," but ECS employs `user` as an object. You can
+use the rename processor at ingest time to rename your field to the matching ECS
+field. If your field does not match ECS, you can rename your field to
+`user.value` instead.
+
+[float]
+[[addl-fields]]
+=== What if my events have additional fields?
+
+Events may contain fields in addition to ECS fields. These fields can follow the
+ECS naming and writing rules, but this is not a requirement.
+
+[float]
+[[dot-notation]]
+=== Why does ECS use a dot notation instead of an underline notation?
+
+There are two common key formats for ingesting data into Elasticsearch:
+
+* Dot notation: `user.firstname: Nicolas`, `user.lastname: Ruflin`
+* Underline notation: `user_firstname: Nicolas`, `user_lastname: Ruflin`
+
+ECS uses the dot notation to represent nested objects. 
+
+[float]
+[[notation-diff]]
+==== What is the difference between the two notations?
+
+Ingesting `user.firstname: Nicolas` and `user.lastname: Ruflin` is identical to ingesting the following JSON:
+
+```
+"user": {
+  "firstname": "Nicolas",
+  "lastname": "Ruflin"
+}
+```
+
+In Elasticsearch, `user` is represented as an {ref}/object.html[object
+datatype]. With the underline notation, `user_firstname` and `user_lastname` are just
+{ref}/mapping-types.html[string datatypes].
+
+NOTE: ECS does not use {ref}/nested.html[nested
+datatypes], which are arrays of objects.
+
+[float]
+[[dot-adv]]
+==== Advantages of dot notation
+
+With dot notation, each prefix in Elasticsearch is an object. Each object can have
+{ref}/object.html#object-params[parameters]
+that control how fields inside the object are treated. In the context of ECS,
+for example, these parameters would allow you to disable dynamic property
+creation for certain prefixes.
+
+Individual objects give you more flexibility on both the ingest and the event
+sides. In Elasticsearch, for example, you can use the remove processor to drop
+complete objects instead of selecting each key inside. You don't have to know
+ahead of time which keys will be in an object.
+
+In Beats, you can simplify the creation of events. For example, you can treat
+each object as an object (or struct in Golang), which makes constructing and
+modifying each part of the final event easier.
+
+[float]
+[[dot-disadv]]
+==== Disadvantage of dot notation
+
+In Elasticsearch, each key can have only one type. For example, if `user` is an
+`object`, you can't use it as a `keyword` type in the same index, like `{"user":
+"nicolas ruflin"}`. This restriction can be an issue in certain datasets. For
+the ECS data itself, this is not an issue because all fields are predefined.
+
+[float]
+[[underline]]
+==== What if I already use the underline notation?
+
+As long as there are no conflicts, underline notation and ECS dot notation can
+coexist in the same document.
+
+
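
Review sketch (not part of this commit): the equivalence between dotted keys and nested JSON objects described in docs/faq.asciidoc, and the one-type-per-key restriction, can be shown with a small helper. `expand_dotted` is a hypothetical name; this is plain Python and does not call Elasticsearch.

```python
def expand_dotted(flat):
    """Turn {'user.firstname': 'Nicolas'} into {'user': {'firstname': 'Nicolas'}}."""
    nested = {}
    for dotted_key, value in flat.items():
        parts = dotted_key.split(".")
        node = nested
        # Walk (and create) the intermediate objects for each prefix.
        for part in parts[:-1]:
            child = node.setdefault(part, {})
            if not isinstance(child, dict):
                # The prefix was already used as a plain value.
                raise ValueError(f"{part!r} is already a non-object value")
            node = child
        leaf = parts[-1]
        if isinstance(node.get(leaf), dict):
            # The leaf name was already used as an object.
            raise ValueError(f"{leaf!r} is already an object")
        node[leaf] = value
    return nested


doc = expand_dotted({"user.firstname": "Nicolas", "user.lastname": "Ruflin"})
print(doc)
```

Passing both `user` as a string and `user.firstname` raises `ValueError`, which mirrors the disadvantage noted in the FAQ: a key cannot be an object and a keyword at the same time.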
diff --git a/docs/fields-gen.asciidoc b/docs/fields-gen.asciidoc
@@ -1,22 +1,65 @@
 [[ecs-base]]
 === Base fields
 
-The base set contains all fields which are on the top level. These fields are common across all types of events.
+The `base` set contains top level fields that are common across all types of events
+(such as `@timestamp`, `tags`, `message`, and `labels`).
 
-[cols="<,<,<,<,<",options="header",]
-|=======================================================================
-| Field  | Description  | Level  | Type  | Example 
-| [[@timestamp]] | Date/time when the event originated. 
+
+[options="header"]
+|=====
+| Field  | Description  | Level 
+
+// ===============================================================
+
+| [[@timestamp]] 
+| Date/time when the event originated. 
 For log events this is the date/time when the event was generated, and not when it was read.
-Required field for all events. | core | date | `2016-05-23T08:05:34.853Z` 
-| tags | List of keywords used to tag each event. | core | keyword | `["production", "env2"]` 
-| labels | Key/value pairs.
+Required field for all events. 
+
+type: date
+
+Example: `2016-05-23T08:05:34.853Z` 
+
+| core 
+
+// ===============================================================
+
+| tags 
+| List of keywords used to tag each event.
+
+type: keyword
+
+Example: `["production", "env2"]` 
+
+| core  
+
+// ===============================================================
+
+| labels 
+| Key/value pairs.
 Can be used to add meta information to events. Should not contain nested objects. 
 All values are stored as keyword.
-Example: `docker` and `k8s` labels. | core | object | `{'application': 'foo-bar', 'env': 'production'}` 
-| message | For log events the message field contains the log message.
-In other use cases the message field can be used to concatenate different values which are then freely searchable. If multiple messages exist, they can be combined into one message. | core | text | `Hello World` 
-|=======================================================================
+
+type: object
+
+Example sources: `docker` and `k8s` labels. Example value: `{'application': 'foo-bar', 'env': 'production'}` 
+
+| core 
+
+// ===============================================================
+
+| message 
+| For log events the message field contains the log message.
+In other use cases the message field can be used to concatenate different values
+which are then freely searchable. If multiple messages exist, they can be
+combined into one message. 
+
+type: text
+
+Example: `Hello World` 
+
+| core 
+|=====
 
 
 [[ecs-agent]]
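
Review sketch (not part of this commit): the base fields listed in the table above can be assembled into a minimal event. The Python snippet below is an illustration only; `base_event` is a hypothetical helper, and converting label values with `str()` is an assumption standing in for "all values are stored as keyword".

```python
from datetime import datetime, timezone


def base_event(message, tags=(), labels=None):
    """Build an event with the four base fields shown in the table."""
    labels = dict(labels or {})
    for key, value in labels.items():
        if isinstance(value, dict):
            # Labels should not contain nested objects.
            raise ValueError(f"label {key!r} must not be a nested object")
    # UTC timestamp with millisecond precision and a trailing "Z".
    timestamp = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    return {
        "@timestamp": timestamp.replace("+00:00", "Z"),
        "tags": list(tags),
        # Assumption: plain strings approximate keyword storage.
        "labels": {key: str(value) for key, value in labels.items()},
        "message": message,
    }


event = base_event(
    "Hello World",
    tags=["production", "env2"],
    labels={"application": "foo-bar", "env": "production"},
)
print(event)
```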

diff --git a/docs/fields.asciidoc b/docs/fields.asciidoc
@@ -1,16 +1,17 @@
 [[ecs-fields]]
 == {ecs} Fields
 
-// Add a list of field types w/ brief description so user can get a 
-// sense of what we're offering without having to scroll.
-
+// Set make to generate short description for use in the Fields overview section.
 // Pull in generated field content using `include` statements
 
+[float]
+[[ecs-categories]]
+=== Field categories
 [cols="<,<",options="header",]
 |=======================================================================
 | Fields  | Description  
-| <<ecs-base,Base>> | The base set contains all fields which are on the top level.
- These fields are common across all types of events.
+| <<ecs-base,Base>> | Top level fields that are common across all types of events
+(such as `@timestamp`, `tags`, `message`, and `labels`).
 | <<ecs-agent,Agent>> | The agent fields contain data about the 
 agent/client/shipper that created the event.
 | <<ecs-cloud,Cloud>> | Fields related to the cloud or infrastructure the events are

diff --git a/docs/glossary.asciidoc b/docs/glossary.asciidoc
@@ -2,8 +2,7 @@
 == Glossary of {ecs} Terms
 
 [[glossary-ecs]] 
-ECS ::
-
+ECS::
 Elastic Common Schema. A common set of document fields, field names, and their respective entity
 relationships to be used in the storage of log messages and other data in
 Elasticsearch.

diff --git a/docs/guidelines.asciidoc b/docs/guidelines.asciidoc
@@ -8,11 +8,10 @@ practices.
 === General guidelines
 
 * The document MUST have the `@timestamp` field.
-* Use the https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html[data type]
+* Use the {ref}/mapping-types.html[data types]
   defined for an ECS field.
 * Use the `ecs.version` field to define which version of ECS is used.
 * Map as many fields as possible to ECS.
-* TBD: Include guidelines on when people should contribute to the spec. Link to Contributing.
 
 [float]
 ==== Guidelines for writing fields
@@ -28,6 +27,15 @@ practices.
 * *Singular or plural.* Use singular and plural names properly to reflect the field content. For example, use `requests_per_sec` rather than `request_per_sec`.
 * *General to specific.* Organise the prefixes from general to specific to allow grouping fields into objects with a prefix like `host.*`.
 * *Avoid repetition.* Avoid stuttering of words. If part of the field name is already in the prefix, do not repeat it. Example: `host.host_ip` should be `host.ip`.
-* *Use prefixes.* Fields must be prefixed except for the base fields. For example all `host` fields are prefixed with `host.`. See `dot` notation in FAQ for more details.
+* *Use prefixes.* Fields must be prefixed except for the base fields. For example, all `host` fields are prefixed with `host.`. 
+See <<dot-notation>> for more details.
++
+The document structure should be nested JSON objects. If you use Beats or
+Logstash, the nesting of JSON objects is done for you automatically. If you're
+ingesting to Elasticsearch using the API, your fields must be nested
+objects, not strings containing dots.
+
 * *Avoid abbreviations when possible*. A few exceptions like `ip` exist.
 
+
+
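
Review sketch (not part of this commit): two of the field-writing guidelines in docs/guidelines.asciidoc, "Use prefixes" and "Avoid repetition", lend themselves to a simple check. The Python snippet below is a heuristic illustration; `check_field_name` is a hypothetical helper, and `BASE_FIELDS` contains only the base fields named in this diff, which may not be the complete set.

```python
# Base fields named in this diff; assumed incomplete.
BASE_FIELDS = {"@timestamp", "tags", "labels", "message"}


def check_field_name(name):
    """Return a list of guideline warnings for a dotted field name."""
    warnings = []
    parts = name.split(".")
    # Use prefixes: only base fields may live at the top level.
    if len(parts) == 1 and name not in BASE_FIELDS:
        warnings.append("missing prefix")
    # Avoid repetition: a segment should not restate its parent prefix.
    for parent, child in zip(parts, parts[1:]):
        if child == parent or child.startswith(parent + "_"):
            warnings.append(f"repeats prefix {parent!r}")
    return warnings


for candidate in ("host.host_ip", "host.ip", "ip", "message"):
    print(candidate, check_field_name(candidate))
```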