[DOCS] Update Data Visualizer details in machine learning tutorial (#…

…1505)
elastic · Jan 12, 2021 · ab499b3
1 parent d0fd365
commit ab499b3
Showing 5 changed files with 21 additions and 38 deletions.
diff --git a/docs/en/stack/ml/get-started/images/ml-gs-data-ip.jpg b/docs/en/stack/ml/get-started/images/ml-gs-data-ip.jpg
diff --git a/docs/en/stack/ml/get-started/images/ml-gs-data-keyword.jpg b/docs/en/stack/ml/get-started/images/ml-gs-data-keyword.jpg
diff --git a/docs/en/stack/ml/get-started/images/ml-gs-data-metric.jpg b/docs/en/stack/ml/get-started/images/ml-gs-data-metric.jpg
diff --git a/docs/en/stack/ml/get-started/images/ml-gs-data-timestamp.jpg b/docs/en/stack/ml/get-started/images/ml-gs-data-timestamp.jpg
diff --git a/docs/en/stack/ml/get-started/ml-gs-visualizer.asciidoc b/docs/en/stack/ml/get-started/ml-gs-visualizer.asciidoc
@@ -27,55 +27,38 @@ exploring. Alternatively, click
 *Use full kibana_sample_data_logs data* to view the full time range of data.
 
 . Optional: Change the sample size, which is the number of documents per shard
-that are used in the visualizations. There is a relatively small number of
-documents in the sample data, so you can choose a value of `all`. For larger
-data sets, keep in mind that using a large sample size increases query run times
-and increases the load on the cluster.
+that are used in the {data-viz}. There is a relatively small number of
+documents in the {kib} sample data, so you can choose a value of `all`. For
+larger data sets, keep in mind that using a large sample size increases query
+run times and increases the load on the cluster.
 
-. Explore the fields and metrics in the {data-viz}.
+. Explore the fields in the {data-viz}.
 +
 --
-It lists the fields in two sections. The first section contains
-the numeric ("metric") data types. The second section contains non-numeric data
-types (such as `keyword`, `text`, `date`, `boolean`, `ip`, and `geo_point`). For
-more information, see {ref}/mapping-types.html[Field data types].
-
-For each metric, the {data-viz} indicates how many documents contain the field
-in the selected time period. It also provides information about the minimum,
-median, and maximum values, the number of distinct values, and their
-distribution. You can use the distribution chart to get a better idea of how
-the values in the data are clustered. Alternatively, you can view the top values
-for metric fields. For example:
-
-[role="screenshot"]
-image::images/ml-gs-data-metric.jpg["{data-viz} output for top values in {kib}", width="50%",role="screenshot left"]
+You can filter the list by field names or {ref}/mapping-types.html[field types].
+The {data-viz} indicates how many of the documents in the sample for the
+selected time period contain each field.
 
 In particular, look at the `clientip`, `response.keyword`, and `url.keyword`
-fields, since we'll use them in our {anomaly-jobs}. For
-{ref}/ip.html[`ip`] and {ref}/keyword.html[`keyword`] fields, the {data-viz}
-provides the number of distinct values, a list of the top values, and the number
-and percentage of documents that contain the field during the selected time
-period. For example:
+fields, since we'll use them in our {anomaly-jobs}. For these fields, the
+{data-viz} provides the number of distinct values, a list of the top values, and
+the number and percentage of documents that contain the field. For example:
 
 [role="screenshot"]
-image:images/ml-gs-data-keyword.jpg["{data-viz} output for keyword fields in {kib}", width="50%",role="screenshot left"]
+image::images/ml-gs-data-keyword.jpg["{data-viz} output for ip and keyword fields"]
 
-[role="screenshot"]
-image:images/ml-gs-data-ip.jpg["{data-viz} output for ip fields in {kib}", width="50%",role="screenshot left"]
+For numeric fields, the {data-viz} provides information about the minimum,
+median, maximum, and top values, the number of distinct values, and their 
+distribution. You can use the distribution chart to get a better idea of how the 
+values in the data are clustered. For example:
 
---
+[role="screenshot"]
+image::images/ml-gs-data-metric.jpg["{data-viz} for sample web logs"]
 
-. Make note of the range of dates in the `@timestamp` field. They are relative
-to when you added the sample data and you'll need that information later in the
-tutorial.
-+
---
-For {ref}/date.html[`date`] fields, the {data-viz} provides the earliest and
-latest field values and the number and percentage of documents that contain the
-field during the selected time period:
+TIP: Make note of the range of dates in the `@timestamp` field. They are
+relative to when you added the sample data and you'll need that information
+later in the tutorial.
 
-[role="screenshot"]
-image:images/ml-gs-data-timestamp.jpg["{data-viz} output for date fields in {kib}",width="50%",role="screenshot left"]
 --
 
 Now that you're familiar with the data in the `kibana_sample_data_logs` index,