Commit

Merge branch 'master' into uptime_edit-alerts-ahead
justinkambic committed Jun 4, 2020
2 parents 4af3d19 + c77e9bf commit 18846c9
Showing 226 changed files with 8,705 additions and 1,788 deletions.
2 changes: 1 addition & 1 deletion docs/apm/advanced-queries.asciidoc
Expand Up @@ -11,7 +11,7 @@ or, to only show transactions that are slower than a specified time threshold.
==== Example APM app queries

* Show only transactions slower than 2000 ms: `transaction.duration.us > 2000000`
* Filter by response status code: `context.response.status_code >= 400`
* Filter by response status code: `context.response.status_code >= 400`
* Filter by single user ID: `context.user.id : 12`

When querying in the APM app, you're merely searching and selecting data from fields in Elasticsearch documents.
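
As a rough illustration of what these queries select, the sketch below applies two of the conditions to plain Python dictionaries standing in for Elasticsearch documents. The `get_field`, `is_slow`, and `is_error_response` helpers and the sample documents are hypothetical; the APM app evaluates queries in Elasticsearch, not in client code.

```python
from typing import Any, Optional


def get_field(doc: dict, dotted_path: str) -> Optional[Any]:
    """Return the value at a dotted field path, or None if it is missing."""
    current: Any = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def is_slow(doc: dict, threshold_us: int = 2_000_000) -> bool:
    """Mirror the query `transaction.duration.us > 2000000`."""
    value = get_field(doc, "transaction.duration.us")
    return value is not None and value > threshold_us


def is_error_response(doc: dict) -> bool:
    """Mirror the query `context.response.status_code >= 400`."""
    value = get_field(doc, "context.response.status_code")
    return value is not None and value >= 400


# Hypothetical sample documents, for illustration only.
docs = [
    {"transaction": {"duration": {"us": 2_500_000}},
     "context": {"response": {"status_code": 200}, "user": {"id": 12}}},
    {"transaction": {"duration": {"us": 150_000}},
     "context": {"response": {"status_code": 404}, "user": {"id": 7}}},
]

slow = [d for d in docs if is_slow(d)]
errors = [d for d in docs if is_error_response(d)]
```

Documents that lack the queried field are treated as non-matching here, which is an assumption of this sketch.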
18 changes: 9 additions & 9 deletions docs/apm/service-maps.asciidoc
Expand Up @@ -62,9 +62,9 @@ Machine learning jobs can be created to calculate anomaly scores on APM transact
When these jobs are active, service maps will display a color-coded anomaly indicator based on the detected anomaly score:

[horizontal]
image:apm/images/green-service.png[APM green service]:: Max anomaly score **<=25**. Service is healthy.
image:apm/images/green-service.png[APM green service]:: Max anomaly score **<=25**. Service is healthy.
image:apm/images/yellow-service.png[APM yellow service]:: Max anomaly score **26-74**. Anomalous activity detected. Service may be degraded.
image:apm/images/red-service.png[APM red service]:: Max anomaly score **>=75**. Anomalous activity detected. Service is unhealthy.
image:apm/images/red-service.png[APM red service]:: Max anomaly score **>=75**. Anomalous activity detected. Service is unhealthy.
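
The score bands above can be sketched as a small lookup. This is a minimal illustration, not the service map's actual implementation; the `anomaly_indicator` name is hypothetical, and treating fractional scores between the listed integer bands as yellow is an assumption.

```python
def anomaly_indicator(max_score: float) -> str:
    """Map a max anomaly score to the indicator color described above."""
    if max_score <= 25:
        return "green"   # service is healthy
    if max_score >= 75:
        return "red"     # service is unhealthy
    return "yellow"      # service may be degraded (assumed for all values in between)
```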

[role="screenshot"]
image::apm/images/apm-service-map-anomaly.png[Example view of anomaly scores on service maps in the APM app]
Expand Down Expand Up @@ -92,10 +92,10 @@ Type and subtype are based on `span.type`, and `span.subtype`.
Service maps are supported for the following Agent versions:

[horizontal]
Go Agent:: >= v1.7.0
Java Agent:: >= v1.13.0
.NET Agent:: >= v1.3.0
Node.js Agent:: >= v3.6.0
Python Agent:: >= v5.5.0
Ruby Agent:: >= v3.6.0
Real User Monitoring (RUM) Agent:: >= v4.7.0
Go Agent:: >= v1.7.0
Java Agent:: >= v1.13.0
.NET Agent:: >= v1.3.0
Node.js Agent:: >= v3.6.0
Python Agent:: >= v5.5.0
Ruby Agent:: >= v3.6.0
Real User Monitoring (RUM) Agent:: >= v4.7.0
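
Treating the versions listed above as minimums, a compatibility check might look like the following sketch. The dictionary keys and function names are hypothetical, and the parser only handles plain `major.minor.patch` strings with an optional leading `v`.

```python
# Minimum agent versions taken from the list above; the keys are illustrative.
MINIMUM_AGENT_VERSIONS = {
    "go": (1, 7, 0),
    "java": (1, 13, 0),
    "dotnet": (1, 3, 0),
    "nodejs": (3, 6, 0),
    "python": (5, 5, 0),
    "ruby": (3, 6, 0),
    "rum": (4, 7, 0),
}


def parse_version(version: str) -> tuple:
    """Turn 'v1.13.0' or '1.13.0' into (1, 13, 0)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def supports_service_maps(agent: str, version: str) -> bool:
    """Return True if the agent version meets the listed minimum."""
    return parse_version(version) >= MINIMUM_AGENT_VERSIONS[agent]
```

Comparing tuples of integers avoids the string-comparison pitfall where `"1.9.2"` would sort after `"1.13.0"`.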
2 changes: 1 addition & 1 deletion docs/infrastructure/infra-ui.asciidoc
Expand Up @@ -109,5 +109,5 @@ Depending on the features you have installed and configured, you may also be abl

* Select *View APM* to <<traces, view APM traces>> in the *APM* app.

* Select *View Uptime* to <<uptime-overview, view uptime information>> in the *Uptime* app.
* Select *View Uptime* to {uptime-guide}/uptime-app-overview.html[view uptime information] in the *Uptime* app.

2 changes: 1 addition & 1 deletion docs/logs/using.asciidoc
Expand Up @@ -96,5 +96,5 @@ When the machine learning anomaly detection features are enabled, click *Log rat
To see other actions related to the event, click *Actions* in the log event details.
Depending on the event and the features you have configured, you may also be able to:

* Select *View status in Uptime* to <<uptime-overview, view related uptime information>> in the *Uptime* app.
* Select *View status in Uptime* to {uptime-guide}/uptime-app-overview.html[view related uptime information] in the *Uptime* app.
* Select *View in APM* to <<traces, view related APM traces>> in the *APM* app.
@@ -1,30 +1,33 @@
[role="xpack"]
[[uptime-alerting]]

== Uptime alerting
=== Uptime alerting

The Uptime app integrates with Kibana's {kibana-ref}/alerting-getting-started.html[alerting and actions]
feature. It provides a set of built-in actions and Uptime-specific threshold alerts for you to use
and enables central management of all alerts from <<management, Kibana Management>>.
and enables central management of all alerts from {kibana-ref}/management.html[Kibana Management].

[role="screenshot"]
image::images/create-alert.png[Create alert]

[float]
=== Monitor status alerts
==== Monitor status alerts

To receive alerts when a monitor goes down, use the alerting menu at the top of the
overview page. Use a query in the alert flyout to determine which monitors to check
with your alert. If you already have a query in the overview page search bar, it is
carried over into this box.

[role="screenshot"]
image::uptime/images/monitor-status-alert-flyout.png[Create monitor status alert flyout]
image::images/monitor-status-alert.png[Create monitor status alert flyout]

[float]
=== TLS alerts
==== TLS alerts

Uptime also provides the ability to create an alert that will notify you when one or
more of your monitors have a TLS certificate that will expire within some threshold,
or when its age exceeds a limit. The values for these thresholds are configurable on
the <<uptime-settings, Settings page>>.
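
The two TLS conditions described above can be sketched as follows. This is an illustration under stated assumptions, not Uptime's implementation: the function name, the parameter names, and the threshold values in the usage example are hypothetical, and the real thresholds come from the Settings page.

```python
from datetime import datetime, timedelta, timezone


def tls_alert_reasons(not_before, not_after, now,
                      expiration_threshold_days, age_limit_days):
    """Return the reasons a certificate would trigger a TLS alert."""
    reasons = []
    # The certificate expires within the threshold (or has already expired).
    if not_after - now <= timedelta(days=expiration_threshold_days):
        reasons.append("expiring")
    # The certificate has been valid for longer than the age limit.
    if now - not_before > timedelta(days=age_limit_days):
        reasons.append("too_old")
    return reasons


# Hypothetical values, for illustration only.
now = datetime(2020, 6, 4, tzinfo=timezone.utc)
reasons = tls_alert_reasons(
    not_before=datetime(2020, 1, 1, tzinfo=timezone.utc),
    not_after=datetime(2020, 6, 20, tzinfo=timezone.utc),
    now=now,
    expiration_threshold_days=30,
    age_limit_days=365,
)
```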

[role="screenshot"]
image::uptime/images/tls-alert-flyout.png[Create TLS alert flyout]
image::images/tls-alert.png[Create TLS alert flyout]
70 changes: 70 additions & 0 deletions docs/uptime-guide/app-overview.asciidoc
@@ -0,0 +1,70 @@
[role="xpack"]
[[uptime-app]]
== Uptime app

The Uptime app in {kib} enables you to monitor the status of network endpoints via HTTP/S, TCP, and ICMP.
You can explore endpoint status over time, drill down into specific monitors,
and view a high-level snapshot of your environment at any point in time.

[role="screenshot"]
image::images/uptime-overview.png[Uptime app overview]

[role="xpack"]
[[uptime-app-overview]]
=== Overview

The Uptime overview helps you quickly identify and diagnose outages and
other connectivity issues within your network or environment. You can use the date range
selection, which is global to the Uptime app, to highlight
an absolute or relative date range, similar to other areas of {kib}.

[float]
=== Filter bar

The Filter bar enables you to quickly view specific groups of monitors, or even
an individual monitor if you have defined many.

This control allows you to use automated filter options, as well as input custom filter
text to select specific monitors by field, URL, ID, and other attributes.

[role="screenshot"]
image::images/filter-bar.png[Filter bar]

[float]
=== Snapshot panel

The Snapshot panel displays the overall
status of the environment you're monitoring or a subset of those monitors.
You can see the total number of detected monitors within the selected
Uptime date range, along with the number of monitors
in an `up` or `down` state, which is based on the last check reported by Heartbeat
for each monitor.

Next to the counts, there is a histogram displaying the change over time throughout the
selected date range.
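
As a sketch of how counts like these could be derived, the function below keeps only the most recent check per monitor and tallies the resulting states. The `snapshot_counts` name and the `(monitor_id, timestamp, status)` tuple layout are assumptions for illustration and do not reflect the Heartbeat document schema.

```python
from collections import Counter


def snapshot_counts(checks):
    """Count monitors by the status of their most recent check.

    `checks` is an iterable of (monitor_id, timestamp, status) tuples.
    """
    latest = {}
    for monitor_id, timestamp, status in checks:
        # Keep only the newest check seen so far for each monitor.
        if monitor_id not in latest or timestamp > latest[monitor_id][0]:
            latest[monitor_id] = (timestamp, status)
    counts = Counter(status for _, status in latest.values())
    return {"total": len(latest), "up": counts["up"], "down": counts["down"]}
```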

[role="screenshot"]
image::images/snapshot-view.png[Snapshot view]

[float]
=== Monitor list

Information about individual monitors is displayed in the monitor list and provides a quick
way to navigate to a more in-depth visualization for interesting hosts or endpoints.

The information displayed includes the recent status of a host or endpoint, when the monitor was last checked, its
ID and URL, and its IP address. There is also a sparkline showing its check status over time.

[role="screenshot"]
image::images/monitor-list.png[Monitor list]

[float]
=== Observability integrations

The Monitor list also contains a menu of available integrations. When Uptime detects Kubernetes or
Docker related host information, it provides links to open the Metrics app or Logs app pre-filtered
for this host. Additionally, to help you quickly determine if these solutions contain data relevant to you,
this feature contains links to filter the other views on the host's IP address.

[role="screenshot"]
image::images/observability_integrations.png[Observability integrations]
@@ -1,15 +1,15 @@
[role="xpack"]
[[uptime-certificates]]

== Certificates
=== Certificates

[role="screenshot"]
image::uptime/images/certificates-page.png[Certificates]

The certificates page allows you to visualize TLS certificate data in your indices. In addition to the
The certificates page enables you to visualize TLS certificate data in your indices. In addition to the
common name, associated monitors, issuer information, and SHA fingerprints, Uptime also assigns a status
derived from the threshold values in the <<uptime-settings, Settings page>>.

Several of the columns on this page are sortable. You can use the search bar at the top of the view
to find values in most of the TLS-related fields in your Uptime indices. Additionally, you can
create a TLS alert using the `Alerts` dropdown at the top of the page.
to find values in most of the TLS-related fields in your Uptime indices. Additionally, using the `Alerts`
dropdown at the top of the page you can create a TLS alert.

[role="screenshot"]
image::images/certificates-page.png[Certificates]
18 changes: 10 additions & 8 deletions docs/uptime-guide/deployment-arch.asciidoc
Expand Up @@ -4,22 +4,24 @@

There are multiple ways to deploy Uptime and Heartbeat.
Use the information in this section to determine the best deployment for you.
A guiding principle is that an outage that takes down the service being monitored should not also take down Heartbeat.
You want Heartbeat to be functioning even when your service is not, so the guidelines here help you maximise this possibility.
A guiding principle is that when an outage takes down the service being monitored, it should not also take down Heartbeat.
You want Heartbeat to be functioning even when your service is not, so the guidelines here help you maximize this possibility.

Heartbeat is generally run as a centralized service within a data center.
Heartbeat is commonly run as a centralized service within a data center.
While it is possible to run it as a separate "sidecar" process paired with each process/container, we recommend against it.
Running Heartbeat centrally ensures you will still be able to see monitoring data in the event of an overloaded, disconnected, or otherwise malfunctioning server.

For further redundancy, you may want to deploy multiple Heartbeats across geographic and/or network boundaries to provide more data.
Specify Heartbeat's observer {heartbeat-ref}/configuration-observer-options.html[geo options] to do so. Some examples might be:
For further redundancy, you may want to deploy multiple Heartbeats across geographic and network boundaries to provide more data.
To do so, specify Heartbeat's observer {heartbeat-ref}/configuration-observer-options.html[geo options].

Some examples might be:

* **A site served from a content delivery network (CDN) with points of presence (POPs) around the globe:**
In this case you may want to have multiple Heartbeat instances at different data centers around the world checking to see if your site is reachable via local CDN POPs.
To check if your site is reachable via CDN POPs, you may want to have multiple Heartbeat instances at different data centers around the world.
* **A service within a single data center that is accessed across multiple VPNs:**
Set up one Heartbeat instance within the VPN the service operates from, and another within an additional VPN that users access the service from.
Having both instances will help pinpoint network errors in the event of an outage.
Having both instances helps pinpoint network errors in the event of an outage.
* **A single service running primarily in a US east coast data center, with a hot failover located in a US west coast data center:**
In each data center, run a Heartbeat instance that checks both the local copy of the service and its counterpart across the country.
Set up two monitors in each region, one for the local service and one for the remote service.
In the event of a data center failure it will be immediately obvious if the service had a connectivity issue to the outside world or if the failure was only internal.
In the event of a data center failure, it will be immediately apparent if the service had a connectivity issue to the outside world or if the failure was only internal.
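
For the last example, the reasoning behind running two monitors per region can be sketched as a comparison of the two vantage points. The function name and the outcome labels below are one hypothetical interpretation, not output produced by Heartbeat or the Uptime app.

```python
def classify_service(local_up: bool, remote_up: bool) -> str:
    """Interpret checks of one service from two vantage points.

    `local_up` is the result from the Heartbeat instance in the same data
    center as the service; `remote_up` is the result from the instance in
    the other data center.
    """
    if local_up and remote_up:
        return "healthy"
    if local_up and not remote_up:
        # Reachable inside the data center only: likely an external
        # connectivity problem rather than a service failure.
        return "unreachable from outside"
    if not local_up and remote_up:
        # Reachable from outside only: the problem appears to be internal.
        return "internal failure only"
    return "down from both locations"
```

With a single vantage point, the second and third cases would be indistinguishable from a full outage or a healthy service.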
Binary file added docs/uptime-guide/images/cert-exp.png
File renamed without changes
File renamed without changes
Binary file added docs/uptime-guide/images/create-alert.png
File renamed without changes
File renamed without changes
Binary file added docs/uptime-guide/images/indices.png
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
Binary file added docs/uptime-guide/images/tls-alert.png
Binary file added docs/uptime-guide/images/uptime-overview.png
12 changes: 10 additions & 2 deletions docs/uptime-guide/index.asciidoc
@@ -1,5 +1,3 @@
// short-version can be: 8, 7, 6, etc.
:short-version: 8

include::{asciidoc-dir}/../../shared/versions/stack/{source_branch}.asciidoc[]
include::{asciidoc-dir}/../../shared/attributes.asciidoc[]
Expand All @@ -12,3 +10,13 @@ include::install.asciidoc[]

include::deployment-arch.asciidoc[]

include::app-overview.asciidoc[]

include::monitor.asciidoc[]

include::settings.asciidoc[]

include::certificates.asciidoc[]

include::alerting.asciidoc[]

20 changes: 10 additions & 10 deletions docs/uptime-guide/install.asciidoc
Expand Up @@ -29,7 +29,7 @@ first see the https://www.elastic.co/support/matrix[Elastic Support Matrix] for
[[install-elasticsearch]]
=== Step 1: Install Elasticsearch

Install an Elasticsearch cluster, start it up, and make sure it's running.
Install an {es} cluster, start it up, and make sure it's running.

. Verify that your system meets the
https://www.elastic.co/support/matrix#matrix_jvm[minimum JVM requirements] for {es}.
Expand All @@ -39,7 +39,7 @@ https://www.elastic.co/support/matrix#matrix_jvm[minimum JVM requirements] for {
[[install-kibana]]
=== Step 2: Install Kibana

Install Kibana, start it up, and open up the web interface:
Install {kib}, start it up, and open up the web interface:

. {stack-gs}/get-started-elastic-stack.html#install-kibana[Install Kibana].
. {stack-gs}/get-started-elastic-stack.html#_launch_the_kibana_web_interface[Launch the Kibana Web Interface].
Expand All @@ -48,27 +48,27 @@ Install Kibana, start it up, and open up the web interface:
=== Step 3: Install and configure Heartbeat

Uptime requires the setup of monitors in Heartbeat.
These monitors provide the data you'll be visualizing in the {kibana-ref}/xpack-uptime.html[Uptime UI].
These monitors provide the data you'll be visualizing in the {kibana-ref}/xpack-uptime.html[Uptime app].

See the *Setup Instructions* in Kibana for instructions on installing and configuring Heartbeat.
For instructions on installing and configuring Heartbeat, see the *Setup Instructions* in {kib}.
Additional information is available in {heartbeat-ref}/heartbeat-configuration.html[Configure Heartbeat].

[role="screenshot"]
image::images/uptime-setup.png[Installation instructions on the Uptime page in Kibana]

[[setup-security]]
=== Step 4: Setup Security
=== Step 4: Set up Security

Secure your installation by following the {heartbeat-ref}/securing-heartbeat.html[Secure Heartbeat] documentation.

[float]
==== Important considerations

* Make sure you're using the same major versions of Heartbeat and Kibana.
* Make sure you're using the same major versions of Heartbeat and {kib}.

* Index patterns tell Kibana which Elasticsearch indices you want to explore.
The Uptime UI requires a +heartbeat-{short-version}*+ index pattern.
If you have configured a different index pattern, you can use {ref}/indices-aliases.html[index aliases] to ensure data is recognized by the UI.
* Index patterns tell {kib} which {es} indices you want to explore.
The Uptime app requires a +heartbeat-{major-version-only}*+ index pattern.
If you have configured a different index pattern, you can use {ref}/indices-aliases.html[index aliases] to ensure data is recognized by the Uptime app.
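
To see which index names a pattern of this shape would cover, the sketch below uses Python's `fnmatch` as a stand-in for wildcard matching. The function name, the index names, and the major version `7` in the examples are hypothetical, and {kib} resolves index patterns itself, so this is only an approximation.

```python
from fnmatch import fnmatchcase


def matches_uptime_pattern(index_name: str, major_version: int) -> bool:
    """Check an index name against a `heartbeat-<major>*` style pattern."""
    return fnmatchcase(index_name, f"heartbeat-{major_version}*")
```

For example, `matches_uptime_pattern("heartbeat-7.8.0-2020.06.04", 7)` is true, while an index written under a custom name would not match and would need an alias.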

After you install and configure Heartbeat,
the {kibana-ref}/xpack-uptime.html[Uptime UI] will automatically populate with the Heartbeat monitors.
the {kibana-ref}/xpack-uptime.html[Uptime app] is automatically populated with the Heartbeat monitors.
59 changes: 59 additions & 0 deletions docs/uptime-guide/monitor.asciidoc
@@ -0,0 +1,59 @@
[role="xpack"]
[[uptime-monitor]]
=== Monitor

The Monitor page helps you gain insights into the performance
of a specific network endpoint. A detailed visualization of
the monitor's request duration over time, as well as the `up`/`down`
status over time, is displayed. By configuring Machine Learning jobs
on this page, you can also detect anomalies in response time data.


==== Status panel

The Status panel displays a quick summary of the latest information
regarding your monitor. You can view its latest status, click a link to
visit the targeted URL, see its most recent request duration, and determine the
amount of time that has elapsed since the last check.

When two Heartbeat instances are configured in different geographic locations,
the map shows each location as a pinpoint, along with the
amount of time elapsed since data was last received from that location.

[role="screenshot"]
image::images/status-bar.png[Status bar]


[float]
==== Monitor charts

The Monitor charts visualize information over the time specified in the
date range. These charts help you gain insights into how quickly requests are being resolved
by the targeted endpoint, and give you a sense of how frequently a host or endpoint
was down in your selected timespan.

[role="screenshot"]
image::images/monitor-charts.png[Monitor charts]

The Monitor duration chart displays request duration information for your monitor.
The area surrounding the line is the range of request time for the corresponding
bucket. The line is the average time. In the upper right-hand corner of this panel,
you can enable Anomaly detection using Machine Learning. When response times change
in an unexpected way, the time range in which they occurred is highlighted with a color.
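
The relationship between the shaded area and the line can be sketched as a per-bucket aggregation: the area spans the minimum and maximum request duration in each bucket, and the line follows the average. The function name, the input layout, and the fixed bucket width are assumptions for illustration; the app computes its buckets in Elasticsearch.

```python
from collections import defaultdict


def duration_buckets(samples, bucket_seconds):
    """Group (timestamp_seconds, duration_ms) samples into fixed-width buckets.

    Returns a dict keyed by bucket start time with the min, max, and
    average duration of the samples in that bucket.
    """
    grouped = defaultdict(list)
    for timestamp, duration in samples:
        bucket_start = (timestamp // bucket_seconds) * bucket_seconds
        grouped[bucket_start].append(duration)
    return {
        start: {
            "min": min(values),                 # lower edge of the area
            "max": max(values),                 # upper edge of the area
            "avg": sum(values) / len(values),   # the line
        }
        for start, values in sorted(grouped.items())
    }
```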

The pings over time chart is a graphical representation of the check statuses over time.
Hover over the charts to display crosshairs with specific numeric data.

[role="screenshot"]
image::images/crosshair-example.png[Chart crosshair]

[float]
==== Check history

The Check history table lists the total count of this monitor's checks for the selected
date range. To help find recent problems on a per-check basis, you can filter the checks
by status and location. This table can help you gain some insight into more granular details
about recent individual data points that Heartbeat is logging about your host or endpoint.
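
Filtering by status and location amounts to applying two optional conditions to each check. The sketch below assumes each check is a dictionary with hypothetical `status` and `location` keys, which is not the Heartbeat document schema.

```python
from typing import Optional


def filter_checks(checks, status: Optional[str] = None,
                  location: Optional[str] = None):
    """Return the checks matching the given status and location.

    A filter left as None is not applied.
    """
    return [
        check for check in checks
        if (status is None or check["status"] == status)
        and (location is None or check["location"] == location)
    ]
```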

[role="screenshot"]
image::images/check-history.png[Check history view]
16 changes: 10 additions & 6 deletions docs/uptime-guide/overview.asciidoc
Expand Up @@ -2,10 +2,14 @@
[[uptime-overview]]
== Elastic Uptime overview

Elastic Uptime allows you to monitor the availability and response times of applications and services in real time and to detect problems before they affect users.
++++
<titleabbrev>Overview</titleabbrev>
++++

Elastic Uptime can help you to understand uptime and response time characteristics for your services and applications.
It can be deployed both inside and outside your organization's network, so you can analyze problems from multiple vantage points.
Elastic Uptime enables you to monitor the availability and response times of applications and services in real time and to detect problems before they affect users.

Elastic Uptime helps you to understand uptime and response time characteristics for your services and applications.
It can be deployed both inside and outside your organization's network, so that you can analyze problems from multiple vantage points.

Elastic Uptime uses these components: *Heartbeat*, *Elasticsearch* and *Kibana*.

Expand Down Expand Up @@ -37,17 +41,17 @@ The {kibana-ref}/xpack-uptime.html[Elasticsearch Uptime app] in Kibana provides
// ++ In diagram, should be Uptime app, not Uptime UI, possibly even Elastic Uptime? Also applies to Metrics/Logging/APM.
// ++ Need more whitespace around components.

image::images/uptime-simple-deployment.png[Uptime simple deployment]

In this simple deployment, a single instance of Heartbeat is deployed at a single monitoring location to monitor a single service.
The Heartbeat instance sends the monitoring data to Elasticsearch.
Then you can use the Uptime app in Kibana to view the data from Heartbeat and determine the status of the service.

image::images/uptime-multi-deployment.png[Uptime multiple server deployment]
image::images/uptime-simple-deployment.png[Uptime simple deployment]

In this deployment, two instances of Heartbeat are deployed at two different monitoring locations.
Both instances monitor the same service.
The Heartbeat instances send the monitoring data to Elasticsearch.
As before, you can use the Uptime app in Kibana to view the Heartbeat data and determine the status of the service.
When a failure occurs, the multiple monitoring locations enable you to pinpoint the area in which the failure has occurred.

image::images/uptime-multi-deployment.png[Uptime multiple server deployment]
