Adds Jaeger trace data for analytics documentation #2374

Merged
merged 35 commits
Jan 17, 2023
Changes from 24 commits
35 commits
d5c71a5
for new page how to analyze Jaeger trace data
alicejw1 Jan 11, 2023
9c5be21
remove old image
alicejw1 Jan 11, 2023
660c5e3
for new information and doc writer checklist
alicejw1 Jan 11, 2023
86cf05e
for new information and doc writer checklist
alicejw1 Jan 11, 2023
4bfa459
small rewrite
alicejw1 Jan 11, 2023
01e30ff
new clean images from Dashboards URL directly
alicejw1 Jan 11, 2023
cdd04d6
for additional information
alicejw1 Jan 11, 2023
c320d66
remove blank lines
alicejw1 Jan 11, 2023
b1c1551
for tech review feedback updates
alicejw1 Jan 11, 2023
87241b7
add requirements section
alicejw1 Jan 11, 2023
35f3784
for new procedure
alicejw1 Jan 11, 2023
3c4b4d7
for tech review feedback updates
alicejw1 Jan 11, 2023
1fd987b
continued updates
alicejw1 Jan 12, 2023
0bac730
for docker compose file instructions
alicejw1 Jan 12, 2023
d1b5c5a
for docker usage instruction
alicejw1 Jan 12, 2023
2af0f4e
for step 2 view dashboards
alicejw1 Jan 12, 2023
3cbbc61
for additional link provided in tech review
alicejw1 Jan 12, 2023
1d34e2d
for link to index page to introduce the feature
alicejw1 Jan 12, 2023
7ed601b
final checklist
alicejw1 Jan 12, 2023
ae75639
add warning not to use sample file in prod env
alicejw1 Jan 12, 2023
5220a61
updated docker file that is safe for prod env, remove warning note fo…
alicejw1 Jan 12, 2023
8042c96
for small update to parent page
alicejw1 Jan 12, 2023
0bd55fb
for tech review
alicejw1 Jan 12, 2023
f6a9757
typo fix for font
alicejw1 Jan 12, 2023
6fb5d02
for doc review #1 feedback updates
alicejw1 Jan 12, 2023
7e14e4d
for doc review feedback #2 updates
alicejw1 Jan 12, 2023
2b18d99
for a couple minor changes
alicejw1 Jan 12, 2023
fa8efe1
spell out dashboard URI directly to trace analytics for accessibility…
alicejw1 Jan 12, 2023
cfd0152
need to add additional step from eng to generate sample data
alicejw1 Jan 12, 2023
9bd9903
for additional step image of sample app
alicejw1 Jan 12, 2023
85cae77
rename step numbers
alicejw1 Jan 12, 2023
36e2b98
minor fix heading levels
alicejw1 Jan 12, 2023
360eba6
updates recommended by the editorial reviewer
alicejw1 Jan 17, 2023
7763331
clarify Spans window function
alicejw1 Jan 17, 2023
cc6f1fa
clarified individual trace details section
alicejw1 Jan 17, 2023
6 changes: 6 additions & 0 deletions _observability-plugin/trace/index.md
@@ -15,3 +15,9 @@ A single operation, such as a user clicking a button, can trigger an extended se
Trace Analytics can help you visualize this flow of events and identify performance problems.

![Detailed trace view]({{site.url}}{{site.baseurl}}/images/ta-trace.png)

## Trace Analytics with Jaeger data

The Trace Analytics functionality in the OpenSearch Observability plugin now supports Jaeger trace data. If you use OpenSearch as the backend for Jaeger trace data, you can use the built-in Trace Analytics analysis capabilities.

To set up your environment to perform Trace Analytics, see [Analyze Jaeger trace data]({{site.url}}{{site.baseurl}}/observability-plugin/trace/trace-analytics-jaeger/).
240 changes: 240 additions & 0 deletions _observability-plugin/trace/trace-analytics-jaeger.md
@@ -0,0 +1,240 @@
---
layout: default
title: Analyze Jaeger trace data
parent: Trace analytics
nav_order: 55
---

# Analyze Jaeger trace data

Introduced 2.5
{: .label .label-purple }

The Trace Analytics functionality in the OpenSearch Observability plugin now supports Jaeger trace data. If you use OpenSearch as the backend for Jaeger trace data, you can use the built-in Trace Analytics analysis capabilities. This provides support for OpenTelemetry (OTEL)-formatted trace data.

When you perform trace analytics, you can select from two data sources:

- **Data Prepper** – Trace data ingested into OpenSearch through Data Prepper.
- **Jaeger** – Trace data stored in OpenSearch, with OpenSearch serving as the Jaeger storage backend.

If you currently store your Jaeger trace data in OpenSearch, you can now use the capabilities built into Trace Analytics to analyze error rates and latency. You can also filter the traces and examine the span details of a trace to pinpoint any service issues.

When you ingest Jaeger data into OpenSearch, it is stored in a different index than the OTEL-generated index created when you run data through Data Prepper. You can use the data source selector in OpenSearch Dashboards to indicate which data source you want to use for trace analytics.

The Jaeger trace data that you can analyze includes span data as well as service and operation endpoint data. Analyzing Jaeger span data requires some configuration, described in the following sections.

By default, Jaeger data ingestion creates a separate index for each day.

To learn more about Jaeger trace data, see the [Jaeger](https://www.jaegertracing.io/) open-source documentation.
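Because of this daily rollover, the index that holds a given day's traces is predictable. The following sketch (the helper function is illustrative, not part of OpenSearch or Jaeger, and assumes Jaeger's default `YYYY-MM-DD` date suffix) shows how the daily index names are formed:

```python
from datetime import date

# Jaeger's OpenSearch/Elasticsearch storage rolls over daily by default,
# producing one span index and one service index per day.
# The date suffix here assumes Jaeger's default YYYY-MM-DD format.
def jaeger_index_names(day: date) -> tuple[str, str]:
    suffix = day.strftime("%Y-%m-%d")
    return (f"jaeger-span-{suffix}", f"jaeger-service-{suffix}")

span_index, service_index = jaeger_index_names(date(2023, 1, 17))
print(span_index)     # jaeger-span-2023-01-17
print(service_index)  # jaeger-service-2023-01-17
```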

## Data ingestion requirements

To use Trace Analytics with Jaeger data, you must configure error capture for your ingested trace data.

When Jaeger data is ingested into OpenSearch, the environment variable `ES_TAGS_AS_FIELDS_ALL` must be set to `true`. If data is not ingested in this format, error data is not captured, and errors will not be available for traces in Trace Analytics.
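For example, in a Docker Compose file this variable is set on the Jaeger collector service, as in the following fragment (the complete file appears in the setup section later on this page):

```yml
  jaeger-collector:
    image: jaegertracing/jaeger-collector:latest
    environment:
      - SPAN_STORAGE_TYPE=opensearch
      - ES_TAGS_AS_FIELDS_ALL=true  # Store tags as individual fields so that error data is captured
```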

### About data ingestion with Jaeger indexes

Trace Analytics for non-Jaeger data uses OTEL indexes that follow the naming conventions `otel-v1-apm-span-*` or `otel-v1-apm-service-map*`.

Jaeger indexes follow the naming conventions `jaeger-span-*` or `jaeger-service-*`.
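These patterns make it possible to tell the two data sources apart by index name alone. The following is a minimal sketch (the helper function is hypothetical, not part of any OpenSearch API):

```python
from fnmatch import fnmatch

# Index name patterns used by the two Trace Analytics data sources,
# per the naming conventions described above.
OTEL_PATTERNS = ("otel-v1-apm-span-*", "otel-v1-apm-service-map*")
JAEGER_PATTERNS = ("jaeger-span-*", "jaeger-service-*")

def data_source_for(index_name: str):
    """Classify an index name as 'Data Prepper', 'Jaeger', or None."""
    if any(fnmatch(index_name, p) for p in OTEL_PATTERNS):
        return "Data Prepper"
    if any(fnmatch(index_name, p) for p in JAEGER_PATTERNS):
        return "Jaeger"
    return None

print(data_source_for("jaeger-span-2023-01-17"))   # Jaeger
print(data_source_for("otel-v1-apm-span-000001"))  # Data Prepper
```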

## How to set up OpenSearch to use Jaeger data

We provide a sample Docker Compose file that contains the required configurations.

### Step 1: Create the Docker Compose file

The following Docker Compose file enables Jaeger data for trace analytics, with the `ES_TAGS_AS_FIELDS_ALL` environment variable set to `true` so that errors are added to trace data. Copy the file contents and save them as `docker-compose.yml`.

```yml
version: '3'
services:
  opensearch-node1: # This is also the hostname of the container within the Docker network (i.e. https://opensearch-node1/)
    image: opensearchproject/opensearch:latest # Specifying the latest available image - modify if you want a specific version
    container_name: opensearch-node1
    environment:
      - cluster.name=opensearch-cluster # Name the cluster
      - node.name=opensearch-node1 # Name the node that will run in this container
      - discovery.seed_hosts=opensearch-node1,opensearch-node2 # Nodes to look for when discovering the cluster
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2 # Nodes eligible to serve as cluster manager
      - bootstrap.memory_lock=true # Disable JVM heap memory swapping
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # Set min and max JVM heap sizes to at least 50% of system RAM
    ulimits:
      memlock:
        soft: -1 # Set memlock to unlimited (no soft or hard limit)
        hard: -1
      nofile:
        soft: 65536 # Maximum number of open files for the opensearch user - set to at least 65536
        hard: 65536
    volumes:
      - opensearch-data1:/usr/share/opensearch/data # Creates volume called opensearch-data1 and mounts it to the container
    ports:
      - "9200:9200"
      - "9600:9600"
    networks:
      - opensearch-net # All of the containers will join the same Docker bridge network

  opensearch-node2:
    image: opensearchproject/opensearch:latest # This should be the same image used for opensearch-node1 to avoid issues
    container_name: opensearch-node2
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node2
      - discovery.seed_hosts=opensearch-node1,opensearch-node2
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2
      - bootstrap.memory_lock=true
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - opensearch-data2:/usr/share/opensearch/data
    networks:
      - opensearch-net

  opensearch-dashboards:
    image: opensearchproject/opensearch-dashboards:latest # Make sure the version of opensearch-dashboards matches the version of opensearch installed on other nodes
    container_name: opensearch-dashboards
    ports:
      - 5601:5601 # Map host port 5601 to container port 5601
    expose:
      - "5601" # Expose port 5601 for web access to OpenSearch Dashboards
    environment:
      OPENSEARCH_HOSTS: '["https://opensearch-node1:9200","https://opensearch-node2:9200"]' # Define the OpenSearch nodes that OpenSearch Dashboards will query
    networks:
      - opensearch-net

  jaeger-collector:
    image: jaegertracing/jaeger-collector:latest
    ports:
      - "14269:14269"
      - "14268:14268"
      - "14267:14267"
      - "14250:14250"
      - "9411:9411"
    networks:
      - opensearch-net
    restart: on-failure
    environment:
      - SPAN_STORAGE_TYPE=opensearch
      - ES_TAGS_AS_FIELDS_ALL=true
      - ES_USERNAME=admin
      - ES_PASSWORD=admin
      - ES_TLS_SKIP_HOST_VERIFY=true
    command: [
      "--es.server-urls=https://opensearch-node1:9200",
      "--es.tls.enabled=true",
    ]
    depends_on:
      - opensearch-node1

  jaeger-agent:
    image: jaegertracing/jaeger-agent:latest
    hostname: jaeger-agent
    command: ["--reporter.grpc.host-port=jaeger-collector:14250"]
    ports:
      - "5775:5775/udp"
      - "6831:6831/udp"
      - "6832:6832/udp"
      - "5778:5778"
    networks:
      - opensearch-net
    restart: on-failure
    environment:
      - SPAN_STORAGE_TYPE=opensearch
    depends_on:
      - jaeger-collector

  hotrod:
    image: jaegertracing/example-hotrod:latest
    ports:
      - "8080:8080"
    command: ["all"]
    environment:
      - JAEGER_AGENT_HOST=jaeger-agent
      - JAEGER_AGENT_PORT=6831
    networks:
      - opensearch-net
    depends_on:
      - jaeger-agent

volumes:
  opensearch-data1:
  opensearch-data2:

networks:
  opensearch-net:
```

### Step 2: Start the cluster

Run the following command to deploy the Docker Compose YAML file:
```bash
docker compose up -d
```

To stop the cluster, run the following command:

```bash
docker compose down
```
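After the containers start, you can optionally confirm that the cluster is reachable. The following commands assume the demo security defaults used in the Compose file above (`admin`/`admin` credentials and a self-signed certificate):

```bash
# Check cluster health; -k skips TLS verification for the self-signed demo certificate
curl -sku admin:admin "https://localhost:9200/_cluster/health?pretty"

# After traces have been ingested, list the daily Jaeger indexes
curl -sku admin:admin "https://localhost:9200/_cat/indices/jaeger-*?v"
```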

### Step 3: View trace data in OpenSearch Dashboards

After you generate Jaeger trace data, you can view it in OpenSearch Dashboards.

In OpenSearch Dashboards, go to [Trace Analytics](http://localhost:5601/app/observability-dashboards#/trace_analytics/home).


## Use trace analytics in OpenSearch Dashboards

To analyze your Jaeger trace data in OpenSearch Dashboards, you first need to set up Trace Analytics. To get started, see [Get started with Trace Analytics]({{site.url}}{{site.baseurl}}/observability-plugin/trace/get-started/).

### Data sources

You can specify either Data Prepper or Jaeger as your data source when you perform trace analytics. In OpenSearch Dashboards, go to **Observability > Trace Analytics** and select **Jaeger**.

![Select data source]({{site.url}}{{site.baseurl}}/images/trace-analytics/select-data.png)

## Dashboard view

After you select Jaeger as the data source, you can view all of your indexed data in **Dashboard** view, including **Error rate** and **Throughput**.

### Error rate

The dashboard shows the trace error count over time, along with the top five combinations of services and operations that have a non-zero error rate.

![Error rate]({{site.url}}{{site.baseurl}}/images/trace-analytics/error-rate.png)

### Throughput

With **Throughput** selected, you can see the throughput of traces on Jaeger indexes over time.

You can select an individual trace from the **Top 5 Service and Operation Latency** list to view its detailed trace data.

![Throughput]({{site.url}}{{site.baseurl}}/images/trace-analytics/throughput.png)

You can also see the combinations of services and operations that have the highest latency.

If you select one of the **Service and Operation Name** entries and then select a trace in the **Traces** column, the service and operation are automatically added as filters.

## Traces

In **Traces**, you can see the latency and errors for the filtered service and operation for each trace ID in the list.

![Service and operation trace data]({{site.url}}{{site.baseurl}}/images/trace-analytics/service-trace-data.png)

If you select an individual trace ID, you can see more detailed information about the trace, such as the time spent by the service and each span for the service and operation. You can also view the payload received from the index in JSON format.

![Trace details]({{site.url}}{{site.baseurl}}/images/trace-analytics/trace-details.png)

## Services

You can also view error rates and latency for each individual service. Go to **Observability > Trace Analytics > Services**. In **Services**, you can see the average latency, error rate, throughput, and traces for each service in the list.

![Services list]({{site.url}}{{site.baseurl}}/images/trace-analytics/services-jaeger.png)
Binary file added images/trace-analytics/error-rate.png
Binary file added images/trace-analytics/select-data.png
Binary file added images/trace-analytics/service-trace-data.png
Binary file added images/trace-analytics/services-jaeger.png
Binary file added images/trace-analytics/throughput.png
Binary file added images/trace-analytics/trace-details.png