
[BUG] Unable to see Shard* Metrics in performance-analyzer API #333

Closed
craph opened this issue Nov 10, 2022 · 5 comments
Assignees
Labels
bug Something isn't working

Comments


craph commented Nov 10, 2022

Hello,

I'm opening this issue because I'm seeing some strange behaviour with performance-analyzer.

Previously, with opendistro-for-elasticsearch:1.13.3, I was able to see metrics right after starting a Docker container and enabling the performance-analyzer plugin.

What is the bug?
Just after setting up OpenSearch with Docker for benchmarking, I'm unable to visualize the Shard* metrics with PerformanceAnalyzer.

How can one reproduce the bug?
Steps to reproduce the behavior:

  1. Create a docker-compose.yml file with this content:
version: '3'
services:
  opensearch-node1: # This is also the hostname of the container within the Docker network (i.e. https://opensearch-node1/)
    image: opensearchproject/opensearch:latest # Specifying the latest available image - modify if you want a specific version
    container_name: opensearch-node1
    environment:
      - cluster.name=opensearch-cluster # Name the cluster
      - node.name=opensearch-node1 # Name the node that will run in this container
      - discovery.type=single-node
      - plugins.security.ssl.http.enabled=false
      - bootstrap.memory_lock=true # Disable JVM heap memory swapping
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # Set min and max JVM heap sizes to at least 50% of system RAM
    ulimits:
      memlock:
        soft: -1 # Set memlock to unlimited (no soft or hard limit)
        hard: -1
      nofile:
        soft: 65536 # Maximum number of open files for the opensearch user - set to at least 65536
        hard: 65536
    ports:
      - 9200:9200 # REST API
      - 9600:9600 # Performance Analyzer
    networks:
      - opensearch-net # All of the containers will join the same Docker bridge network
  opensearch-dashboards:
    image: opensearchproject/opensearch-dashboards:latest # Make sure the version of opensearch-dashboards matches the version of opensearch installed on other nodes
    container_name: opensearch-dashboards
    ports:
      - 5601:5601 # Map host port 5601 to container port 5601
    expose:
      - "5601" # Expose port 5601 for web access to OpenSearch Dashboards
    environment:
      OPENSEARCH_HOSTS: '["http://opensearch-node1:9200"]' # Define the OpenSearch node that OpenSearch Dashboards will query (single-node setup, so only node1)
    networks:
      - opensearch-net

volumes:
  opensearch-data1:

networks:
  opensearch-net:
  2. Start the containers with the command: docker-compose up -d
  3. Then enable the performance-analyzer plugin as mentioned here: https://opensearch.org/docs/2.3/monitoring-plugins/pa/index/#install-performance-analyzer
curl -XPOST localhost:9200/_plugins/_performanceanalyzer/cluster/config -H 'Content-Type: application/json' -d '{"enabled": true}' -u 'admin:admin'
  4. Normally at this step the performance analyzer should be up and running.
  5. Create an index and documents with the Documents API bulk operation:
POST _bulk
{ "index": { "_index": "movies", "_id": "tt1979320" } }
{ "title": "Rush", "year": 2013 }
{ "index": { "_index": "movies", "_id": "tt1979322" } }
{ "title": "Rush2", "year": 2013 }
{ "index": { "_index": "movies", "_id": "tt1979323" } }
{ "title": "Rush3", "year": 2013 }
{ "index": { "_index": "movies", "_id": "tt1979324" } }
{ "title": "Rush4", "year": 2013 }
{ "index": { "_index": "movies", "_id": "tt1979325" } }
{ "title": "Rush5", "year": 2013 } 
  6. Here is the output of the creation:
{"took":187,"errors":false,"items":[{"index":{"_index":"movies","_id":"tt1979320","_version":1,"result":"created","forced_refresh":true,"_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1,"status":201}},{"index":{"_index":"movies","_id":"tt1979322","_version":1,"result":"created","forced_refresh":true,"_shards":{"total":2,"successful":1,"failed":0},"_seq_no":1,"_primary_term":1,"status":201}},{"index":{"_index":"movies","_id":"tt1979323","_version":1,"result":"created","forced_refresh":true,"_shards":{"total":2,"successful":1,"failed":0},"_seq_no":2,"_primary_term":1,"status":201}},{"index":{"_index":"movies","_id":"tt1979324","_version":1,"result":"created","forced_refresh":true,"_shards":{"total":2,"successful":1,"failed":0},"_seq_no":3,"_primary_term":1,"status":201}},{"index":{"_index":"movies","_id":"tt1979325","_version":1,"result":"created","forced_refresh":true,"_shards":{"total":2,"successful":1,"failed":0},"_seq_no":4,"_primary_term":1,"status":201}}]}
  7. Then try to get the Shard* metrics from performance-analyzer like this:
GET localhost:9600/_plugins/_performanceanalyzer/metrics?metrics=ShardEvents,ShardBulkDocs,CPU_Utilization&dim=Operation&nodes=all&agg=sum,sum,max
  8. Here is the output:
{
    "90CFxK7zTi28rLkUpQiSpA": {
        "timestamp": 1668067125000,
        "data": {
            "fields": [
                {
                    "name": "Operation",
                    "type": "VARCHAR"
                },
                {
                    "name": "ShardEvents",
                    "type": "DOUBLE"
                },
                {
                    "name": "ShardBulkDocs",
                    "type": "DOUBLE"
                },
                {
                    "name": "CPU_Utilization",
                    "type": "DOUBLE"
                }
            ],
            "records": [
                [
                    "GC",
                    null,
                    null,
                    0.0
                ],
                [
                    "flush",
                    null,
                    null,
                    0.0
                ],
                [
                    "generic",
                    null,
                    null,
                    0.0
                ],
                [
                    "get",
                    null,
                    null,
                    0.0
                ],
                [
                    "management",
                    null,
                    null,
                    0.0
                ],
                [
                    "other",
                    null,
                    null,
                    0.0020006279656433657
                ],
                [
                    "refresh",
                    null,
                    null,
                    0.0
                ],
                [
                    "search",
                    null,
                    null,
                    0.0
                ],
                [
                    "shardbulk",
                    null,
                    null,
                    0.0
                ],
                [
                    "transportWorker",
                    null,
                    null,
                    0.0
                ],
                [
                    "write",
                    null,
                    null,
                    0.0
                ]
            ]
        }
    }
}
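To make the symptom concrete, here is a small sketch (mine, not part of the original report; the helper name all_null_columns is made up) that parses a response of the shape above, using a trimmed copy of the records, and lists which metric columns came back entirely null:

```python
import json

# Trimmed copy of the response above (same shape, fewer records).
response = json.loads("""
{
  "90CFxK7zTi28rLkUpQiSpA": {
    "timestamp": 1668067125000,
    "data": {
      "fields": [
        {"name": "Operation", "type": "VARCHAR"},
        {"name": "ShardEvents", "type": "DOUBLE"},
        {"name": "ShardBulkDocs", "type": "DOUBLE"},
        {"name": "CPU_Utilization", "type": "DOUBLE"}
      ],
      "records": [
        ["other", null, null, 0.0020006279656433657],
        ["shardbulk", null, null, 0.0],
        ["write", null, null, 0.0]
      ]
    }
  }
}
""")

def all_null_columns(node_payload):
    """Names of metric columns that are null in every record for one node."""
    names = [f["name"] for f in node_payload["data"]["fields"]]
    records = node_payload["data"]["records"]
    return [name for i, name in enumerate(names)
            if name != "Operation" and all(rec[i] is None for rec in records)]

for node_id, payload in response.items():
    print(node_id, all_null_columns(payload))  # flags ShardEvents and ShardBulkDocs
```

Run against the full response pasted above, this flags exactly the two Shard* columns as all-null.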

What is the expected behavior?
I would expect to see actual values instead of null for the Shard* metrics.
Do you have any advice on how to view those metrics?

For example, with opendistro-for-elasticsearch:1.13.3 I get this output:

GET http://localhost:9600/_opendistro/_performanceanalyzer/metrics?metrics=Latency,CPU_Utilization,ShardBulkDocs,ShardEvents&agg=avg,max,sum,sum&dim=Operation&nodes=all
{
    "7LlEWVNVSL28JkYmWc8YqA": {
        "timestamp": 1668075670000,
        "data": {
            "fields": [
                {
                    "name": "Operation",
                    "type": "VARCHAR"
                },
                {
                    "name": "Latency",
                    "type": "DOUBLE"
                },
                {
                    "name": "CPU_Utilization",
                    "type": "DOUBLE"
                },
                {
                    "name": "ShardBulkDocs",
                    "type": "DOUBLE"
                },
                {
                    "name": "ShardEvents",
                    "type": "DOUBLE"
                }
            ],
            "records": [
                [
                    "GC",
                    null,
                    0.0,
                    null,
                    null
                ],
                [
                    "bulk",
                    10.5,
                    null,
                    null,
                    null
                ],
                [
                    "flush",
                    null,
                    0.0,
                    null,
                    null
                ],
                [
                    "generic",
                    null,
                    0.0,
                    null,
                    null
                ],
                [
                    "management",
                    null,
                    0.0,
                    null,
                    null
                ],
                [
                    "other",
                    null,
                    0.0019996000799840027,
                    null,
                    null
                ],
                [
                    "refresh",
                    null,
                    0.0,
                    null,
                    null
                ],
                [
                    "shardbulk",
                    10.0,
                    0.0,
                    2.0,
                    2.0
                ]
            ]
        }
    }
}
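For contrast, a quick sketch (mine; the data is a trimmed copy of the 1.13.3 records above) showing how to pick out the operations that actually report Shard* values — in the old output only shardbulk does:

```python
# Field order mirrors the 1.13.3 response above; records are a trimmed copy.
fields = ["Operation", "Latency", "CPU_Utilization", "ShardBulkDocs", "ShardEvents"]
records = [
    ["GC", None, 0.0, None, None],
    ["bulk", 10.5, None, None, None],
    ["other", None, 0.0019996000799840027, None, None],
    ["shardbulk", 10.0, 0.0, 2.0, 2.0],
]

# Columns whose names start with "Shard", and the operations reporting them.
shard_cols = [i for i, name in enumerate(fields) if name.startswith("Shard")]
reporting = [rec[0] for rec in records if any(rec[i] is not None for i in shard_cols)]
print(reporting)  # -> ['shardbulk']
```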

What is your host/environment?

  • OS: Docker
  • Version: OpenSearch latest (2.3.0)
  • Plugins: the ones embedded by default in the Docker image

Thank you very much for your help.
Best regards,

@craph craph added bug Something isn't working untriaged labels Nov 10, 2022

craph commented Nov 10, 2022

@dblock, is this something you are already aware of? 🤔


dblock commented Nov 11, 2022

@craph I am not, let's move this to the performance-analyzer repo.

@dblock dblock transferred this issue from opensearch-project/OpenSearch Nov 11, 2022

craph commented Nov 14, 2022

Thank you very much @dblock for transferring the issue.
But is it really a performance-analyzer repo issue, and not an OpenSearch issue?

As I mentioned in the issue, with OpenDistro 1.13.3 I did not see this behavior/issue. And I haven't seen any major changes in performance-analyzer or performance-analyzer-rca. I noticed that all the namespaces changed because of the OpenDistro -> OpenSearch migration... so I was guessing the changes are in OpenSearch.
As a clue: if I comment out the where clause in the fetchLatency() method from https://github.com/opensearch-project/performance-analyzer-rca/blob/main/src/main/java/org/opensearch/performanceanalyzer/reader/ShardRequestMetricsSnapshot.java#L179
I'm able to see some values, so I suppose that shows the issue is elsewhere? I'm not really sure.
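To illustrate the kind of effect I mean — this is a toy sketch of mine, not the actual ShardRequestMetricsSnapshot code, and the table and column names are made up — a strict where clause that requires both a start and an end event returns nothing when the end events were never recorded, while dropping the clause at least surfaces the rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shard_events (op TEXT, st INTEGER, et INTEGER)")
# Simulate shardbulk events whose end timestamps were never recorded.
conn.executemany(
    "INSERT INTO shard_events VALUES (?, ?, ?)",
    [("shardbulk", 100, None), ("shardbulk", 120, None)],
)

# With the guard (analogous to the where clause in fetchLatency), nothing matches:
strict = conn.execute(
    "SELECT op, AVG(et - st) FROM shard_events "
    "WHERE st IS NOT NULL AND et IS NOT NULL GROUP BY op"
).fetchall()

# Without it, the rows at least show up (with NULL latencies):
loose = conn.execute(
    "SELECT op, AVG(et - st) FROM shard_events GROUP BY op"
).fetchall()

print(strict)  # -> []
print(loose)   # -> [('shardbulk', None)]
```

So if the end-of-operation events are never written in 2.x, a filter like that would explain seeing only nulls.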

I don't know how to debug further in OpenSearch or the performance-analyzer plugin.

Do you have any advice for me?

Thank you very much.


craph commented Nov 15, 2022

@dblock it looks like the initial code has now moved to the performance-analyzer-rca repo and has been reworked from what it was initially.

Maybe it's an issue with the rework.


sgup432 commented Mar 14, 2023

This was taken care of as part of this PR: opensearch-project/performance-analyzer-rca#283.
Closing this issue.

@sgup432 sgup432 closed this as completed Mar 14, 2023