[META] Analysis, Results Aggregation, and Reporting #102

Closed
9 tasks done
achitojha opened this issue Jan 6, 2022 · 5 comments
Labels: enhancement (New feature or request)

Comments

achitojha (Contributor) commented Jan 6, 2022

OpenSearch Benchmark currently reports various performance metrics and creates a detailed report which can be published to an OpenSearch instance. The goal here is to:

  • Create aggregations across many of these metrics to provide a summary report.
  • Publish this report to a data store (a minimal sketch of both steps follows this list).
  • Analysis: the report should include insights about the quality of the test, e.g. whether it was anomalous or whether it ran successfully.
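
As a rough illustration of the aggregation and publishing steps above, here is a minimal sketch using the opensearch-py client. The index names (benchmark-metrics-*, benchmark-summaries), field names, and test-execution ID are assumptions for illustration only, not the actual OSB metrics schema.

```python
# Hypothetical sketch: aggregate raw benchmark metric documents into a
# per-operation summary and publish it back to an OpenSearch datastore.
# Index and field names ("benchmark-metrics-*", "benchmark-summaries",
# "test-execution-id", "operation", "value") are assumptions, not the
# actual OSB schema.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

test_id = "nightly-2022-01-06"  # placeholder run identifier
query = {
    "size": 0,
    "query": {"term": {"test-execution-id": test_id}},
    "aggs": {
        "per_operation": {
            "terms": {"field": "operation"},
            "aggs": {
                "latency_p90": {
                    "percentiles": {"field": "value", "percents": [90]}
                }
            },
        }
    },
}
resp = client.search(index="benchmark-metrics-*", body=query)

# Build a compact summary document from the aggregation buckets.
summary = {
    "test-execution-id": test_id,
    "operations": [
        {
            "operation": bucket["key"],
            "latency_p90_ms": bucket["latency_p90"]["values"]["90.0"],
        }
        for bucket in resp["aggregations"]["per_operation"]["buckets"]
    ],
}
client.index(index="benchmark-summaries", body=summary)
```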

Acceptance Criteria

  • We have a comprehensive dashboard that records the performance stats on a regular basis and that can be reproduced by anyone (e.g., using CDK to create the entire stack)
  • We can view associated PRs and other changes corresponding to a nightly build run
  • We can view aggregate stats of performance test runs at any point in time
  • We can drill into the raw stats for a specific performance run at any point in time for deeper analysis
  • The dashboard is updated automatically after every performance test run
  • The dashboard is accessible by everyone (read-only access enabled for anonymous users)
  • Automated alerting is enabled to report on regressions/anomalies
  • Admin users have the ability to create new dashboards/visualizations as needed (integrate with OIDC?)
  • The dashboard is maintained on a regular basis (up-to-date patches, upgrades, etc.)
bbarani (Member) commented Feb 8, 2023

We should create a comprehensive reporting engine that can provide an aggregated summary for offline performance test runs along with a build-related performance summary (including the merged PRs, commits, etc.) as needed. We should be able to drill in and view the specifics, with automated monitoring and alerting support.

bbarani changed the title from "Analysis, Results Aggregation, and Reporting" to "[META] Analysis, Results Aggregation, and Reporting" on Feb 10, 2023
bbarani added the enhancement (New feature or request) label on Feb 10, 2023
rishabh6788 (Collaborator) commented Apr 14, 2023

We are proposing an architecture similar to the one currently in use to run nightly benchmarks; see here for the existing logic.

  • Establish VPC peering between the Jenkins VPC and the testing account VPC in which the OS clusters will be spun up.
  • Use the opensearch-cluster-cdk package to spin up multiple OS clusters with varying configurations. Currently this is done using another repo, which is private, supports only single-node clusters, and doesn't provide the flexibility to add new config at run time.
  • Use the opensearch-benchmark Docker image to run the OSB process on the Jenkins agent node.
  • As in the current scenario, immutable parameters such as the network config, testing AWS account, and region for setting up the OS cluster will be fetched from S3, and the remaining parameters to configure the cluster will be provided by the user.
  • Use an internal NLB to resolve the OS cluster; this gives us the advantage of running benchmarks against single- and multi-node clusters. Internal NLBs can be reached in a VPC peering scenario by allowing the CIDR of the peered VPC in the target EC2 security group (a CDK sketch of this rule follows this list).
  • Use a managed OS cluster as the datastore for performance metrics. These metrics are emitted by the OSB coordinator node running on the Jenkins agent node; therefore, the OS cluster needs to be accessible by the agent node.
  • We propose spinning up the managed OS cluster in the Jenkins infra account and in the same VPC as the Jenkins infra so that it is easily accessible by the agent node.
  • Once the nightly performance benchmark data starts flowing, we will work with the Dashboards and UX teams to create analyses and dashboards on performance run metrics, and open them to the public with read-only access so users can view and compare against their own benchmarking runs.
    [Architecture diagram: vpc-peering]
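
As a rough illustration of the security-group rule mentioned in the NLB bullet above, here is a minimal AWS CDK (Python) sketch. The VPC ID, CIDR, and port are placeholders, and this is not the actual opensearch-cluster-cdk code.

```python
# Hypothetical sketch: allow traffic from the peered Jenkins VPC CIDR so the
# OS cluster nodes behind the internal NLB can be reached across the peering
# connection. VPC ID, CIDR, and port are placeholders.
from aws_cdk import Stack, aws_ec2 as ec2
from constructs import Construct


class ClusterSecurityStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Look up the existing testing-account VPC (requires account/region
        # to be set in the stack env at synth time).
        vpc = ec2.Vpc.from_lookup(self, "TestingVpc", vpc_id="vpc-0123456789abcdef0")

        cluster_sg = ec2.SecurityGroup(
            self,
            "OsClusterSg",
            vpc=vpc,
            description="OS cluster nodes behind the internal NLB",
        )
        # Internal NLBs preserve the client IP, so the target security group
        # must allow the peered (Jenkins) VPC CIDR rather than the NLB itself.
        cluster_sg.add_ingress_rule(
            peer=ec2.Peer.ipv4("10.1.0.0/16"),  # Jenkins VPC CIDR (placeholder)
            connection=ec2.Port.tcp(9200),      # OpenSearch HTTP port
        )
```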

rishabh6788 (Collaborator) commented

The P0 iteration of the product has been released and the dashboards are now available at opensearch.org/benchmarks.
The features include:

  1. Run performance tests against single-node and multi-node clusters.
  2. Parameters such as 50% heap usage and the number of data, master, and ML nodes are now configurable.
  3. Experimental or additional features such as segment replication and remote store can now be benchmarked nightly.
  4. Additional metadata tags can be added to performance metrics to generate dedicated visualizations for different use cases (a query sketch follows this list).
  5. The datastore is a self-managed OpenSearch cluster with the Dashboards server exposed to the public.
  6. Admins can create and update dashboards.
  7. Anonymous access is enabled so that anyone has read access to the dashboards and metric data for further analysis.
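
As an illustration of item 4, a dedicated visualization could be backed by a query that filters on a metadata tag. The sketch below uses opensearch-py; the index name, tag field, and metric fields are assumed for illustration and do not reflect the actual datastore schema.

```python
# Hypothetical sketch: nightly p90 latency trend for one tagged use case.
# Index and field names ("benchmark-metrics-*", "user-tags.use-case",
# "operation", "value", "test-execution-timestamp") are assumptions.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

query = {
    "size": 0,
    "query": {
        "bool": {
            "filter": [
                {"term": {"user-tags.use-case": "segment-replication"}},
                {"term": {"operation": "index-append"}},
            ]
        }
    },
    "aggs": {
        "nightly": {
            "date_histogram": {
                "field": "test-execution-timestamp",
                "calendar_interval": "day",
            },
            "aggs": {
                "latency_p90": {
                    "percentiles": {"field": "value", "percents": [90]}
                }
            },
        }
    },
}
resp = client.search(index="benchmark-metrics-*", body=query)
for bucket in resp["aggregations"]["nightly"]["buckets"]:
    print(bucket["key_as_string"], bucket["latency_p90"]["values"]["90.0"])
```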

rishabh6788 (Collaborator) commented

P1 goals:

  1. Add alerting to track performance regressions (a minimal detection sketch follows this list).
  2. Add support to run the nightly benchmark against the min distribution of OpenSearch.
  3. Add support to choose the EC2 instance type for data nodes and ML nodes.
  4. Annotate dashboard data points with the corresponding commits.
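
For goal 1, regression detection could be as simple as comparing the latest nightly value against a trailing baseline. Below is a minimal, hypothetical sketch; the tolerance, window size, and input values are placeholders, not the production alerting logic.

```python
# Hypothetical regression check: flag a run if its p90 latency exceeds the
# trailing mean of recent runs by more than a fixed tolerance.
from statistics import mean

TOLERANCE = 0.15  # flag regressions worse than 15% over the baseline (placeholder)


def is_regression(history_p90_ms: list, latest_p90_ms: float) -> bool:
    """Return True if the latest nightly p90 regressed against the baseline."""
    if not history_p90_ms:
        return False  # nothing to compare against yet
    baseline = mean(history_p90_ms[-7:])  # trailing window of recent runs
    return latest_p90_ms > baseline * (1 + TOLERANCE)


# Example: previous nightly p90 latencies (ms) and the latest run (made-up data).
history = [102.0, 99.5, 101.2, 100.8, 98.9, 103.1, 100.4]
latest = 121.7
if is_regression(history, latest):
    print(f"Possible regression: p90 {latest} ms vs baseline ~{mean(history[-7:]):.1f} ms")
```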

rishabh6788 (Collaborator) commented

Closing this issue. Will create issues for each of the individual tasks mentioned above.
