Adding documentation for running sample workloads #389

Closed
84 changes: 80 additions & 4 deletions DEVELOPER_GUIDE.md
@@ -24,6 +24,13 @@ This document will walk you through on what's needed to start contributing code
- **Pyenv** : Install `pyenv` and follow the instructions in the output of `pyenv init` to set up your shell and restart it before proceeding.
For more details please refer to the [PyEnv installation instructions](https://github.com/pyenv/pyenv#installation).

Install the following modules to continue with the next steps:
```
sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev \
libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev \
xz-utils tk-dev libffi-dev liblzma-dev git
```

- **JDK**: Although OSB is a Python application, it optionally builds and provisions OpenSearch clusters. JDK version 17 is used to build the current version of OpenSearch. Please refer to the [build setup requirements](https://github.com/opensearch-project/OpenSearch/blob/ca564fd04f5059cf9e3ce8aba442575afb3d99f1/DEVELOPER_GUIDE.md#install-prerequisites).
Note that the `javadoc` executable should be available in the JDK installation. An earlier version of the JDK can be used, but not all the integration tests will pass.

@@ -38,7 +45,9 @@ This document will walk you through on what's needed to start contributing code

### Setup

To develop OSB properly, it is recommended that you fork the official OpenSearch Benchmark repository.
To develop OSB properly, it is recommended that you fork the official OpenSearch Benchmark repository.

For those working on WSL2, it is recommended to clone the repository and set up the working environment within the Linux subsystem. Refer to the guide for setting up WSL2 on [Visual Studio Code](https://code.visualstudio.com/docs/remote/wsl) or [PyCharm](https://www.jetbrains.com/help/pycharm/using-wsl-as-a-remote-interpreter.html#create-wsl-interpreter).

After cloning your fork of OpenSearch Benchmark, use the following command-line instructions to set up OpenSearch Benchmark for development:
```
@@ -74,6 +83,74 @@ This is typically created in PyCharm IDE by visiting the `Python Interpreter`, s
`
In order to run tests within the PyCharm IDE, ensure the `Python Integrated Tools` / `Testing` / `Default Test Runner` is set to `pytest`.

## Running Workloads

### Installation

Download the latest release of OpenSearch from https://opensearch.org/downloads.html. If you are using WSL, make sure to download it into your `/home/<user>` directory instead of `/mnt/c`.
```
wget https://artifacts.opensearch.org/releases/bundle/opensearch/<x.x.x>/opensearch-<x.x.x>-linux-x64.tar.gz
tar -xf opensearch-<x.x.x>-linux-x64.tar.gz
cd opensearch-<x.x.x>
```
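As a rough sketch, the download URL can be derived from a version string, and a target path can be checked against the `/mnt/<drive>` warning above. The helper names `bundle_url` and `is_on_windows_mount` are hypothetical and not part of OSB; the URL pattern is copied from the `wget` command.

```python
from pathlib import PurePosixPath

# Base URL copied from the wget command above; helper names are hypothetical.
BUNDLE_BASE = "https://artifacts.opensearch.org/releases/bundle/opensearch"


def bundle_url(version: str) -> str:
    """Return the Linux x64 tarball URL for the given OpenSearch version."""
    return f"{BUNDLE_BASE}/{version}/opensearch-{version}-linux-x64.tar.gz"


def is_on_windows_mount(path: str) -> bool:
    """Return True if a WSL path lives under /mnt/<drive>, such as /mnt/c/..."""
    parts = PurePosixPath(path).parts
    return (
        len(parts) >= 3
        and parts[0] == "/"
        and parts[1] == "mnt"
        and len(parts[2]) == 1
    )
```

For example, `is_on_windows_mount("/mnt/c/Users/me/opensearch")` flags a location that the note above advises against, while a path under `/home/<user>` passes.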
NOTE: Have Docker running in the background for the next steps. Refer to the installation instructions [here](https://docs.docker.com/compose/install/).

### Setup

Add the following settings to the `opensearch.yml` file under the `config` directory:
```
vim config/opensearch.yml
```
```
#
discovery.type: single-node
plugins.security.disabled: true
#
```
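If you prefer to script this edit instead of using `vim`, the following minimal sketch appends the two settings only when they are missing. It assumes the file is a flat list of `key: value` lines, which is a simplification of YAML, and the names `ensure_settings` and `DEV_SETTINGS` are hypothetical.

```python
def ensure_settings(config_text: str, settings: dict) -> str:
    """Append any `key: value` lines that are missing from the config text."""
    present = set()
    for line in config_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments; only flat `key: value` lines are recognized.
        if stripped and not stripped.startswith("#") and ":" in stripped:
            present.add(stripped.split(":", 1)[0].strip())
    missing = [f"{key}: {value}" for key, value in settings.items() if key not in present]
    if not missing:
        return config_text
    # Make sure the appended lines start on a fresh line.
    separator = "" if (not config_text or config_text.endswith("\n")) else "\n"
    return config_text + separator + "\n".join(missing) + "\n"


# The two settings listed above.
DEV_SETTINGS = {
    "discovery.type": "single-node",
    "plugins.security.disabled": "true",
}
```

Reading `config/opensearch.yml`, passing its text through `ensure_settings(text, DEV_SETTINGS)`, and writing the result back is idempotent, so running it twice leaves the file unchanged.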
Run the `opensearch-tar-install.sh` script to install and set up a cluster for our use.
```
bash opensearch-tar-install.sh
```
Check the output of `curl "http://localhost:9200/_cluster/health?pretty"` (use `curl.exe` instead when running the command from Windows PowerShell). The output should be similar to this:
```
{
"cluster_name" : "<name>",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"discovered_master" : true,
"discovered_cluster_manager" : true,
"active_primary_shards" : 3,
"active_shards" : 3,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
```
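The same check can be scripted. This is a minimal sketch using only the standard library; the function names are hypothetical, and treating a `yellow` status as acceptable is an assumption made for single-node development clusters, not a rule stated in this guide.

```python
import json
import urllib.request

# Same endpoint as the curl command above.
HEALTH_URL = "http://localhost:9200/_cluster/health?pretty"


def fetch_health(url: str = HEALTH_URL, timeout: float = 5.0) -> dict:
    """Fetch the cluster health document and decode it into a dict."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def is_ready(health: dict) -> bool:
    """Ready means: the request did not time out, status is not red, and a node is present."""
    return (
        not health.get("timed_out", True)
        and health.get("status") in ("green", "yellow")  # assumption: yellow is fine locally
        and health.get("number_of_nodes", 0) >= 1
    )
```

Calling `is_ready(fetch_health())` before starting a workload gives a quick yes or no answer instead of reading the JSON by eye.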
Now, you have a local cluster running! You can connect to this and run the workload for the next step.

### Running the workload

Here's a sample execution of the geonames benchmark, which can be found in the [workloads](https://github.com/opensearch-project/opensearch-benchmark-workloads) repo.
```
opensearch-benchmark execute-test --pipeline=benchmark-only --workload=geonames --target-host=127.0.0.1:9200 --test-mode --workload-params '{"number_of_shards":"1","number_of_replicas":"0"}'
```
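Hand-writing the JSON for `--workload-params` is easy to get wrong, so here is a sketch that assembles the same command from Python values. It uses only the flags shown in the command above; the helper name `build_execute_test_command` is hypothetical.

```python
import json
import shlex


def build_execute_test_command(workload, target_host, workload_params, test_mode=True):
    """Build the argument list for the execute-test invocation shown above."""
    command = [
        "opensearch-benchmark",
        "execute-test",
        "--pipeline=benchmark-only",
        f"--workload={workload}",
        f"--target-host={target_host}",
    ]
    if test_mode:
        command.append("--test-mode")
    # Compact separators reproduce the JSON exactly as written in the guide.
    command += ["--workload-params", json.dumps(workload_params, separators=(",", ":"))]
    return command


command = build_execute_test_command(
    "geonames",
    "127.0.0.1:9200",
    {"number_of_shards": "1", "number_of_replicas": "0"},
)
shell_line = shlex.join(command)  # quoted form, safe to paste into a shell
```

The list form can be passed directly to `subprocess.run(command, check=True)`, which avoids shell quoting issues altogether.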

And we're done! You should be seeing the performance metrics soon enough!

### Debugging

**If you are not seeing any results, there is likely an issue with your cluster setup or with the way the manager is accessing it.** Use the command below to view the logs.
```
tail -f ~/.benchmark/logs/benchmark.log
```
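When the log is long, filtering it can be quicker than tailing it. The sketch below assumes the log lives at `~/.benchmark/logs/benchmark.log` and that problem lines contain the level names `ERROR` or `WARNING`; both the log format and the helper name `recent_problems` are assumptions.

```python
from collections import deque
from pathlib import Path

# Assumed log location; adjust if your installation writes elsewhere.
LOG_PATH = Path.home() / ".benchmark" / "logs" / "benchmark.log"


def recent_problems(lines, levels=("ERROR", "WARNING"), limit=20):
    """Return the last `limit` lines that mention one of the given level names."""
    matches = deque(maxlen=limit)  # keeps only the most recent matches
    for line in lines:
        if any(level in line for level in levels):
            matches.append(line.rstrip("\n"))
    return list(matches)
```

A typical use is `with LOG_PATH.open() as log: print("\n".join(recent_problems(log)))`, which prints the most recent warnings and errors.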

## Executing tests

Once setup is complete, you may run the unit and integration tests.
@@ -87,10 +164,10 @@ make test

### Integration Tests

Integration tests can be run on the following operating systems:
Integration tests are expected to run for approximately **20-30 mins** and can be run on the following operating systems:
* RedHat
* CentOS
* Ubuntu
* Ubuntu (and WSL)
* Amazon Linux 2
* MacOS

@@ -100,7 +177,6 @@ Invoke integration tests by running the following command within the root direct
make it
```

Integration tests are expected to run for approximately 20-30 mins.

## Submitting your changes for a pull request

38 changes: 0 additions & 38 deletions it/generate_test.py

This file was deleted.

4 changes: 0 additions & 4 deletions osbenchmark/__init__.py
@@ -26,10 +26,6 @@
import sys
import urllib

import pkg_resources

__version__ = pkg_resources.require("opensearch-benchmark")[0].version

# Allow an alternative program name be set in case Benchmark is invoked a wrapper script
PROGRAM_NAME = os.getenv("BENCHMARK_ALTERNATIVE_BINARY_NAME", os.path.basename(sys.argv[0]))

38 changes: 2 additions & 36 deletions osbenchmark/benchmark.py
@@ -36,7 +36,7 @@
from osbenchmark import PROGRAM_NAME, BANNER, FORUM_LINK, SKULL, check_python_version, doc_link, telemetry
from osbenchmark import version, actor, config, paths, \
test_execution_orchestrator, results_publisher, \
metrics, workload, chart_generator, exceptions, log
metrics, workload, exceptions, log
from osbenchmark.builder import provision_config, builder
from osbenchmark.workload_generator import workload_generator
from osbenchmark.utils import io, convert, process, console, net, opts, versions
@@ -188,30 +188,6 @@ def add_workload_source(subparser):
help="Map of index name and integer doc count to extract. Ensure that index name also exists in --indices parameter. " +
"To specify several indices and doc counts, use format: <index1>:<doc_count1> <index2>:<doc_count2> ...")

generate_parser = subparsers.add_parser("generate", help="Generate artifacts")
generate_parser.add_argument(
"artifact",
metavar="artifact",
help="The artifact to create. Possible values are: charts",
choices=["charts"])
# We allow to either have a chart-spec-path *or* define a chart-spec on the fly
# with workload, test_procedure and provision_config_instance. Convincing
# argparse to validate that everything is correct *might* be doable but it is
# simpler to just do this manually.
generate_parser.add_argument(
"--chart-spec-path",
required=True,
help="Path to a JSON file(s) containing all combinations of charts to generate. Wildcard patterns can be used to specify "
"multiple files.")
generate_parser.add_argument(
"--chart-type",
help="Chart type to generate (default: time-series).",
choices=["time-series", "bar"],
default="time-series")
generate_parser.add_argument(
"--output-path",
help="Output file name (default: stdout).",
default=None)

compare_parser = subparsers.add_parser("compare", help="Compare two test_executions")
compare_parser.add_argument(
@@ -600,7 +576,7 @@ def add_workload_source(subparser):
default=False)

for p in [list_parser, test_execution_parser, compare_parser, download_parser, install_parser,
start_parser, stop_parser, info_parser, generate_parser, create_workload_parser]:
start_parser, stop_parser, info_parser, create_workload_parser]:
# This option is needed to support a separate configuration for the integration tests on the same machine
p.add_argument(
"--configuration-name",
@@ -742,11 +718,6 @@ def with_actor_system(runnable, cfg):
console.warn("Could not terminate all internal processes within timeout. Please check and force-terminate "
"all Benchmark processes.")


def generate(cfg):
chart_generator.generate(cfg)


def configure_telemetry_params(args, cfg):
cfg.add(config.Scope.applicationOverride, "telemetry", "devices", opts.csv_to_list(args.telemetry))
cfg.add(config.Scope.applicationOverride, "telemetry", "params", opts.to_dict(args.telemetry_params))
@@ -906,11 +877,6 @@ def dispatch_sub_command(arg_parser, args, cfg):
configure_results_publishing_params(args, cfg)

execute_test(cfg, args.kill_running_processes)
elif sub_command == "generate":
cfg.add(config.Scope.applicationOverride, "generator", "chart.spec.path", args.chart_spec_path)
cfg.add(config.Scope.applicationOverride, "generator", "chart.type", args.chart_type)
cfg.add(config.Scope.applicationOverride, "generator", "output.path", args.output_path)
generate(cfg)
elif sub_command == "create-workload":
cfg.add(config.Scope.applicationOverride, "generator", "indices", args.indices)
cfg.add(config.Scope.applicationOverride, "generator", "number_of_docs", args.number_of_docs)