Add scripts to fully automate debezium/perf testing. #2109

Merged: 8 commits, Mar 22, 2022
1 change: 1 addition & 0 deletions debezium/demo/docker-compose.yml
@@ -10,6 +10,7 @@ services:
service: server
volumes:
- ../scripts:/scripts
- ./logs:/logs

web:
extends:
1 change: 1 addition & 0 deletions debezium/perf/.gitignore
@@ -0,0 +1 @@
logs
135 changes: 127 additions & 8 deletions debezium/perf/README.md
@@ -1,5 +1,4 @@
Debezium - Kafka Perf
=====================
# Debezium - Kafka Perf

The docker compose file in this directory is similar to
the one in the ../demo directory, with additional
@@ -19,8 +18,130 @@ analysis oriented, it has considerably
larger requirements than our other
feature-oriented demos.

Once the compose is running
===========================
## Additional building steps

On top of what is required for `../demo` (see
`../demo/README.md`), the automated testing
requires building the Deephaven Java client examples.
From the top-level directory of your git clone (`../..`), run:

```
./gradlew java-client-session-examples:installDist
```

## Automated testing

The script `run_experiment.sh` in this directory performs a
full test for one engine (either Deephaven or Materialize).
It will:

- Start the containers required for a particular run (and only those).
- Ensure container logs are preserved for the run.
- Load the demo code in the respective engine and sample update delays to a log file.
- Set the given pageviews per second rate, and wait a fixed amount of time thereafter for processing to settle.
- Take multiple samples of CPU and memory utilization over a defined period.
Output from `top` in batch mode is sent to a log file and later post-processed.
- Stop and "reset" the containers.

The example

```
cd debezium/perf
./run_experiment.sh dh 5000 20 10 1.0
```

will run an experiment for Deephaven (tag `dh`; use tag `mz` for Materialize) with a target rate of 5,000 pageviews per second.
It will wait 20 seconds after setting the target rate to begin sampling CPU and memory utilization using `top` in batch mode.
10 samples will be obtained, with a delay between samples of 1.0 seconds.

Example output from a run:

```
cfs@erke 12:18:20 ~/dh/oss3/deephaven-core/debezium/perf
$ ./run_experiment.sh dh 5000 20 10 1.0
About to run an experiment for dh with 5000 pageviews/s.

Actions that will be performed in this run:
1. Start compose services required for dh.
2. Execute demo in dh and setup update delay logging.
3. Set 5000 pageviews per second rate.
4. Wait 20 seconds.
5. Take 10 samples for mem and CPU utilization, 1.0 seconds between samples.
6. Stop and 'reset' (down) compose.

Running experiment.

1. Starting compose.
PERF_TAG=2022.03.22.16.18.41_UTC_dh_5000

Logs are being saved to logs/2022.03.22.16.18.41_UTC_dh_5000.

2. Running demo in dh and sampling delays.
1 compiler directives added
Table users = <new>
Table items = <new>
Table purchases = <new>
Table pageviews = <new>
Table pageviews_stg = <new>
Table purchases_by_item = <new>
Table pageviews_by_item = <new>
Table item_summary = <new>
Table top_viewed_items = <new>
Table top_converting_items = <new>
Table profile_views_per_minute_last_10 = <new>
Table profile_views = <new>
Table profile_views_enriched = <new>
Table dd_flagged_profiles = <new>
Table dd_flagged_profile_view = <new>
Table high_value_users = <new>
Table hvu_test = <new>
Table pageviews_summary = <new>

1 compiler directives added
No displayable variables updated


3. Setting pageviews per second
LOADGEN Connected.
Setting pageviews_per_second: old value was 50, new value is 5000.
Goodbye.

4. Waiting for 20 seconds.

5. Sampling top.
name=redpanda, tag=CPU_PCT, mean=84.14, samples=80.0, 84.2, 85.0, 87.0, 85.0, 82.0, 85.0, 84.0, 84.2, 85.0
name=redpanda, tag=RES_GiB, mean=0.77, samples=0.7678, 0.7698, 0.7698, 0.7698, 0.7718, 0.7718, 0.7718, 0.7718, 0.7718, 0.7776
name=deephaven, tag=CPU_PCT, mean=35.21, samples=66.7, 31.7, 28.0, 31.0, 27.0, 23.0, 46.0, 47.0, 25.7, 26.0
name=deephaven, tag=RES_GiB, mean=2.40, samples=2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4

6. Stopping and 'reset' (down) compose.
Stopping core-debezium-perf_envoy_1 ... done
Stopping core-debezium-perf_grpc-proxy_1 ... done
Stopping core-debezium-perf_loadgen_1 ... done
Stopping core-debezium-perf_debezium_1 ... done
Stopping core-debezium-perf_server_1 ... done
Stopping core-debezium-perf_redpanda_1 ... done
Stopping core-debezium-perf_mysql_1 ... done
Stopping core-debezium-perf_web_1 ... done
Removing core-debezium-perf_envoy_1 ... done
Removing core-debezium-perf_grpc-proxy_1 ... done
Removing core-debezium-perf_loadgen_1 ... done
Removing core-debezium-perf_debezium_1 ... done
Removing core-debezium-perf_server_1 ... done
Removing core-debezium-perf_redpanda_1 ... done
Removing core-debezium-perf_mysql_1 ... done
Removing core-debezium-perf_web_1 ... done
Removing network core-debezium-perf_default

Experiment finished.
```

The CPU and memory utilization samples are shown on stdout and also saved to a file in the
new directory under `logs/`, in this case `logs/2022.03.22.16.18.41_UTC_dh_5000`.
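
The summary lines printed in step 5 have a regular `name=..., tag=..., mean=..., samples=...` shape, so they are easy to post-process. The following is a minimal sketch, not part of this PR, that parses such lines and recomputes the mean from the samples as a sanity check. The function name `parse_summary_line` and the regular expression are assumptions made for illustration; the input lines are copied from the example run above.

```python
import re
from statistics import mean

# Assumed line shape, inferred from the example output above.
LINE_RE = re.compile(
    r"name=(?P<name>[^,]+), tag=(?P<tag>[^,]+), "
    r"mean=(?P<mean>[^,]+), samples=(?P<samples>.+)"
)


def parse_summary_line(line):
    """Parse one summary line into a dict with name, tag, mean and samples."""
    m = LINE_RE.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"unrecognized summary line: {line!r}")
    samples = [float(s) for s in m.group("samples").split(",")]
    return {
        "name": m.group("name"),
        "tag": m.group("tag"),
        "mean": float(m.group("mean")),
        "samples": samples,
    }


# Lines copied from the example run shown earlier.
lines = [
    "name=redpanda, tag=CPU_PCT, mean=84.14, samples=80.0, 84.2, 85.0, 87.0, 85.0, 82.0, 85.0, 84.0, 84.2, 85.0",
    "name=deephaven, tag=CPU_PCT, mean=35.21, samples=66.7, 31.7, 28.0, 31.0, 27.0, 23.0, 46.0, 47.0, 25.7, 26.0",
]

for line in lines:
    rec = parse_summary_line(line)
    # Recompute the mean and compare it with the reported, rounded value.
    recomputed = mean(rec["samples"])
    consistent = abs(recomputed - rec["mean"]) < 0.005
    print(rec["name"], rec["tag"], f"{recomputed:.2f}", "ok" if consistent else "mismatch")
```

The same function could be applied to each line of the saved log file to compare runs at different pageview rates.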

## Manual testing

### Once the compose is running

Both Materialize and Deephaven are running. We can
now make them execute their respective demo scripts.
@@ -54,8 +175,7 @@ a command socket interface for loadgen; see `../demo/README.md`
for instructions.


Tracking the last processed pageview timestamp
==============================================
### Tracking the last processed pageview timestamp

* In DH, the `pageviews_summary` table can help track
the last pageview seen.
@@ -72,8 +192,7 @@ Tracking the last processed pageview timestamp
FROM pageviews_summary;' -U materialize -h localhost -p 6875
```

Memory and CPU requirements
===========================
## Memory and CPU requirements

The parameters used for images in the docker compose file in this
directory are geared towards high message throughput. While Deephaven
5 changes: 5 additions & 0 deletions debezium/perf/dh_run_demo.sh
@@ -0,0 +1,5 @@
#!/bin/bash

set -eu

exec ../../java-client/session-examples/build/install/java-client-session-examples/bin/execute-script --python ../scripts/demo.py
5 changes: 5 additions & 0 deletions debezium/perf/dh_sample_dt.sh
@@ -0,0 +1,5 @@
#!/bin/bash

set -eu

exec ../../java-client/session-examples/build/install/java-client-session-examples/bin/execute-script --python ../scripts/sample_dt.py
2 changes: 2 additions & 0 deletions debezium/perf/docker-compose.yml
@@ -12,11 +12,13 @@ services:
# build: ../../jprofiler-server
environment:
- JAVA_TOOL_OPTIONS=-Xmx${DEEPHAVEN_HEAP} -Ddeephaven.console.type=${DEEPHAVEN_CONSOLE_TYPE} -Ddeephaven.application.dir=${DEEPHAVEN_APPLICATION_DIR}
- PERF_TAG # Names the subdirectory under ./logs where perf sample logs are stored
# For jprofiler sessions (if you tweaked the jprofiler version in jprofiler-server/Dockerfile you need to tweak the path below):
# Then use these startup options:
# - JAVA_TOOL_OPTIONS=-agentpath:/opt/jprofiler13.0/bin/linux-x64/libjprofilerti.so=port=8849,nowait -Xmx4g -Ddeephaven.console.type=${DEEPHAVEN_CONSOLE_TYPE} -Ddeephaven.application.dir=${DEEPHAVEN_APPLICATION_DIR}
volumes:
- ../scripts:/scripts
- ./logs:/logs
# For jprofiler sessions: (change if using different port)
# ports:
# - '8849:8849'
5 changes: 5 additions & 0 deletions debezium/perf/mz_run_demo.sh
@@ -0,0 +1,5 @@
#!/bin/bash

set -eu

exec docker-compose run -T mzcli -f /scripts/demo.sql
25 changes: 25 additions & 0 deletions debezium/perf/mz_sample_dt.sh
@@ -0,0 +1,25 @@
#!/bin/bash

set -eu

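# Use a default expansion: with `set -u`, a bare "$PERF_TAG" would abort
# with "unbound variable" before the message below could be printed.
if [ -z "${PERF_TAG:-}" ]; then
    echo "$0: PERF_TAG environment variable is not defined, aborting." 1>&2
    exit 1
fi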

DATA_TAG="mz_sample_dt"
OUT=logs/${PERF_TAG}/${DATA_TAG}.log

SCRIPT=$(cat <<'EOF'
while true; do
DATE_TAG=$(date -u '+%Y-%m-%d %H:%M:%S%z')
echo -n "$DATE_TAG|"
psql --csv -A -t -f /scripts/sample_dt.sql -U materialize -h materialized -p 6875
sleep 1
done
EOF
)

(nohup docker-compose run -T --entrypoint /bin/bash mzcli -c "$SCRIPT" < /dev/null > "$OUT" 2>&1 &)

exit 0
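
Review note: each iteration of the loop above writes a `date -u` timestamp, a `|` separator, and then whatever `psql` prints for `/scripts/sample_dt.sql`. A minimal sketch of reading such a log back follows, assuming one sample per line; the function name `parse_sample_line` and the example payload `0.42` are hypothetical, since the query output format is not shown in this diff. Because stderr is redirected to the same file, lines that do not fit the pattern are treated as errors the caller may choose to skip.

```python
from datetime import datetime


def parse_sample_line(line):
    """Split 'TIMESTAMP|payload' and parse the timestamp written by `date -u`."""
    stamp, sep, payload = line.rstrip("\n").partition("|")
    if not sep:
        raise ValueError(f"no '|' separator in line: {line!r}")
    # Matches the shell format '+%Y-%m-%d %H:%M:%S%z', e.g. 2022-03-22 16:18:45+0000.
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S%z")
    return when, payload


# Hypothetical log line; the payload after '|' depends on sample_dt.sql.
when, payload = parse_sample_line("2022-03-22 16:18:45+0000|0.42\n")
print(when.isoformat(), payload)
```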
55 changes: 55 additions & 0 deletions debezium/perf/pid_from_cmdline.py
@@ -0,0 +1,55 @@
import argparse
import datetime as dt
import os
import re
import subprocess
import sys

parser = argparse.ArgumentParser(description='Match process command line regex to pid')
parser.add_argument(
    'proc_specs_strs',
    metavar='PROCSPEC',
    type=str, nargs='+',
    help='a string of the form "name:regex" where regex should only match one process in `ps -o command` output')

args = parser.parse_args()

proc_specs = {}
for proc_spec_str in args.proc_specs_strs:
    name, regex_str = proc_spec_str.split(':', maxsplit=1)
    proc_specs[name] = re.compile(regex_str)

ps_lines = subprocess.run(
    ['ps', '-ahxww', '-o', 'pid,command' ],
    stdout=subprocess.PIPE).stdout.decode('utf-8').splitlines()

matches = {}
nmatches = 0
my_pid = f'{os.getpid()}'

for ps_line in ps_lines:
    pid, cmd = ps_line.split(maxsplit=1)
    if pid == my_pid:
        continue
    for name, regex in proc_specs.items():
        if re.search(regex, cmd) is not None:
            prev = matches.get(name, None)
            if prev is not None:
                print(f"{sys.argv[0]}: found more than one match for '{name}': {prev}, {pid}, aborting",
                      file=sys.stderr)
                sys.exit(1)
            matches[name] = pid

for name in proc_specs.keys():
    if matches.get(name, None) is None:
        print(f"{sys.argv[0]}: couldn't find a match for {name}, aborting", file=sys.stderr)
        sys.exit(1)

first = True
for name, pid in matches.items():
    s = f'{name}:{pid}'
    if not first:
        s = ' ' + s
    print(s, end='')
    first = False
print()
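
Review note: the script splits each `name:regex` argument on the first colon only, so the regex itself may contain colons, and it requires exactly one matching process per name. The sketch below isolates that matching logic as pure functions so it can be exercised without calling `ps`. The helper names `parse_proc_specs` and `match_pids`, and the sample command lines, are hypothetical and not part of this PR.

```python
import re


def parse_proc_specs(spec_strs):
    """Turn 'name:regex' strings into {name: compiled_regex}."""
    specs = {}
    for spec in spec_strs:
        # maxsplit=1 keeps any later ':' as part of the regex.
        name, regex_str = spec.split(":", maxsplit=1)
        specs[name] = re.compile(regex_str)
    return specs


def match_pids(specs, ps_lines):
    """Return {name: pid}; raise if a name matches zero or several lines."""
    matches = {}
    for line in ps_lines:
        pid, cmd = line.split(maxsplit=1)
        for name, regex in specs.items():
            if regex.search(cmd):
                if name in matches:
                    raise ValueError(f"more than one match for {name!r}")
                matches[name] = pid
    missing = [name for name in specs if name not in matches]
    if missing:
        raise ValueError(f"no match for {missing}")
    return matches


# Hypothetical `ps -o pid,command` output lines.
ps_lines = [
    "101 /usr/bin/java -cp io.deephaven.Main",
    "202 /usr/bin/rpk redpanda start --kafka-addr 0.0.0.0:9092",
]
specs = parse_proc_specs(["deephaven:java.*deephaven", "redpanda:redpanda start.*:9092"])
print(match_pids(specs, ps_lines))
```

Raising on ambiguous or missing matches mirrors the script's choice to abort rather than sample the wrong process.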
66 changes: 66 additions & 0 deletions debezium/perf/run_experiment.sh
@@ -0,0 +1,66 @@
#!/bin/sh

set -eu

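# Check the argument count first: with `set -u`, expanding "$1" when no
# arguments were given would abort before the usage message is printed.
if [ $# -ne 5 ] || { [ "$1" != 'dh' ] && [ "$1" != 'mz' ]; }; then
    echo "Usage: $0 dh|mz per_second_rate wait_seconds top_samples top_delay_seconds" 1>&2
    exit 1
fi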

engine="$1"
rate_per_s="$2"
wait_s="$3"
top_samples="$4"
top_delay="$5"

echo "About to run an experiment for ${engine} with ${rate_per_s} pageviews/s."
echo
echo "Actions that will be performed in this run:"
echo "1. Start compose services required for ${engine}."
echo "2. Execute demo in ${engine} and setup update delay logging."
echo "3. Set ${rate_per_s} pageviews per second rate."
echo "4. Wait ${wait_s} seconds."
echo "5. Take ${top_samples} samples for mem and CPU utilization, ${top_delay} seconds between samples."
echo "6. Stop and 'reset' (down) compose."
echo
echo "Running experiment."
echo
echo "1. Starting compose."
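# Assign and export separately so a failure in start_perf_run.sh is not
# masked by the exit status of `export` and `set -e` can stop the run.
PERF_TAG=$(./start_perf_run.sh "$engine" "$rate_per_s")
export PERF_TAG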
echo "PERF_TAG=${PERF_TAG}"
echo
echo "Logs are being saved to logs/$PERF_TAG."
echo

echo "2. Running demo in ${engine} and sampling delays."
if [ "$engine" = "mz" ]; then
    ./mz_run_demo.sh
    ./mz_sample_dt.sh
elif [ "$engine" = "dh" ]; then
    ./dh_run_demo.sh
    ./dh_sample_dt.sh
else
    echo "$0: Internal error, aborting." 1>&2
    exit 1
fi
echo

echo "3. Setting pageviews per second"
./set_pageviews_per_second.sh "$rate_per_s"
echo

echo "4. Waiting for $wait_s seconds."
sleep "$wait_s"
echo

echo "5. Sampling top."
./sample_top.sh "$engine" "$top_samples" "$top_delay"
echo

echo "6. Stopping and 'reset' (down) compose."

./stop_all.sh
echo
echo "Experiment finished."

exit 0