Increases benchmark warm-up and decreases the threshold value #268
Conversation
A threshold of 0.5 still allows a 50% performance regression, right? That's still not going to be low enough for our needs. How did you select 150 for warm-ups/iterations? Are you seeing diminishing returns in terms of reduced variance as you increase the warm-ups/iterations, or does the regression detection workflow just start taking too long?
If 0.5 doesn't satisfy our needs, we need to consider other improvements as well. There is an error when running 200+ warm-ups - actions/runner-images#6680. The GitHub runner only allocates fixed resources (e.g., memory) to jobs, and running more than 200 times makes the job cancel automatically. Additionally, the relative-difference results are usually < 0.3, but sometimes one sample file (real_worlds_data_1) fails with a 0.5+ perf regression (and if the workflow fails, it's always this file). One possible solution to enable more iterations and warm-ups is using a larger runner - https://docs.github.com/en/actions/using-github-hosted-runners/using-larger-runners
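For illustration, here is a minimal sketch of the kind of relative-difference check being discussed. The baseline/candidate numbers and the helper name are placeholders, and the 0.2 threshold is simply the value this PR eventually settles on; this is not the regression-detection workflow's actual code.

```python
# Hypothetical sketch of a relative-difference threshold check; all values are placeholders.
BASELINE_NS = 1_000_000    # mean time per operation on the target (main) commit
CANDIDATE_NS = 1_150_000   # mean time per operation on the PR commit
THRESHOLD = 0.2            # fail if the PR is more than 20% slower

def relative_difference(baseline: float, candidate: float) -> float:
    """Relative slowdown of the candidate vs. the baseline (positive = regression)."""
    return (candidate - baseline) / baseline

diff = relative_difference(BASELINE_NS, CANDIDATE_NS)
if diff > THRESHOLD:
    raise SystemExit(f"Performance regression detected: {diff:.2%} > {THRESHOLD:.0%}")
print(f"OK: relative difference {diff:.2%} is within the {THRESHOLD:.0%} threshold")
```

Increasing warm-ups/iterations only reduces the variance of the two measured means; the check itself stays this simple.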
Found a small memory leak in each
Confirmed that
Fixed the memory leak for read and it worked - 9858b44, but it still failed for large execution counts. Investigating.
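As an aside for anyone reproducing this kind of diagnosis: a generic way to confirm per-iteration memory growth is to compare tracemalloc snapshots across repeated runs. This is not the project's actual tooling, and `run_benchmark_once` below is a hypothetical stand-in for whatever the benchmark executes each iteration.

```python
import tracemalloc

def run_benchmark_once():
    # Hypothetical stand-in for a single benchmark iteration
    # (e.g., reading one of the sample Ion files).
    return sum(range(10_000))

tracemalloc.start()
before = tracemalloc.take_snapshot()
for _ in range(200):
    run_benchmark_once()
after = tracemalloc.take_snapshot()

# Call sites whose retained memory grew the most across 200 iterations;
# steady growth here usually points at the leak.
for stat in after.compare_to(before, "lineno")[:5]:
    print(stat)
```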
@@ -12,7 +12,7 @@ jobs:
     name: Detect Regression
     needs: PR-Content-Check
     if: ${{ needs.PR-Content-Check.outputs.result == 'pass' }}
-    runs-on: ubuntu-latest
+    runs-on: macos-latest # ubuntu-latest
In general, we should only use macOS runners if we need to test specifically that something works correctly on macOS or we're distributing a macOS-specific binary.
What's the rationale for running on macOS here?
I'm trying to run more warm-ups and iterations for the benchmark-cli command in order to generate a consistent threshold value that helps us identify performance regressions. However, a command with 150+ executions hits the resource limits of the GHA Linux runner.
Which resource limit? Is it causing the job to time out?
One of the workflows runs into this issue - actions/runner-images#6680
I found another issue that causes the workflow to generate an incorrect threshold value. After fixing it, the threshold value seems more stable. Ideally, though, we will eventually benchmark performance on all popular platforms.
I changed the target commit to the main branch so we don't have to create a new PR for that. In addition, I
This PR
(1) Addressed a memory leak issue - link
(2) Experimented with the GHA runners' resource limits and switched to macOS for 1k iterations and warm-ups - link
(3) Added a method for Ion binary/text conversion - link (see the conversion sketch right after this list)
(4) Fixed a comparison result mismatch issue by sorting the combination list - link (see the sorting sketch at the end of this description)
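As a hedged illustration of the binary/text conversion in item (3), here is a round trip using ion-python's amazon.ion.simpleion module. This may not be the exact method the PR adds; the sample value is made up.

```python
from amazon.ion import simpleion

# A made-up Ion text value; parse it, re-encode as binary Ion, and read it back.
text_ion = '{ name: "example", values: [1, 2, 3] }'

value = simpleion.loads(text_ion)
binary_ion = simpleion.dumps(value, binary=True)   # bytes, starting with the Ion BVM (E0 01 00 EA)
round_tripped = simpleion.loads(binary_ion)

print(simpleion.dumps(round_tripped, binary=False))  # back to Ion text
```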
The threshold is set to 0.2 for now; we will work on improving it in the future. The latest performance detection run failed due to variance.
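And to illustrate the fix in item (4): if parameter combinations are generated in a non-deterministic order, the baseline and candidate results can get paired up incorrectly, so sorting both combination lists makes the pairing stable. The parameter names below are illustrative, not the benchmark's real options.

```python
from itertools import product

# Illustrative parameter axes; the real benchmark options will differ.
formats = {"ion_binary", "ion_text"}   # set iteration order is not guaranteed
apis = {"load_dump", "streaming"}

# Sorting both combination lists gives an identical ordering on both sides, so
# index i in the baseline results always matches index i in the candidate results.
baseline_combos = sorted(product(formats, apis))
candidate_combos = sorted(product(formats, apis))

for base, cand in zip(baseline_combos, candidate_combos):
    assert base == cand
```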
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.