
Fix performance regression in reductions benchmark #3874

Merged · 4 commits · Oct 29, 2024

Conversation

@jeremiah-corrado (Contributor) commented Oct 25, 2024

Fix a performance regression caused by recent refactoring of ReductionMsg.

The refactor made total reductions (i.e., reducing an array to a single scalar value) return that scalar via a 1-element array. This resulted in much cleaner code, but added overhead in the total-reduction case. This PR moves total reductions into separate commands so that they can return a scalar directly.
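
For illustration, a minimal sketch of the difference in plain Python/numpy (the function names are hypothetical, not Arkouda's actual handlers):

```python
import numpy as np

# Before: the total reduction went through the general array-reduction path,
# boxing the scalar in a 1-element array that then had to be unwrapped.
def total_sum_via_array(arr: np.ndarray):
    boxed = np.array([arr.sum()])  # extra array creation and transfer
    return boxed[0]

# After: a dedicated total-reduction command returns the scalar directly.
def total_sum_direct(arr: np.ndarray):
    return arr.sum()
```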

Also fixes a bug in register_commands.py, where any unrecognized return type was being treated as a symbol-table entry.
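
A hedged sketch of the kind of bug described (the names and structure here are illustrative, not the actual register_commands.py code):

```python
# Hypothetical return-type dispatch in a command-registration generator.
SCALAR_TYPES = {"int64", "float64", "bool"}

def gen_return_stmt(return_type: str) -> str:
    if return_type in SCALAR_TYPES:
        # Scalars can be serialized into the reply message directly.
        return "return MsgTuple.fromScalar(result);"
    elif return_type == "array":
        # Arrays are stored in the symbol table and returned by name.
        return "return st.insert(result);"
    else:
        # Before the fix, unrecognized types fell through to the
        # symbol-table branch; failing loudly is the safer behavior.
        raise ValueError(f"unrecognized return type: {return_type}")
```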

@ajpotts (Contributor) commented Oct 28, 2024

It looks like the benchmark is failing because prod does not handle overflow; numpy seems to wrap the overflowed result around to 0.

I get the following output from the failing case when running ./benchmarks/reduce.py --correctness-only localhost 5555:

op: prod
npa: [   1    2    3 ... 9997 9998 9999]
random: False
dtype: int64
SEED: None
npr: 0
akr: -9223372036854775808
np.isclose(npr, akr): False
Traceback (most recent call last):
  File "/home/amandapotts/git/arkouda/./benchmarks/reduce.py", line 176, in <module>
    check_correctness(dtype, args.randomize, args.seed)
  File "/home/amandapotts/git/arkouda/./benchmarks/reduce.py", line 123, in check_correctness
    assert np.isclose(npr, akr)
           ^^^^^^^^^^^^^^^^^^^^
AssertionError
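
The wrap-to-zero result is reproducible in plain numpy: int64 arithmetic wraps modulo 2^64, and the product of 1..9999 contains far more than 64 factors of 2, so its low 64 bits are all zero (a minimal demonstration, independent of Arkouda):

```python
import math
import numpy as np

# numpy int64 multiplication silently wraps modulo 2**64; 9999! has
# thousands of factors of 2, so the wrapped product is exactly 0.
a = np.arange(1, 10000, dtype=np.int64)
print(np.prod(a))  # 0

# Cross-check with Python's arbitrary-precision integers:
print(math.prod(range(1, 10000)) % 2**64)  # 0
```

Since multiplication modulo 2^64 is order-independent, Arkouda's INT64_MIN result presumably comes from a different intermediate representation rather than from integer wrap-around.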



@jeremiah-corrado (Contributor, Author) commented

Thanks for pointing that out! I'll see if I can fix it.

Signed-off-by: Jeremiah Corrado <[email protected]>
@jeremiah-corrado (Contributor, Author) commented

I'm seeing the following performance for the reduce benchmark (on the same system where the linked graph is generated):

numLocales = 16, N = 1,600,000,000
sum = 1279999999200000000
  sum Average time = 0.0070 sec
  sum Average rate = 1696.42 GiB/sec
prod = 0
  prod Average time = 0.0069 sec
  prod Average rate = 1722.76 GiB/sec
min = 1
  min Average time = 0.0068 sec
  min Average rate = 1748.09 GiB/sec
max = 1599999999
  max Average time = 0.0067 sec
  max Average rate = 1775.76 GiB/sec
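
A quick sanity check of these figures (assuming the benchmark reports rate as 8-byte elements scanned per second and that the input is effectively arange(N); my arithmetic, not the benchmark's code):

```python
N = 1_600_000_000

# The reported sum matches the closed form for sum(range(N)):
print(N * (N - 1) // 2)        # 1279999999200000000

# The reported rate is consistent with bytes-scanned over elapsed time:
print(N * 8 / 2**30 / 0.0070)  # ~1703 GiB/sec, close to the reported 1696
```

Note also that prod = 0 now agrees with numpy's wrapped result discussed above.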

This is around 10% lower than the results we've seen historically. That could be attributable to a difference in system configuration, but it could also indicate that some other (more minor) performance fixes are needed. I'd propose we merge this PR and see how the nightly performance testing graph responds before looking into it further.

@jeremiah-corrado jeremiah-corrado marked this pull request as ready for review October 28, 2024 17:26
@stress-tess (Member) left a comment

looks good to me!

@ajpotts (Contributor) left a comment

Thanks for doing this!

@ajpotts ajpotts added this pull request to the merge queue Oct 29, 2024
Merged via the queue into Bears-R-Us:master with commit 879192f Oct 29, 2024
11 checks passed
@jeremiah-corrado jeremiah-corrado deleted the reduction-perf-fix branch October 30, 2024 15:04
stonea added a commit to chapel-lang/chapel that referenced this pull request Oct 30, 2024
Adds an annotation for a recent Arkouda reduction performance
regression:

- initial regression caused by:
Bears-R-Us/arkouda#3845
- resolved by: Bears-R-Us/arkouda#3874

[Reviewed by nobody; annotations update]