Part of #1404: Avoid flushing aggregators in lockstep #1826

ronawho · 2022-10-05T14:41:52Z

When flushing, previously all aggregators would flush to locale 0, then 1 and so on in lockstep. This created a many-to-one bottleneck, especially at scale. This updates aggregators to start flushing to the locale they last aggregated a value for. Keeping track of the last locale should be trivially fast and getting rid of the lockstep comm behavior should improve performance of all aggregated operations, including sort, at higher scales.
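To illustrate the difference in flush order, here is a minimal Python sketch, not the actual Chapel aggregator code. It assumes an idealized model where every aggregator flushes one destination per step, and it assumes (purely for illustration) that the aggregator on locale `i` last aggregated a value for locale `i + 1`. The names `flush_order` and `max_fan_in` are hypothetical helpers, not part of Arkouda or Chapel.

```python
from collections import Counter

def flush_order(num_locales, start=0):
    """Destination locales in the order one aggregator flushes them,
    beginning at `start` and wrapping around."""
    return [(start + i) % num_locales for i in range(num_locales)]

def max_fan_in(orders):
    """Largest number of aggregators targeting the same locale at any step,
    assuming all aggregators advance one destination per step."""
    worst = 0
    for step in zip(*orders):
        worst = max(worst, max(Counter(step).values()))
    return worst

num_locales = 8

# Old behavior: every aggregator starts flushing at locale 0.
lockstep = [flush_order(num_locales) for _ in range(num_locales)]

# New behavior: each aggregator starts at the locale it last aggregated for.
# Assumption for this sketch: aggregator i last touched locale (i + 1) % n.
last_locale = [(i + 1) % num_locales for i in range(num_locales)]
staggered = [flush_order(num_locales, start=s) for s in last_locale]

print(max_fan_in(lockstep))   # 8: all aggregators hit the same locale each step
print(max_fan_in(staggered))  # 1: destinations are spread across locales
```

In practice the last-aggregated locale depends on the data, so start points will not be perfectly distinct, but they are unlikely to all coincide the way a fixed start at locale 0 does.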

I'll have more comprehensive performance results later, but on an older 240-node SGI InfiniBand machine I see a small sort (512 KiB per node) go from ~1.6s to ~0.7s and a medium sort (512 MiB per node) go from ~2.5s to ~1.8s.

Part of #1404

Ethan-DeBandi99

Looks good

Avoid flushing aggregators in lockstep [reviewed by @benharsh]

When flushing, previously all aggregators would flush to locale 0, then 1 and so on in pseudo lockstep. This created a many-to-one bottleneck, especially at scale. This updates aggregators to start flushing to the locale they last aggregated a value for. Keeping track of the last locale is trivially fast and getting rid of the lockstep comm behavior improves performance, particularly at scale.

This change was made to Arkouda's aggregators in Bears-R-Us/arkouda#1826 and this PR effectively pulls in an "upstream" change. Eventually we just want Arkouda to use Chapel's aggregators, but there have been frequent enough changes to aggregators that it's been hard to time.

ronawho changed the title ~~Avoid flushing aggregators in lockstep~~ Part of #1404: Avoid flushing aggregators in lockstep Oct 5, 2022

stress-tess requested review from stress-tess, Ethan-DeBandi99 and mhmerrill October 5, 2022 18:30

stress-tess approved these changes Oct 5, 2022

Ethan-DeBandi99 approved these changes Oct 6, 2022

stress-tess merged commit d60cb36 into Bears-R-Us:master Oct 6, 2022

ronawho deleted the avoid-lockstep-agg-flush branch October 13, 2022 20:12

ronawho mentioned this pull request Nov 1, 2022

Reduce sort overheads #1404

Open

ronawho mentioned this pull request Nov 29, 2022

Avoid flushing aggregators in lockstep chapel-lang/chapel#21106

Merged
