[kokkos] serial backend performance fix #338

Merged: 1 commit, Mar 16, 2022

markdewing
Contributor

The Kokkos serial backend is slower than the plain serial code - see #297 for more details.

One cause is addressed here: the Kokkos backend loops over the entire array (size 1024) rather than over the number of clusters (about 10).
The same loop also sets ok and newclusId to zero. Both are assigned later, so these initializations can be removed.
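
To illustrate the change described above, here is a minimal plain C++ sketch, not the actual Kokkos kernel from this PR. Only the names ok and newclusId and the sizes (1024 slots, about 10 clusters) come from the description; the struct, the charge array, the element types, and both function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the per-module scratch arrays; only the names
// ok and newclusId (and the capacity of 1024) come from the PR description.
constexpr std::size_t kMaxClusters = 1024;  // fixed array capacity

struct ClusterScratch {
  std::array<std::int32_t, kMaxClusters> charge{};      // assumed accumulator
  std::array<std::uint8_t, kMaxClusters> ok{};
  std::array<std::uint16_t, kMaxClusters> newclusId{};
};

// Before (sketch): visits every slot of the fixed-size arrays and also zeroes
// ok and newclusId, even though both are overwritten later.
// Returns the number of loop iterations performed.
inline std::size_t resetWholeArray(ClusterScratch& s) {
  std::size_t iterations = 0;
  for (std::size_t i = 0; i < kMaxClusters; ++i) {
    s.charge[i] = 0;
    s.ok[i] = 0;         // redundant: assigned later
    s.newclusId[i] = 0;  // redundant: assigned later
    ++iterations;
  }
  return iterations;
}

// After (sketch): visits only the first nclus slots and clears only the
// array that actually needs a zero starting value.
inline std::size_t resetUsedClusters(ClusterScratch& s, std::size_t nclus) {
  assert(nclus <= kMaxClusters);
  std::size_t iterations = 0;
  for (std::size_t i = 0; i < nclus; ++i) {
    s.charge[i] = 0;
    ++iterations;
  }
  return iterations;
}
```

With about 10 clusters per module, the second loop runs about 10 iterations where the first runs 1024, and it writes one array instead of three.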

The Kokkos serial backend is slower than the plain serial code.
One cause is here, where the Kokkos backend loops over the entire array
(size 1024) rather than the number of clusters (about 10).
This loop also sets ok and newclusId to zero.  Those get set later and
these assignments can be removed.
@makortel
Collaborator

On a Xeon Gold 5220 I'm seeing ~4 % improvement with one kokkos --serial process. I first saw the opposite effect on Cori, but found a mistake in my testing setup. Once I get consistent results there I'll merge this PR.

@makortel
Collaborator

On Cori I see ~5 % improvement with kokkos --serial when running one process on an otherwise empty node.
[Figure: kokkos_serial_throughput]

The 1-thread case is now within 4 % of the serial program.

On a fully loaded socket the improvement is smaller but still clearly visible.

[Figure: fullsocket_kokkos_serial_throughput]

@makortel makortel merged commit 1f9bed6 into cms-patatrack:master Mar 16, 2022
@markdewing markdewing deleted the kokkos_serial_fix branch March 17, 2022 20:25