Skip to content
This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Implement atomic_ref<Integral> #203

Merged
merged 2 commits into from
Oct 2, 2021
Merged

Implement atomic_ref<Integral> #203

merged 2 commits into from
Oct 2, 2021

Conversation

wmaxey
Copy link
Member

@wmaxey wmaxey commented Sep 9, 2021

Changelog Summary

Builds upon previous atomic refactor to create implementations of cuda::atomic_ref and cuda::std::atomic_ref

Related Work

#179 - Refactor included back-end pathways for atomic_ref types.

Details

This implementation conforms to the C++20 specification of atomic_ref.

Deviations

  • cuda::std::atomic_ref is backported to C++11.
  • cuda::atomic_ref has fetch_min/fetch_max overloads

Motivation

atomic_ref provides a modern abstraction for visibility and ordering of reads and writes to memory according to the C++ memory model. This interface can be used to replace uses of CUDA specific atomicOperation(_Scope) functions in device code and provides interoperability with host code.

Previously:

__global__ atomic_kernel(int *out) {
  // Calculate some work per thread
  int bucket = threadIdx.x%16; 
  // Threads maybe have a different bucket, e.g. when computing a histogram
  atomicAdd(out+bucket, result); // Can't use `atomic<int>` on raw pointers :(
}

With atomic_ref:

#include <cuda/atomic>

__global__ atomic_kernel(int *out) {
  // Calculate some work per thread
 cuda::atomic_ref<int, cuda::thread_scope_block> bucket(out[threadIdx.x%16]);

  // Each thread may have a different bucket, e.g. when computing a histogram
 bucket += result;
}

Design

Builds upon existing atomic pathways to create the atomic_ref front end. There were several changes required to properly cast into the correct lower level interface.

Tests for atomic_ref were replicated out of other atomic baseline tests, these provide coverage for both pointer and integral operations.

Host only libcxx tests were also added to provide coverage from a host standard library view and to ensure that the implementation is sound.

Testing

Virtuals:

New tests have been added that exercise all interfaces of atomic_ref

@wmaxey wmaxey requested a review from griwes September 10, 2021 00:37
@jrhemstad
Copy link
Collaborator

This implementation conforms to the C++20 specification of atomic_ref.

What are we doing about padding bits? std::atomic_ref explicitly states it only compares the value representations.

@wmaxey wmaxey force-pushed the feature/atomic_ref branch 2 times, most recently from 3c22b51 to 38ad966 Compare September 16, 2021 00:05
Copy link
Collaborator

@griwes griwes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me; it's pretty much what I'd expect this to look like after the earlier refactor.

This probably needs a number of more tests, for I don't think all of them are crucial. What is definitely crucial is ref-to-pointer tests in the libc++ layer (similar to what you have for integral_ref already). Other than that, ship it once it's nicely squashed and tested!

include/cuda/std/atomic Outdated Show resolved Hide resolved
libcxx/include/support/atomic/cxx_atomic.h Outdated Show resolved Hide resolved
@wmaxey wmaxey marked this pull request as ready for review October 2, 2021 18:27
@wmaxey wmaxey force-pushed the feature/atomic_ref branch 2 times, most recently from 03ee55e to 4b77f5c Compare October 2, 2021 18:38
@wmaxey wmaxey merged commit 37802be into main Oct 2, 2021
@wmaxey wmaxey deleted the feature/atomic_ref branch October 2, 2021 18:43
@wmaxey wmaxey changed the title WIP: Implement atomic_ref Implement atomic_ref<Integral> Oct 2, 2021
@wmaxey wmaxey changed the title Implement atomic_ref<Integral> Implement atomic_ref<Integral> Oct 2, 2021
@wmaxey wmaxey added this to the 1.7.0 milestone Oct 5, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants