Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Map device built-ins to compiler built-ins #763

Merged
merged 5 commits into from
Feb 21, 2024
Merged

Map device built-ins to compiler built-ins #763

merged 5 commits into from
Feb 21, 2024

Conversation

linehill
Copy link
Collaborator

@linehill linehill commented Jan 26, 2024

Map a set of HIP built-ins referenced by HeCBench benchmarks to compiler built-ins. The myocyte-hip benchmark seems to benefit from this the most - most likely due to:

  • HIP built-in calls with compile time values which are constant-folded.

  • HIP built-in calls which are simplified such as pow(x, 2) --> x*x.

@linehill linehill marked this pull request as ready for review February 2, 2024 13:42
Copy link
Collaborator

@pvelesko pvelesko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some questions & small things

bitcode/c_to_opencl.def Outdated Show resolved Hide resolved
Comment on lines 107 to +108

if(CHIP_ENABLE_NON_COMPLIANT_DEVICELIB_CODE)
# add support for calling certain LLVM builtins that are not supported by HIP/CUDA
add_custom_command(
OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/BC/c_to_opencl.bc"
DEPENDS "${CMAKE_CURRENT_SOURCE_DIR}/c_to_opencl.c"
COMMAND "${CMAKE_CXX_COMPILER}" ${BITCODE_C_COMPILE_FLAGS}
-o "${CMAKE_CURRENT_BINARY_DIR}/BC/c_to_opencl.bc"
-c "${CMAKE_CURRENT_SOURCE_DIR}/c_to_opencl.c"
COMMENT "Building c_to_opencl.bc"
VERBATIM)
list(APPEND DEPEND_LIST "${CMAKE_CURRENT_BINARY_DIR}/BC/c_to_opencl.bc")
endif()
add_custom_command(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we getting rid of the option? These are still non-compliant, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we getting rid of the option?

These are required for using compiler built-ins (see c_to_opencl.def for explanation).

These are still non-compliant, right?

Non-compliant respect to what?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So why not keep the option? In atomics.hh we still rely on this option:

#ifdef CHIP_ENABLE_NON_COMPLIANT_DEVICELIB_CODE
// At least rocPRIM tests call the unsigned long variant although it's not
// listed in the user manual. Annoyingly, size_t is typically defined as
// unsigned long.
// FIXME: We should check that unsigned long is 64bits for the host.
extern "C++" inline __device__ unsigned long atomicAdd(unsigned long *address,
                                                       unsigned long val) {
  return __chip_atomic_add_l(address, val);
}
#endif

Non-compliant respect to what?

#700

"when CHIP_ENABLE_NON_COMPLIANT_DEVICELIB_CODE` was defined, which brought in the atomicAdd() for longs (see 71b5c82) which is not defined in the HIP spec."

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the concern here is that removing the option in the bitcode library would expose non-compliant functions to the users, that shouldn’t be happening because they are not declared in the HIP headers. Therefore, removing the CHIP_ENABLE_NON_COMPLIANT_DEVICELIB_CODE option check shouldn't have negative consequences, AFAICS.

bitcode/devicelib.cl Show resolved Hide resolved
Comment on lines 849 to 857

#ifdef CHIP_ENABLE_NON_COMPLIANT_DEVICELIB_CODE

// See c_to_opencl.def for details.
#define DEF_UNARY_FN_MAP(NAME_, TYPE_) \
TYPE_ MAP_PREFIX##NAME_(TYPE_ x) { return NAME_(x); }
#define DEF_BINARY_FN_MAP(NAME_, TYPE_) \
TYPE_ MAP_PREFIX##NAME_(TYPE_ x, TYPE_ y) { return NAME_(x, y); }
#define MAP_PREFIX __chip_c2ocl_
#define DEF_UNARY_FN_MAP(FROM_FN_, TO_FN_, TYPE_) \
TYPE_ __chip_c2ocl_##FROM_FN_(TYPE_ x) { return TO_FN_(x); }
#define DEF_BINARY_FN_MAP(FROM_FN_, TO_FN_, TYPE_) \
TYPE_ __chip_c2ocl_##FROM_FN_(TYPE_ x, TYPE_ y) { return TO_FN_(x, y); }
#include "c_to_opencl.def"
#undef UNARY_FN
#undef BINARY_FN
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remind me why we have this here? If we already have this in c_to_opencl.c which we compile and then link into the final .bc?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason is explained in the c_to_opencl.def.

bitcode/c_to_opencl.c Show resolved Hide resolved
And prepare for follow-up patch where HIP built-ins are mapped to
compiler built-ins. Some of the compiler built-ins may be lowered to C
math calls by the compiler which the cmath->OpenCL mapping takes care
of.
Map a set of HIP built-ins referenced by HeCBench benchmarks to
compiler built-ins. The myocyte-hip benchmark seems to benefit from
this the most - most likely due to:

* HIP built-in calls with compile time values which are constant-folded.

* HIP built-in calls which are simplified such as `pow(x, 2) --> x*x`.
Also, just in case mark all HIP->compiler built-in functions 'static
inline'.
@pvelesko
Copy link
Collaborator

rebased

@pvelesko pvelesko merged commit 5213ad6 into main Feb 21, 2024
27 checks passed
@pvelesko pvelesko deleted the use-builtins branch February 21, 2024 06:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants