-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix radix sort test #34929
Fix radix sort test #34929
Conversation
Take into account the actual size of the type being sorted. Use an union instead of a cast to keep the compiler happy about the aliasing rules.
type bugfix |
please test |
please test with cms-externals/eigen-git-mirror#7 for CMSSW_12_1_X/slc7_amd64_gcc11 |
+heterogeneous |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-34929/24737
|
A new Pull Request was created by @fwyzard (Andrea Bocci) for master. It involves the following packages:
can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
Fixes #34917 . |
+1 |
The |
Ah, sorry Andrea. I saw the message that the tests ended succesfully (together with your "+1" for the heterogeneous category) and I thought they were the ones. |
No problem, testing locally it worked, so I'm optimistic :-) |
-1 Failed Tests: Build HeaderConsistency The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here: BuildI found compilation error when building: >> Compiling edm plugin /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_1_X_2021-08-16-1100/src/L1Trigger/DTTrigger/src/DTTrig.cc >> Compiling edm plugin /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_1_X_2021-08-16-1100/src/L1Trigger/DTTrigger/src/DTTrigProd.cc >> Compiling edm plugin /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_1_X_2021-08-16-1100/src/L1Trigger/DTTrigger/src/DTTrigTest.cc >> Compiling edm plugin /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_1_X_2021-08-16-1100/src/L1Trigger/DTTrigger/src/SealModule.cc /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_1_X_2021-08-16-1100/src/L1Trigger/DTTrigger/src/DTTrigTest.cc: In member function 'virtual void DTTrigTest::beginRun(const edm::Run&, const edm::EventSetup&)': /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_1_X_2021-08-16-1100/src/L1Trigger/DTTrigger/src/DTTrigTest.cc:198:23: error: 'this' pointer is null [-Werror=nonnull] 198 | my_trig->createTUs(iEventSetup); | ~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~ In file included from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_1_X_2021-08-16-1100/src/L1Trigger/DTTrigger/interface/DTTrigTest.h:26, from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_1_X_2021-08-16-1100/src/L1Trigger/DTTrigger/src/DTTrigTest.cc:17: /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_1_X_2021-08-16-1100/src/L1Trigger/DTTrigger/interface/DTTrig.h:78:8: note: in a call to non-static member function 'void DTTrig::createTUs(const edm::EventSetup&)' |
OK, there are other (preexisting, I assume) errors in |
uintT_t<T> u; | ||
} c; | ||
c.t = t; | ||
c.u = c.u >> shift << shift; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure this is actually undefined behavior as you are not supposed to set a union using one type and read the union using another type. I believe the recommend way to do this is to use memcpy
.
uintT_t<T> u;
memcpy(&u, &t, sizeof(u));
u = u >>shift <<shift;
memcpy(&t, &u, sizeof(t));
I played around with that recently on godbolt (for reasons other than this PR) and found compilers know about memcpy and optimize out the calls and just do the 'right' thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think section 11.5.6.3 of the C++ 20 standard demonstrates that it is undefined behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure the union pattern is used elsewhere in CMSSW too, so we should address those as well.
IMHO the C++ standard has kicked itself in a corner with the strict
aliasing rules, and is now clutching at straws.
Personally, I'm fine with undefined behaviour, as long as the compiler does
what I want it to do without complaining.
If you want to propose an implementation based on memcpy and memset, go
ahead - it may even simplify the code avoiding the shifts (but please add
an assert on the endianness of the machine).
.A
|
Looks like C++ 20 is finally trying to address this |
I've done some tests using template <int N, typename T, typename SFINAE = std::enable_if_t<N <= sizeof(T)>>
void truncate1(T& t) {
const int shift = 8 * (sizeof(T) - N);
union {
T t;
uintT_t<T> u;
} c;
c.t = t;
c.u = c.u >> shift << shift;
t = c.t;
} template <int N, typename T, typename SFINAE = std::enable_if_t<N <= sizeof(T)>>
void truncate2(T& t) {
const int shift = 8 * (sizeof(T) - N);
uintT_t<T> u;
std::memcpy(&u, &t, sizeof(T));
u = u >> shift << shift;
std::memcpy(&t, &u, sizeof(T));
} template <int N, typename T, typename SFINAE = std::enable_if_t<N <= sizeof(T)>>
void truncate3(T& t) {
char* bytes = reinterpret_cast<char*>(&t);
static_assert(__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__, "Only little endian architectures are supported");
std::memset(bytes, 0x00, sizeof(T) - N);
} Host compilers (GCC 11.2, clang 12.0) always generate the same code for
Device compilers (NVCC 11.3, clang 11.0
All compilers (host and device) always generate one or more stores of |
Summary
So... keep things as they are for the time being, and revisit once we have C++20 ? |
P.S. my play area on godbolt is here: https://godbolt.org/z/75csPcK1q |
I'd be fine with that. Hopefully we get into C++20 before we need to consider other compilers than gcc or clang for pieces of code using union for type punning (that appears to be guaranteed behavior in gcc, I didn't find quickly what clang guarantees). |
PR description:
Fix the compilation errors in HeterogeneousCore/CUDAUtilities/test/radixSort_t.cu:
PR validation:
gpuRadixSort_t
compiles and runs fine.