Merge develop branch into master for upcoming 1.0.x release #54
g++ guarding grid_launch
removed grid launch constructor to remove runtime errors
Instead of hardcoding the HSA_AMDGPU_GPU_TARGET at compile time, autodetect it at runtime from the KFD topology. Change-Id: I00af68084869ab4d439e70cf8816c1c8868f224d
[CMake] Autodetect HSA_AMDGPU_GPU_TARGET
Use new workitem intrinsics + range metadata, correct some attributes on functions, and canonicalize. Correct range metadata to be maximum theoretical workgroup size. Change-Id: I9dedbe2dd62753858ccd0eb7841e228873a2c031
Cleanup wrapper IR functions
Support compiling code that uses restrict for other purposes.
Need to move this code so we can re-enable the optimization.
This improves hcc runtime performance when a program uses multiple kernels.
This reverts commit cb0f883.
use FNV-1a for kernel indexing instead of md5
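For readers unfamiliar with FNV-1a, here is a minimal sketch of the 64-bit variant. It is not taken from the hcc sources: the function name `fnv1a_64` and the choice of the 64-bit width are assumptions made for illustration, and the commit may use a different width or hash the kernel binary rather than a string.

```cpp
#include <cstdint>
#include <string>

// Illustrative 64-bit FNV-1a (hypothetical helper, not the hcc implementation).
// For each input byte: XOR the byte into the hash, then multiply by the FNV prime.
inline std::uint64_t fnv1a_64(const std::string& data) {
    std::uint64_t hash = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
    for (unsigned char byte : data) {
        hash ^= byte;                // fold the next byte into the low bits
        hash *= 1099511628211ULL;    // FNV 64-bit prime; wraps modulo 2^64
    }
    return hash;
}
```

Unlike md5, this needs only one XOR and one multiply per byte and produces a fixed-width integer that can be used directly as a map key, at the cost of weaker collision resistance.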
Use lit config variables which would be initialized at cmake configuration time.
Use @llvm.readcyclecounter() intrinsic, which would be lowered to s_memtime GCN ISA. This fixes one failing hcc unit test (HSAIL/clock.cpp).
A new unit test is introduced to check API hc::__cycle_u64()
As there is no corresponding HSAIL instruction, this function always returns 0 for the HSAIL backend.
s_memrealtime keeps a constant clock frequency and is not affected by DPVS
Implement with s_memtime ISA.
Define const member function if needed
erase empty elements to prevent the map size growing
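The general pattern behind this kind of fix can be sketched as follows. The `Registry` type and `remove_entry` function are hypothetical names chosen for illustration; the actual map and element types in hcc are not shown in this pull request.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical container: each key maps to a bucket of values.
using Registry = std::map<std::string, std::vector<int>>;

// Remove `value` from the bucket stored under `key`. Once the bucket is
// empty, erase the key as well so the map does not accumulate empty entries.
inline void remove_entry(Registry& registry, const std::string& key, int value) {
    auto it = registry.find(key);
    if (it == registry.end()) {
        return;  // nothing stored under this key
    }
    auto& bucket = it->second;
    bucket.erase(std::remove(bucket.begin(), bucket.end(), value), bucket.end());
    if (bucket.empty()) {
        registry.erase(it);  // drop the now-empty element
    }
}
```

Without the final `erase`, a long-running program that keeps adding and removing entries under fresh keys would leave one empty bucket behind per key.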
This reverts commit 705d2c2.
Conflicts: lib/hsail-amdgpu-wrapper.ll
shfl implementation for LC
…hcc_backend__ macro
New ROCm KFD has solved race condition issues. Increase test threads from 2 to 8.
This fixes 2 failing unit tests: memcpy_symbol1 and memcpy_symbol3.
The original size was 104, which was too small for different kernel code objects with nearly identical kernel names. Change it to 512 to fix failing unit tests.
Fix unit tests that do not take into account that not all hc::accelerator instances are necessarily HSA agents.
Major features introduced:
With the HSAIL backend, there is only one failing unit test now. @scchan will take care of it.
Failing Tests (1):
CPPAMP :: Unit/HSAIL/shfl_xor.cpp