UCT/ROCM: fix memory type detection #2

Merged: 19 commits, May 3, 2022

Collaborator

@edgargabriel edgargabriel commented Apr 29, 2022

What

Fix the approach used to identify the ROCm memory type.

Why ?

The current detection algorithm led to problems with the rocm/ipc component: memory allocations that cannot provide an IPC handle were still being directed to this component.

How ?

A pointer is classified as ROCm memory only if its pointer type is HSA_EXT_POINTER_TYPE_HSA and its owner agent is a GPU.
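
The rule above can be sketched as a small predicate. This is a minimal illustration, not the code from this PR: the `sketch_*` types and names are hypothetical stand-ins so the snippet compiles without the ROCm headers. In a real build the two inputs would presumably come from `hsa_amd_pointer_info()` (pointer type and owner agent) and `hsa_agent_get_info()` with `HSA_AGENT_INFO_DEVICE` (device type of the owner).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for values reported by the HSA runtime.
 * They only model the inputs of the decision; they are not UCX or HSA types. */
typedef enum {
    SKETCH_PTR_UNKNOWN, /* pointer not known to the runtime */
    SKETCH_PTR_HSA,     /* models HSA_EXT_POINTER_TYPE_HSA */
    SKETCH_PTR_LOCKED,  /* models host memory locked for device access */
    SKETCH_PTR_IPC      /* models memory mapped from an IPC handle */
} sketch_ptr_type_t;

typedef enum {
    SKETCH_DEV_CPU,
    SKETCH_DEV_GPU
} sketch_dev_type_t;

typedef struct {
    sketch_ptr_type_t ptr_type;  /* type of the allocation */
    sketch_dev_type_t owner_dev; /* device type of the owner agent */
} sketch_ptr_info_t;

/* Returns true only when BOTH conditions hold: the allocation is of the
 * HSA pointer type and its owner agent is a GPU. Everything else (locked
 * host memory, CPU-owned allocations, unknown pointers) is not treated as
 * ROCm memory. */
static bool sketch_is_rocm_memory(const sketch_ptr_info_t *info)
{
    if (info == NULL) {
        return false;
    }

    return (info->ptr_type == SKETCH_PTR_HSA) &&
           (info->owner_dev == SKETCH_DEV_GPU);
}
```

Requiring both conditions is what keeps allocations without an IPC handle, such as locked host memory, away from the rocm/ipc component.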

Artemy-Mellanox and others added 5 commits April 28, 2022 13:00
- fixed build io-demo on Ubuntu 20.04
- on some Ubuntu distro autotools don't follow libs dependency
UCP/CORE/GTEST: Fix not deallocating memory from ucp_mem_unmap if no rcache
UCT/IB/MLX5: Fix 2G aligned MR registration
yosefe and others added 13 commits April 30, 2022 14:10
Some code blocks or function calls are considered slow-path, so the
small overhead of compiled-in conditional profiling is worth the
ease-of-use (no need to build UCX with profiling enabled). For example,
memory registration, memory pool chunks allocations, etc.
…-profile-xx-always

UCS/PROFILE: Add UCS_PROFILE_xx_ALWAYS macros, enabled in release mode
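
As a rough illustration of the "compiled-in conditional profiling" idea in the commit message above, the sketch below always compiles the profiling hook and decides at run time whether to record anything. All names (`SKETCH_PROFILE_CODE_ALWAYS`, `sketch_profile_*`) are hypothetical and are not the UCX macros; the timing source (`clock()`) is also just an assumption for the example.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical runtime switch and counters; not part of UCX. */
static bool          sketch_profile_enabled = false;
static unsigned long sketch_profile_samples = 0;
static clock_t       sketch_profile_ticks   = 0;

/* Always compiled in, even in release builds. When profiling is disabled
 * the only cost on this slow path is one branch on a flag. */
#define SKETCH_PROFILE_CODE_ALWAYS(_code) \
    do { \
        if (sketch_profile_enabled) { \
            clock_t _sketch_start = clock(); \
            _code; \
            sketch_profile_ticks += clock() - _sketch_start; \
            sketch_profile_samples++; \
        } else { \
            _code; \
        } \
    } while (0)
```

A slow-path call such as memory registration could be wrapped as `SKETCH_PROFILE_CODE_ALWAYS(status = register_memory(...))`, where `register_memory` is again a placeholder name.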
UCT/IB: Fix log_ack_req_freq field initialization
…evice-index-if

UCT/SELF: Don't add device index if have only one device
Make sure operation attributes, especially UCP_OP_ATTR_FLAG_MULTI_SEND,
would be passed to rendezvous protocol initialization and selection
while processing RTS/RTR messages.
GTEST: Silence use-after-free Fedora 36 GCC error
…ated-functions-to-ucp

UCP/API: Move deprecated functions to ucp_compat.h
…tr-mask-to

UCP/PROTO: Add op_attr_mask to rendezvous protocol selection
fix the approach used to identify ROCm memory type. ROCm memory type
is as of right now of type HSA_EXT_POINTER_TYPE_HSA with the owner agent being a GPU.
@edgargabriel edgargabriel force-pushed the topic/memtype_detection_fix branch from 4df031b to 6efdbb4 Compare May 3, 2022 20:58
@edgargabriel edgargabriel merged commit f19386b into ROCm:develop May 3, 2022
edgargabriel pushed a commit that referenced this pull request May 2, 2023

7 participants