Readd, optimize and profile memcpy_async-based transform kernel for A100 #2361

Open
bernhardmgruber opened this issue Sep 4, 2024 · 0 comments · May be fixed by #2394
Labels
cub For all items related to CUB

Comments

@bernhardmgruber
Contributor

bernhardmgruber commented Sep 4, 2024

The initial PR for cub::DeviceTransform (#2086) had a dedicated kernel for A100 that used cg::memcpy_async. This kernel was removed during review to make way for the performance-validated H100 kernel. We should properly validate the performance of the cg::memcpy_async kernel and bring it back.
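For readers unfamiliar with the approach, here is a minimal sketch of what a cg::memcpy_async-based transform kernel can look like. It is not the kernel from #2086: the names `transform_memcpy_async`, `times_two`, `check_transform`, and the `tile_size` value are hypothetical and chosen only for illustration. Each block stages one tile of the input in shared memory with `cg::memcpy_async`, waits for the copy, then applies the functor. On A100 (sm_80) and newer the copy can map to hardware asynchronous copies when alignment allows; on older architectures it falls back to a regular copy.

```cuda
#include <cooperative_groups.h>
#include <cooperative_groups/memcpy_async.h>
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

namespace cg = cooperative_groups;

// Hypothetical tile size; the tuning of the real kernel is not stated in this issue.
constexpr int tile_size = 256;

// Hypothetical element-wise operation used only for this sketch.
struct times_two {
  __device__ float operator()(float x) const { return 2.0f * x; }
};

template <typename T, typename F>
__global__ void transform_memcpy_async(const T* in, T* out, int n, F f) {
  __shared__ T tile[tile_size];
  auto block = cg::this_thread_block();

  const int offset = blockIdx.x * tile_size;
  if (offset >= n) return;  // uniform across the block, so no thread skips the collectives
  const int valid = min(tile_size, n - offset);

  // Collective copy of this block's tile from global to shared memory (size in bytes).
  cg::memcpy_async(block, tile, in + offset, sizeof(T) * valid);
  cg::wait(block);  // wait until the staged data is visible to every thread in the block

  for (int i = threadIdx.x; i < valid; i += blockDim.x) {
    out[offset + i] = f(tile[i]);
  }
}

// Host-side helper: runs the kernel on n floats (n > 0) and compares against a CPU reference.
bool check_transform(int n) {
  std::vector<float> h_in(n), h_out(n);
  for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i);

  float* d_in = nullptr;
  float* d_out = nullptr;
  if (cudaMalloc(&d_in, n * sizeof(float)) != cudaSuccess) return false;
  if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) {
    cudaFree(d_in);
    return false;
  }
  cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

  const int blocks = (n + tile_size - 1) / tile_size;
  transform_memcpy_async<<<blocks, tile_size>>>(d_in, d_out, n, times_two{});

  // The blocking device-to-host copy also surfaces kernel execution errors.
  bool ok = cudaMemcpy(h_out.data(), d_out, n * sizeof(float),
                       cudaMemcpyDeviceToHost) == cudaSuccess;
  cudaFree(d_in);
  cudaFree(d_out);

  for (int i = 0; ok && i < n; ++i) ok = (h_out[i] == 2.0f * h_in[i]);
  return ok;
}

int main() {
  const bool ok = check_transform(1000);
  std::printf("%s\n", ok ? "ok" : "mismatch");
  return ok ? 0 : 1;
}
```

This sketch uses a single tile per block and no pipelining, so it says nothing about the performance of the removed kernel; profiling on A100 is exactly what this issue asks for.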

@github-project-automation github-project-automation bot moved this to Todo in CCCL Sep 4, 2024
@bernhardmgruber bernhardmgruber changed the title from "Readd, optimize and profile memcpy_async kernel for A100" to "Readd, optimize and profile memcpy_async-based transform kernel for A100" Sep 4, 2024
@bernhardmgruber bernhardmgruber self-assigned this Sep 4, 2024
@bernhardmgruber bernhardmgruber added the cub For all items related to CUB label Sep 4, 2024
bernhardmgruber added a commit to bernhardmgruber/cccl that referenced this issue Sep 9, 2024
@bernhardmgruber bernhardmgruber linked a pull request Sep 9, 2024 that will close this issue
@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Progress in CCCL Sep 9, 2024
bernhardmgruber added a commit to bernhardmgruber/cccl that referenced this issue Sep 9, 2024
bernhardmgruber added a commit to bernhardmgruber/cccl that referenced this issue Sep 9, 2024
bernhardmgruber added a commit to bernhardmgruber/cccl that referenced this issue Nov 4, 2024
bernhardmgruber added a commit to bernhardmgruber/cccl that referenced this issue Nov 4, 2024
Projects
Status: In Progress
1 participant