[QST] How to use splitk in cutlass python? #1295
Comments
Good catch. This is currently missing from the Python interface.
If I want to use split-K in Python, is it feasible to write the C++ code first, compile it into a dynamic library, and then load that library with Python's ctypes? Or do you have a better suggestion? A rough sketch of what I have in mind is below.
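Something like the following, where the library name libsplitk_gemm.so, the exported C function run_splitk_gemm, and its argument order are all placeholders for a wrapper I would write around a CUTLASS split-K kernel (e.g. cutlass::gemm::device::GemmSplitKParallel), not anything that ships with CUTLASS today:

```python
import ctypes
import cupy as cp

# Hypothetical shared library built from a C wrapper around a CUTLASS split-K GEMM.
lib = ctypes.CDLL("./libsplitk_gemm.so")
lib.run_splitk_gemm.argtypes = [
    ctypes.c_void_p, ctypes.c_void_p, ctypes.c_void_p,  # A, B, D device pointers
    ctypes.c_int, ctypes.c_int, ctypes.c_int,           # m, n, k
    ctypes.c_int,                                       # split_k_slices
]
lib.run_splitk_gemm.restype = ctypes.c_int

m, n, k = 128, 128, 16384
A = cp.random.rand(m, k).astype(cp.float16)
B = cp.random.rand(k, n).astype(cp.float16)
D = cp.zeros((m, n), dtype=cp.float16)

# Pass raw CuPy device pointers to the wrapper; split_k_slices=8 is only an example value.
status = lib.run_splitk_gemm(A.data.ptr, B.data.ptr, D.data.ptr, m, n, k, 8)
assert status == 0
```

The downside is that all of the build and pointer-marshalling details are on me, which is why I'm asking whether there is a better route.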
My suggestion would be to use a tool like pybind11 to create Python-C++ bindings for calling into a CUTLASS C++ kernel from Python. For example, you can take a look at how PyTorch CUDA extensions can be created for a CUTLASS kernel by following this unit test. You could follow a similar pattern by pasting your desired CUTLASS C++ kernel into an extension of your own.
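A minimal sketch of that pattern using PyTorch's JIT extension loader; the source file splitk_gemm_ext.cu, the bound function name splitk_gemm, and the include path are assumptions about a wrapper you would write yourself around the CUTLASS kernel:

```python
import torch
from torch.utils.cpp_extension import load

# JIT-compile a C++/CUDA wrapper that launches the CUTLASS split-K kernel and
# exposes it to Python via PYBIND11_MODULE (the sources and paths are placeholders).
splitk_ext = load(
    name="splitk_gemm_ext",
    sources=["splitk_gemm_ext.cu"],
    extra_include_paths=["/path/to/cutlass/include"],
    extra_cuda_cflags=["-std=c++17"],
    verbose=True,
)

m, n, k = 128, 128, 16384
A = torch.randn(m, k, device="cuda", dtype=torch.float16)
B = torch.randn(k, n, device="cuda", dtype=torch.float16)

# Hypothetical binding: how split_k_slices is passed (or hard-coded) depends on
# how you write the wrapper.
D = splitk_ext.splitk_gemm(A, B, 8)
```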
This issue has been labeled.
What is your question?
Background: A100, nvidia-cutlass 3.3.0.0
When k is much larger than m and n, I want to find a CUTLASS kernel that runs faster than cupy.dot, because cupy.dot does not reach the peak performance of the A100. My problem size is m=n=128, k=16384. I used the CUTLASS profiler to find a potentially faster kernel, but I found that there is no way to specify split_k_mode and split_k_slices in cutlass.op.Gemm.
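For context, this is roughly what I can express through the Python interface today (a sketch based on the public CUTLASS Python GEMM examples; exact keyword names may differ between versions). Nothing here corresponds to split_k_mode or split_k_slices:

```python
import numpy as np
import cutlass

m, n, k = 128, 128, 16384
A = np.random.rand(m, k).astype(np.float16)
B = np.random.rand(k, n).astype(np.float16)
C = np.zeros((m, n), dtype=np.float16)
D = np.zeros((m, n), dtype=np.float16)

# Plan and run a GEMM through the Python interface. There is currently no
# keyword for choosing a split-K mode or the number of split-K slices.
plan = cutlass.op.Gemm(
    element=np.float16,
    layout=cutlass.LayoutType.RowMajor,
    element_accumulator=np.float32,
)
plan.run(A, B, C, D)
```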