Feature Request: multiple simultaneous CLANG invocation. #130

emerth · 2020-01-29T17:52:59Z

MIOpen runs clang to build the object files that implement the shaders for layers or operations in ML/AI models, which is fine. However, it appears to run clang serially: once, then once more, then once more, and so on.

Logically, the model graph should be available before clang is forked, and the shaders are object files that are independent of each other, so it should be possible to count the cores in the host PC and then fork off that many clang instances.

Which leads to my feature request: parallelize the clang shader compilation code so that it invokes multiple clang instances simultaneously.
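To make the request concrete, here is a minimal sketch of the idea in Python. It is not MIOpen's code. The helper names (`build_command`, `compile_one`, `compile_all`), the `.cl` file names, and the plain `clang -c ... -o ...` command line are all assumptions for illustration; the flags MIOpen really passes to clang are certainly different. The sketch only shows the shape: count the host cores, then keep that many independent compiler processes running at once.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(source, compiler="clang"):
    """Return the argv list for compiling one source file to an object file.

    Hypothetical command line; the real MIOpen invocation uses other flags.
    """
    obj = os.path.splitext(source)[0] + ".o"
    return [compiler, "-c", source, "-o", obj]


def compile_one(source, compiler="clang", run=subprocess.run):
    """Compile a single file and return (source, exit_code)."""
    result = run(build_command(source, compiler))
    return source, result.returncode


def compile_all(sources, compiler="clang", max_workers=None, run=subprocess.run):
    """Compile independent sources concurrently, one clang process per worker.

    Threads are enough here: each worker just blocks on a child process,
    so the real work happens in the clang processes, not in Python.
    """
    workers = max_workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as `sources`.
        return list(pool.map(lambda s: compile_one(s, compiler, run), sources))
```

Calling `compile_all(["conv_fwd.cl", "conv_bwd.cl"])` would then start up to `os.cpu_count()` clang processes at once and return one `(source, exit_code)` pair per file. The `run` parameter exists only so the process launcher can be swapped out, for example in a test.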

daniellowell · 2020-04-11T02:43:28Z

The Find calls for convolution are now parallelized per algorithm. That is, each algorithm compiles the solvers' kernels in parallel. Expect future improvements in this area.

daniellowell assigned pfultz2 Jan 31, 2020

huanzhang12 mentioned this issue Mar 25, 2020

Performance comparsion: AMD with ROCm vs NVIDIA with cuDNN? ROCm/tensorflow-upstream#173

Open

daniellowell added the feature-request label Apr 11, 2020

daniellowell closed this as completed Apr 11, 2020

xinlipn mentioned this issue Feb 2, 2023

[tests] Fix bug in weights tensor layout in solver test #1950

Closed
