MIOpen runs clang to build the object files that implement the shaders for layers or operations in ML/AI models, which is fine. However, it appears to run clang serially: once, then again, then again, and so on.
Logically, the model graph should be available before clang is forked, and the shaders are object files that are independent of each other, so it should be possible to count the cores on the host PC and fork that many clang instances.
Which leads to my feature request: parallelize the clang shader compilation so that multiple clang instances are invoked simultaneously.
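To make the request concrete, here is a rough sketch of the idea in Python rather than MIOpen's own code: run one compiler process per shader source, with the number of simultaneous processes capped at the host core count. The names `compile_all` and `clang_cmd` are hypothetical, and the clang command line is only an assumption, since the flags MIOpen actually passes are not shown here.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def compile_all(sources, build_cmd, max_workers=None):
    """Run one compiler process per source, at most max_workers at a time.

    build_cmd maps a source path to an argv list. Returns a list of
    (source, returncode, stderr) tuples in the same order as sources.
    """
    # Default to the host core count, falling back to 1 if it is unknown.
    workers = max_workers or os.cpu_count() or 1

    def compile_one(src):
        # Each thread only blocks on its child process, so threads suffice.
        proc = subprocess.run(build_cmd(src), capture_output=True, text=True)
        return src, proc.returncode, proc.stderr

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compile_one, sources))


def clang_cmd(src):
    # Hypothetical command line; MIOpen's real invocation is an assumption here.
    return ["clang", "-c", src, "-o", os.path.splitext(src)[0] + ".o"]
```

Something like `compile_all(list_of_shader_sources, clang_cmd)` would then keep every core busy, provided the sources really are independent of each other as assumed above.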
The Find calls for convolution are now parallelized per algorithm. That is, each algorithm compiles the solvers' kernels in parallel. Expect future improvements in this area.