Low level optimization for GPU? #2395
Labels: advanced optimization, discussion, feature request
Concisely describe the proposed feature
Low-level optimization that makes use of native hardware instructions such as mad, rcp, log2, and exp2 (a mad typically issues in a single cycle), and that uses scalar instead of vector operations where possible.
For example, writing
(x + 3.0f) * 1.5f
gives two instructions (add and mul), but
x * 1.5f + 4.5f
gives one instruction (mad).
Although it depends heavily on the hardware, it is usually beneficial to use mad-style instructions on GPUs.
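According to report [1], doing so could yield a ~7% performance improvement on shaders.
Traditionally, however, graphics programmers have done this job manually, because, as report [1] claims, "Compiler can't change operation semantics so it cant optimize for you". The two forms above are algebraically equal but can round differently in floating point, so a compiler that must preserve exact results cannot swap one for the other. If the compiler could do this kind of optimization, it would also save programmers' time.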
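To illustrate why a compiler may refuse to do this rewrite on its own, here is a minimal sketch that compares the two forms numerically. It uses Python doubles on the CPU rather than 32-bit floats on a GPU, and the function names are made up for this example, so it only shows the rounding issue, not real shader behaviour.

```python
import random


def add_then_mul(x: float) -> float:
    """Original form: one add followed by one mul."""
    return (x + 3.0) * 1.5


def mul_then_add(x: float) -> float:
    """Rewritten form: the shape a mad instruction computes."""
    return x * 1.5 + 4.5


def compare_forms(samples: int = 10_000, seed: int = 0) -> tuple[int, float]:
    """Return (number of inputs where the forms differ, largest absolute gap)."""
    rng = random.Random(seed)
    mismatches = 0
    worst = 0.0
    for _ in range(samples):
        # Positive inputs only, to keep cancellation near x = -3 out of the picture.
        x = rng.uniform(0.0, 100.0)
        a, b = add_then_mul(x), mul_then_add(x)
        if a != b:
            mismatches += 1
        worst = max(worst, abs(a - b))
    return mismatches, worst


if __name__ == "__main__":
    mismatches, worst = compare_forms()
    print(f"{mismatches} of 10000 samples differ; largest gap {worst:.3e}")
```

Any mismatch, however small, means the rewrite is not bit-exact, which is why such transformations usually sit behind an opt-in such as a fast-math mode.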
Describe the solution you'd like (if any)
I see two ways:
a. Add those operators to the IR, and use some algorithm to optimize the semantics.
b. Generate optimized code in the backend codegen.
For a, I am not at all familiar with compilers, so I have no suggestions to give. For b, I doubt its viability, since the code is generated from the IR.
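As a rough illustration of option a, the following sketch rewrites the pattern (x + c1) * c2 into a single mad node on a toy expression tree. The node classes (Var, Const, Add, Mul, Mad) and the function names are hypothetical and invented for this example; they are not this project's actual IR, and a real pass would also need a way to opt in to the changed rounding.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Add:
    lhs: "Expr"
    rhs: "Expr"


@dataclass(frozen=True)
class Mul:
    lhs: "Expr"
    rhs: "Expr"


@dataclass(frozen=True)
class Mad:
    """Multiply-add node: a * b + c."""
    a: "Expr"
    b: "Expr"
    c: "Expr"


Expr = Union[Var, Const, Add, Mul, Mad]


def rewrite_mad(e: Expr) -> Expr:
    """Bottom-up pass: (x + c1) * c2  ->  mad(x, c2, c1 * c2)."""
    if isinstance(e, Add):
        return Add(rewrite_mad(e.lhs), rewrite_mad(e.rhs))
    if isinstance(e, Mad):
        return Mad(rewrite_mad(e.a), rewrite_mad(e.b), rewrite_mad(e.c))
    if isinstance(e, Mul):
        lhs, rhs = rewrite_mad(e.lhs), rewrite_mad(e.rhs)
        if (isinstance(lhs, Add) and isinstance(lhs.rhs, Const)
                and isinstance(rhs, Const)):
            # Fold the two constants at compile time.
            return Mad(lhs.lhs, rhs, Const(lhs.rhs.value * rhs.value))
        return Mul(lhs, rhs)
    return e  # Var and Const are leaves


def evaluate(e: Expr, env: Dict[str, float]) -> float:
    """Reference interpreter used to check that a rewrite keeps the value."""
    if isinstance(e, Var):
        return env[e.name]
    if isinstance(e, Const):
        return e.value
    if isinstance(e, Add):
        return evaluate(e.lhs, env) + evaluate(e.rhs, env)
    if isinstance(e, Mul):
        return evaluate(e.lhs, env) * evaluate(e.rhs, env)
    return evaluate(e.a, env) * evaluate(e.b, env) + evaluate(e.c, env)


if __name__ == "__main__":
    expr = Mul(Add(Var("x"), Const(3.0)), Const(1.5))
    print(rewrite_mad(expr))
```

The pass only matches a constant on the right-hand side of each operator; a fuller version would first canonicalize operand order and would also fuse an existing mul followed by an add.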
Additional comments
References:
[1] Low-Level Shader Optimization for Next-Gen and DX11, GDC2014, slides
[2] Low-Level GLSL Optimisation for PowerVR