PR/BatchedAnalysis #1322

AlexMWells · 2021-01-06T20:49:24Z

Introduce BatchedAnalysis which processes operations similar to code gen in order to build up relationships between operations and symbols controlling conditional control flow. Using these relationships and information on which shader globals are varing and which symbols renderer outputs, varying can be pushed down to just the symbols that will require it. Also analysis to determine which symbols can be forced in llvm to be boolean vs integers.

The idea is simple, but understanding & managing the relationships is complex, however it is entirely encapsulated in BatchedAnalysis.cpp. The main goal is to build up a multimap "m_symbols_dependent_upon" which maps a key symbol to all other symbols that due to operations, conditions, loops, function calls, early out control flow (break, continue, exit) would be forced to be varying if that key symbol where varying. We can use this map to recursively push varying of symbols down from shader globals, renderer outputs, and operations whose results are implicitly varying.
Several custom data/acceleration structures have been created to efficiently model these relations ships with their key feature being that their iterators are never invalidated as additional items are pushed and popped such that one can always iterate from a historical position back to the root:
DependencyTreeTracker, FunctionTreeTracker. Additonally LoopStack tracks loop nesting and aids in detecting cyclical reads.

Analysis is also performed to identify which symbols are read at a different conditional scope than they were written to and might require masking. Symbols written to and read from within the same (or deeper nesting) conditional scope do not require masking. A conditional scope is defined as the stack of of conditions evaluated to reach an operation and is managed by the DependencyTreeTracker.

When adding new ops, or when changing implementation or behavior of
existing ops, the following functions should be evaluated to see if
they need to be updated to handle to handle the new op or change:
bool are_op_results_always_implicitly_varying(ustring opname);
bool are_op_results_always_implicitly_uniform(ustring opname);
bool does_op_implementation_require_masking(ustring opname);
SymbolPtrVec * protected_shader_globals_op_implicitly_depends_on(ShaderInstance &inst, Opcode & opcode);

If there is a new or change in control flow op's are made,
a more detailed evaluation will be needed as their behavior
is hard coded vs. lookup based implementation of the
functions above.

RuntimeOptimizer uses BatchedAnalysis before it coalesces temporaries so that varying and uniform symbols do not get merged.

Description

Tests

Checklist:

I have read the contribution guidelines.
I have previously submitted a Contributor License Agreement.
I have updated the documentation, if applicable.
I have ensured that the change is tested somewhere in the testsuite (adding new test cases if necessary).
My code follows the prevailing code style of this project.

…gen in order to build up relationships between operations and symbols controlling conditional control flow. Using these relationships and information on which shader globals are varing and which symbols renderer outputs, varying can be pushed down to just the symbols that will require it. Also analysis to determine which symbols can be forced in llvm to be boolean vs integers.

…:container::flat_set Replaced on std::unordered_set with a series of ustring comparisons (deemed faster to do 4 comparisions vs. binary search or hash lookup) Removed some extraneous using statements

lgritz

It's ok as it is, though I think it would be better to use the flat maps/sets as I point out.

Now I will ramble.

I understand what you're doing here, but boy is this a dense section of code, with several classes and algorithms used only for this analysis.

The existing RuntimeOptimizer::track_variable_dependencies() is conceptually similar in that it does the basic "for a symbol, what are all the other symbols it depends on", and then uses the discovered dependencies to propagate information about which symbols need to carry around derivatives. I know that this here is a different and more complex task, because of the way that conditionals and loops can make assignments inside them varying even though they might be appearing to assign uniform variables.

But I look at t_v_d and how straightforward it is, how somebody not familiar with it can very quickly understand it fully, convince themselves that it's likely correct (though admittedly conservative, rather than strictly optimal), and very easily extend it if there were new language constructs. (Aside: I really thought that somewhere in the history of this project, we HAD a full analysis of what needs to be varying, but I can't seem to find it now. Maybe we didn't.)

I can't help but worry that we've made this code so complex that if a couple years from now it needs to be debugged or modified by somebody other than Alex, they might be tempted to think it's less work to totally rewrite it than to figure it out.

This is done and working, let's accept it. But at some point in the future, I'm tempted to invest a day in making a big #if that lets us switch between this and a drastically simpler conservative varying analysis (of similar straightforwardness to t_v_d) and see if there is enough of a perf difference to justify keeping the complex code.

src/liboslexec/batched_analysis.h

src/liboslexec/batched_analysis.cpp

AlexMWells · 2021-01-13T01:20:35Z

Updates made based on review feedback.

Tested with Marble.osl and dumped the results:

Emit Symbol uniformity
--->0x39fffe0 Cin is UNIFORM
--->0x3a00048 freq is UNIFORM
--->0x3a000b0 Cout is VARYING
--->0x3a00118 P is VARYING
--->0x3a00180 sum is VARYING
--->0x3a001e8 freqVal is UNIFORM
--->0x3a00250 Pshad is VARYING
--->0x3a002b8 ___330_i is UNIFORM
--->0x3a00320 $const1 is UNIFORM
--->0x3a00388 $const2 is UNIFORM
--->0x3a003f0 $const3 is UNIFORM
--->0x3a00458 $const4 is UNIFORM
--->0x3a004c0 $const5 is UNIFORM
--->0x3a00528 $tmp1 is UNIFORM
--->0x3a00590 $tmp2 is UNIFORM
--->0x3a005f8 $const6 is UNIFORM
--->0x3a00660 $tmp3 is UNIFORM
--->0x3a006c8 $const7 is UNIFORM
--->0x3a00730 $tmp4 is VARYING
--->0x3a00798 $const8 is UNIFORM
--->0x3a00800 $tmp5 is VARYING
--->0x3a00868 $tmp6 is UNIFORM
--->0x3a008d0 $const10 is UNIFORM
--->0x3a00938 $tmp7 is VARYING
--->0x3a009a0 $tmp8 is VARYING
--->0x3a00a08 $tmp9 is VARYING
--->0x3a00a70 $const12 is UNIFORM
--->0x3a00ad8 $tmp10 is UNIFORM
--->0x3a00b40 $newconst0 is UNIFORM
--->0x3a00ba8 $newconst1 is UNIFORM
--->0x3a00c10 $newconst2 is UNIFORM
done with Symbol uniformity
Emit masking requirements
---> inst#8 op=neq requires MASKING
---> inst#16 op=add requires MASKING
---> inst#19 op=add requires MASKING
---> inst#20 op=mul requires MASKING
done with masking requirements
Emit analysis_flag for getattribute
done with analysis_flag for getattribute
Emit analysis_flag for loops with continue
done with analysis_flag for loops with continue
Emit Symbols forced to llvm bool
--->0x3a00528 $tmp1 is forced_llvm_bool
--->0x3a00590 $tmp2 is forced_llvm_bool
done with Symbols forced to llvm bool

8 of 31 symbols are VARYING
4 of 21 ops require masking
2 of 31 symbols can be forced to llvm bool
No getattribute calls were uniform
No loops with continue

lgritz

With the revisions you added, LGTM

AlexMWells added 2 commits February 18, 2016 14:53

Improve BatchedAnalysis by changing some std::unordered_set to boost:…

09e6ab0

…:container::flat_set Replaced on std::unordered_set with a series of ustring comparisons (deemed faster to do 4 comparisions vs. binary search or hash lookup) Removed some extraneous using statements

lgritz approved these changes Jan 11, 2021

View reviewed changes

src/liboslexec/batched_analysis.h Outdated Show resolved Hide resolved

src/liboslexec/batched_analysis.cpp Outdated Show resolved Hide resolved

src/liboslexec/batched_analysis.cpp Outdated Show resolved Hide resolved

src/liboslexec/batched_analysis.cpp Outdated Show resolved Hide resolved

lgritz approved these changes Jan 19, 2021

View reviewed changes

lgritz merged commit 79c1a20 into AcademySoftwareFoundation:master Jan 19, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PR/BatchedAnalysis #1322

PR/BatchedAnalysis #1322

AlexMWells commented Jan 6, 2021 •

edited

Loading

lgritz left a comment

AlexMWells commented Jan 13, 2021

lgritz left a comment

PR/BatchedAnalysis #1322

PR/BatchedAnalysis #1322

Conversation

AlexMWells commented Jan 6, 2021 • edited Loading

Description

Tests

Checklist:

lgritz left a comment

Choose a reason for hiding this comment

AlexMWells commented Jan 13, 2021

lgritz left a comment

Choose a reason for hiding this comment

AlexMWells commented Jan 6, 2021 •

edited

Loading