-
Notifications
You must be signed in to change notification settings - Fork 371
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PR/BatchedAnalysis #1322
PR/BatchedAnalysis #1322
Conversation
…gen in order to build up relationships between operations and symbols controlling conditional control flow. Using these relationships and information on which shader globals are varing and which symbols renderer outputs, varying can be pushed down to just the symbols that will require it. Also analysis to determine which symbols can be forced in llvm to be boolean vs integers.
…:container::flat_set Replaced on std::unordered_set with a series of ustring comparisons (deemed faster to do 4 comparisions vs. binary search or hash lookup) Removed some extraneous using statements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's ok as it is, though I think it would be better to use the flat maps/sets as I point out.
Now I will ramble.
I understand what you're doing here, but boy is this a dense section of code, with several classes and algorithms used only for this analysis.
The existing RuntimeOptimizer::track_variable_dependencies() is conceptually similar in that it does the basic "for a symbol, what are all the other symbols it depends on", and then uses the discovered dependencies to propagate information about which symbols need to carry around derivatives. I know that this here is a different and more complex task, because of the way that conditionals and loops can make assignments inside them varying even though they might be appearing to assign uniform variables.
But I look at t_v_d and how straightforward it is, how somebody not familiar with it can very quickly understand it fully, convince themselves that it's likely correct (though admittedly conservative, rather than strictly optimal), and very easily extend it if there were new language constructs. (Aside: I really thought that somewhere in the history of this project, we HAD a full analysis of what needs to be varying, but I can't seem to find it now. Maybe we didn't.)
I can't help but worry that we've made this code so complex that if a couple years from now it needs to be debugged or modified by somebody other than Alex, they might be tempted to think it's less work to totally rewrite it than to figure it out.
This is done and working, let's accept it. But at some point in the future, I'm tempted to invest a day in making a big #if
that lets us switch between this and a drastically simpler conservative varying analysis (of similar straightforwardness to t_v_d) and see if there is enough of a perf difference to justify keeping the complex code.
Updates made based on review feedback. Tested with Marble.osl and dumped the results:
8 of 31 symbols are VARYING |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the revisions you added, LGTM
Introduce BatchedAnalysis which processes operations similar to code gen in order to build up relationships between operations and symbols controlling conditional control flow. Using these relationships and information on which shader globals are varing and which symbols renderer outputs, varying can be pushed down to just the symbols that will require it. Also analysis to determine which symbols can be forced in llvm to be boolean vs integers.
The idea is simple, but understanding & managing the relationships is complex, however it is entirely encapsulated in BatchedAnalysis.cpp. The main goal is to build up a multimap "m_symbols_dependent_upon" which maps a key symbol to all other symbols that due to operations, conditions, loops, function calls, early out control flow (break, continue, exit) would be forced to be varying if that key symbol where varying. We can use this map to recursively push varying of symbols down from shader globals, renderer outputs, and operations whose results are implicitly varying.
Several custom data/acceleration structures have been created to efficiently model these relations ships with their key feature being that their iterators are never invalidated as additional items are pushed and popped such that one can always iterate from a historical position back to the root:
DependencyTreeTracker, FunctionTreeTracker. Additonally LoopStack tracks loop nesting and aids in detecting cyclical reads.
Analysis is also performed to identify which symbols are read at a different conditional scope than they were written to and might require masking. Symbols written to and read from within the same (or deeper nesting) conditional scope do not require masking. A conditional scope is defined as the stack of of conditions evaluated to reach an operation and is managed by the DependencyTreeTracker.
When adding new ops, or when changing implementation or behavior of
existing ops, the following functions should be evaluated to see if
they need to be updated to handle to handle the new op or change:
bool are_op_results_always_implicitly_varying(ustring opname);
bool are_op_results_always_implicitly_uniform(ustring opname);
bool does_op_implementation_require_masking(ustring opname);
SymbolPtrVec * protected_shader_globals_op_implicitly_depends_on(ShaderInstance &inst, Opcode & opcode);
If there is a new or change in control flow op's are made,
a more detailed evaluation will be needed as their behavior
is hard coded vs. lookup based implementation of the
functions above.
RuntimeOptimizer uses BatchedAnalysis before it coalesces temporaries so that varying and uniform symbols do not get merged.
Description
Tests
Checklist: