perf: optimise solver graph evaluation loop #6728
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
What this PR does / why we need it:
This fix makes the
loop
method in the solver much more efficient for large graphs (lots of actions and dependencies).While graph evaluation in the solver isn't usually the performance bottleneck, the performance impact can be noticeable in larger projects.
Here, we use a generator to get leaf nodes ready for processing (whereas previously, we'd traverse the whole pending graph on each loop iteration and load it node-by-node into a new dependency graph data structure, which was costly when iterating over a large graph in a tight loop).
We also used a simple boolean flag (dirty) in the solver to track whether any tasks had been requested or completed (successfully or with an error) since the last loop. This was used to avoid evaluating the graph when we know it's not going to result in any changes to the in-progress or pending graphs, or in any new nodes being scheduled for processing.
There are further optimizations we can make to the solver (see the comments added in this commit), but they would require careful implementation and testing to avoid possible regressions, so we're leaving them be for now.
In a test project that was specifically designed to get the solver to struggle (lots of actions with lots of dependencies), fully resolving the graph before this change took 3 minutes on my laptop, whereas it's down to 51 seconds after these changes. I suspect it could be brought down to 20-30 seconds with the further optimizations outlined in the added code comments, but this might be enough for the time being.
Special notes for your reviewer:
Note to @stefreak and @eysi09: If you try these changes on the slow, generated project that Eythor set up, you should be able to repro the performance improvement I'm describing above.