perf: optimise solver graph evaluation loop #6728

thsig · 2024-12-17T19:51:11Z

What this PR does / why we need it:

This fix makes the loop method in the solver much more efficient for large graphs (lots of actions and dependencies).

While graph evaluation in the solver isn't usually the performance bottleneck, the performance impact can be noticeable in larger projects.

Here, we use a generator to get leaf nodes ready for processing (whereas previously, we'd traverse the whole pending graph on each loop iteration and load it node-by-node into a new dependency graph data structure, which was costly when iterating over a large graph in a tight loop).

We also used a simple boolean flag (dirty) in the solver to track whether any tasks had been requested or completed (successfully or with an error) since the last loop. This was used to avoid evaluating the graph when we know it's not going to result in any changes to the in-progress or pending graphs, or in any new nodes being scheduled for processing.

There are further optimizations we can make to the solver (see the comments added in this commit), but they would require careful implementation and testing to avoid possible regressions, so we're leaving them be for now.

In a test project that was specifically designed to get the solver to struggle (lots of actions with lots of dependencies), fully resolving the graph before this change took 3 minutes on my laptop, whereas it's down to 51 seconds after these changes. I suspect it could be brought down to 20-30 seconds with the further optimizations outlined in the added code comments, but this might be enough for the time being.

Special notes for your reviewer:

Note to @stefreak and @eysi09: If you try these changes on the slow, generated project that Eythor set up, you should be able to repro the performance improvement I'm describing above.

This fix makes the `loop` method in the solver much more efficient for large graphs (lots of actions and dependencies). While graph evaluation in the solver isn't usually the performance bottleneck, the performance impact can be noticeable in larger projects. Here, we use a generator to get leaf nodes ready for processing (whereas previously, we'd traverse the whole pending graph on each loop iteration and load it node-by-node into a new dependency graph data structure, which was costly when iterating over a large graph in a tight loop). We also used a simple boolean flag (dirty) in the solver to track whether any tasks had been requested or completed (successfully or with an error) since the last loop. This was used to avoid evaluating the graph when we know it's not going to result in any changes to the in-progress or pending graphs, or in any new nodes being scheduled for processing. There are further optimizations we can make to the solver (see the comments added in this commit), but they would require careful implementation and testing to avoid possible regressions, so we're leaving them be for now. In a test project that was specifically designed to get the solver to struggle (lots of actions with lots of dependencies), fully resolving the graph before this change took 3 minutes on my laptop, whereas it's down to 55 seconds after these changes. I suspect it could be brought down to 20-30 seconds with the further optimizations outlined in the added code comments, but this might be enough for the time being.

stefreak

70% faster is a big improvement! Great work 👏

thsig requested review from stefreak and eysi09 December 17, 2024 19:51

thsig force-pushed the optimize-get-pending-graph branch from 0035643 to 03a946e Compare December 17, 2024 20:02

Merge branch 'main' into optimize-get-pending-graph

7c537ca

stefreak approved these changes Dec 18, 2024

View reviewed changes

stefreak added this pull request to the merge queue Dec 18, 2024

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 18, 2024

vvagaytsev added this pull request to the merge queue Dec 18, 2024

Merged via the queue into main with commit bd6c6ba Dec 18, 2024
40 checks passed

vvagaytsev deleted the optimize-get-pending-graph branch December 18, 2024 18:16

stefreak mentioned this pull request Jan 9, 2025

chore(release): bump version to 0.13.48 #6752

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: optimise solver graph evaluation loop #6728

perf: optimise solver graph evaluation loop #6728

thsig commented Dec 17, 2024 •

edited

Loading

stefreak left a comment

perf: optimise solver graph evaluation loop #6728

perf: optimise solver graph evaluation loop #6728

Conversation

thsig commented Dec 17, 2024 • edited Loading

stefreak left a comment

Choose a reason for hiding this comment

thsig commented Dec 17, 2024 •

edited

Loading