Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: optimise solver graph evaluation loop #6728

Merged
merged 2 commits into from
Dec 18, 2024

Conversation

thsig
Copy link
Collaborator

@thsig thsig commented Dec 17, 2024

What this PR does / why we need it:

This fix makes the loop  method in the solver much more efficient for large graphs (lots of actions and dependencies).

While graph evaluation in the solver isn't usually the performance bottleneck, the performance impact can be noticeable in larger projects.

Here, we use a generator to get leaf nodes ready for processing (whereas previously, we'd traverse the whole pending graph on each loop iteration and load it node-by-node into a new dependency graph data structure, which was costly when iterating over a large graph in a tight loop).

We also used a simple boolean flag (dirty) in the solver to track whether any tasks had been requested or completed (successfully or with an error) since the last loop. This was used to avoid evaluating the graph when we know it's not going to result in any changes to the in-progress or pending graphs, or in any new nodes being scheduled for processing.

There are further optimizations we can make to the solver (see the comments added in this commit), but they would require careful implementation and testing to avoid possible regressions, so we're leaving them be for now.

In a test project that was specifically designed to get the solver to struggle (lots of actions with lots of dependencies), fully resolving the graph before this change took 3 minutes on my laptop, whereas it's down to 51 seconds after these changes. I suspect it could be brought down to 20-30 seconds with the further optimizations outlined in the added code comments, but this might be enough for the time being.

Special notes for your reviewer:

Note to @stefreak and @eysi09: If you try these changes on the slow, generated project that Eythor set up, you should be able to repro the performance improvement I'm describing above.

@thsig thsig requested review from stefreak and eysi09 December 17, 2024 19:51
This fix makes the `loop`  method in the solver much more efficient for
large graphs (lots of actions and dependencies).

While graph evaluation in the solver isn't usually the performance
bottleneck, the performance impact can be noticeable in larger projects.

Here, we use a generator to get leaf nodes ready for processing (whereas
previously, we'd traverse the whole pending graph on each loop iteration
and load it node-by-node into a new dependency graph data structure,
which was costly when iterating over a large graph in a tight loop).

We also used a simple boolean flag (dirty) in the solver to track
whether any tasks had been requested or completed (successfully or with
an error) since the last loop. This was used to avoid evaluating the
graph when we know it's not going to result in any changes to the
in-progress or pending graphs, or in any new nodes being scheduled for
processing.

There are further optimizations we can make to the solver (see the
comments added in this commit), but they would require careful
implementation and testing to avoid possible regressions, so we're
leaving them be for now.

In a test project that was specifically designed to get the solver to
struggle (lots of actions with lots of dependencies), fully resolving
the graph before this change took 3 minutes on my laptop, whereas it's
down to 55 seconds after these changes. I suspect it could be brought
down to 20-30 seconds with the further optimizations outlined in
the added code comments, but this might be enough for the time being.
@thsig thsig force-pushed the optimize-get-pending-graph branch from 0035643 to 03a946e Compare December 17, 2024 20:02
Copy link
Member

@stefreak stefreak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

70% faster is a big improvement! Great work 👏

@stefreak stefreak added this pull request to the merge queue Dec 18, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 18, 2024
@vvagaytsev vvagaytsev added this pull request to the merge queue Dec 18, 2024
Merged via the queue into main with commit bd6c6ba Dec 18, 2024
40 checks passed
@vvagaytsev vvagaytsev deleted the optimize-get-pending-graph branch December 18, 2024 18:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants