Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#4012] Performance: Use child_map to find tests for nodes in resolve_graph #4022

Merged
merged 1 commit into from
Oct 13, 2021

Conversation

gshank
Copy link
Contributor

@gshank gshank commented Oct 7, 2021

resolves #4012

Description

The resolve_graph method that was implemented to support the build command caused a performance regression. This pull request switches to using the child_map to find tests for nodes.

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

@gshank gshank requested a review from kwigley October 7, 2021 17:17
@cla-bot cla-bot bot added the cla:yes label Oct 7, 2021
@gshank gshank requested review from jtcohen6 and leahwicz October 7, 2021 17:17
@gshank
Copy link
Contributor Author

gshank commented Oct 7, 2021

The parent_map and child_map are not full-fledged members of the manifest class. Originally they were only created for writing out to the manifest.json. I needed some of the information in partial parsing so I built them when partially parsing. I've added another 'build_parent_and_child_maps' all to compilation.py. We ought to do some cleanup here and make them full-fledged attributes, but we'd also need to decide what to do about serializing and where to build them, etc, so I'm putting this pull request out for review without that extra work.

@jtcohen6
Copy link
Contributor

jtcohen6 commented Oct 7, 2021

I've tried this a bunch locally, and I haven't yet been able to break it. It's much, much faster.

I would like to see us add some more complex test cases (namely, tests with multiple parents) to the 069_build_test integration suite.

@gshank gshank force-pushed the resolve_graph_perf branch from bb89fc9 to 2ca1f35 Compare October 8, 2021 16:48
@gshank gshank force-pushed the resolve_graph_perf branch from 2ca1f35 to 5e332f7 Compare October 13, 2021 20:42
@gshank gshank merged commit fd7c95d into main Oct 13, 2021
@gshank gshank deleted the resolve_graph_perf branch October 13, 2021 21:03
@isaacsantelli
Copy link

isaacsantelli commented Oct 13, 2021

@gshank I submitted the original issue, this is the first time i've submitted a dbt issue, should I expect that the change will be in the dbt 0.22.0 or in an earlier build? Also thank you so much for the quick resolution either way :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Large projects see significant slowdown in resolve_graph using v0.21.0
4 participants