Reduce memory consumption in docs generation #2009

drewbanin · 2019-12-16T21:35:46Z

Describe the feature

The dbt docs generate command fetches every column in every source and target schema touched by dbt. As a result, even moderately sized dbt projects can have catalogs containing hundreds of thousands of columns. In practice, we see that the dbt docs generate command can consume hundreds of megabytes of memory, which causes problems for tools that orchestrate dbt.

Let's profile the dbt docs generate command to gauge its memory usage patterns and inspect the parts of the codebase that might be responsible for ballooning memory usage. If there are any straightforward changes we can make to the dbt codebase to reduce the memory footprint of this command, we should try to implement them.
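As a starting point for that profiling, here is a minimal sketch using Python's standard-library tracemalloc module. It does not call into dbt at all: build_fake_catalog and measure_peak are hypothetical helpers, and the row shape (one dict per column) is an assumption made only to have something to measure. The same wrapper could be pointed at whatever function actually builds the catalog.

```python
import tracemalloc


def build_fake_catalog(num_columns):
    """Build stand-in catalog rows: one dict per column (assumed shape, not dbt's)."""
    return [
        {
            "table_schema": "analytics",
            "table_name": f"table_{i // 50}",
            "column_name": f"column_{i % 50}",
            "column_index": i % 50,
            "column_type": "text",
        }
        for i in range(num_columns)
    ]


def measure_peak(func, *args):
    """Run func(*args) and return (result, peak bytes allocated by Python)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    rows, peak = measure_peak(build_fake_catalog, 140_000)
    print(f"{len(rows)} rows, peak {peak / 1e6:.1f} MB, "
          f"about {peak / len(rows):.0f} bytes per row")
```

tracemalloc only sees allocations made through Python's allocator, so memory held by C extensions or database drivers may not show up; treat the number as a lower bound on the process footprint.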

Describe alternatives you've considered

Download more RAM

Additional context

This doesn't appear to be database-specific. The absolute numbers aren't very useful here, but I have seen a report of a catalog query returning 140k records that consumed over 600 MB of memory. Some of the Agate operations that happen on the dataframe returned by the database are likely culprits here.
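For a rough sense of scale, the reported figures work out to several kilobytes per catalog record. The sketch below only restates the two numbers from the report; treating "600mb" as 600 * 10^6 bytes and as a lower bound is an assumption.

```python
# Figures from the report above; the byte conversion is an assumption.
reported_bytes = 600 * 1000 * 1000   # "over 600mb", so a lower bound
reported_records = 140_000

bytes_per_record = reported_bytes / reported_records
print(f"at least ~{bytes_per_record / 1000:.1f} KB per record")
```

That is far more than the raw text of a single column row would need, which fits the suspicion that intermediate copies or per-value object overhead account for much of the total.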

drewbanin added the enhancement and performance labels Dec 16, 2019

drewbanin added this to the Barbara Gittings milestone Dec 16, 2019

beckjake mentioned this issue Jan 9, 2020

Feature: faster snowflake catalogs (#2009) #2037

Merged

beckjake closed this as completed in #2037 Feb 6, 2020

beckjake added a commit that referenced this issue Feb 6, 2020

Merge pull request #2037 from fishtown-analytics/feature/faster-snowf…

0df49c5

…lake-catalogs Feature: faster snowflake catalogs (#2009)
