Describe the feature
The dbt docs generate command fetches all of the columns in all of the source and target schemas touched by dbt. As such, even moderately sized dbt projects can have catalogs containing hundreds of thousands of columns. In practice, we see that the dbt docs generate command can consume hundreds of megabytes of memory, which poses challenges for tools that orchestrate dbt.
Let's do some profiling here to gauge the memory usage patterns of the dbt docs generate command and inspect parts of the codebase that might be responsible for ballooning memory usage. If there are any straightforward changes we can make to the dbt codebase to reduce the memory footprint of this command, we should try to implement them.
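As a starting point, the kind of profiling described above can be sketched with the standard-library tracemalloc module. The build_catalog function below is a hypothetical stand-in for the catalog-building step of dbt docs generate (one dict materialized per column), not dbt's actual code:

```python
import tracemalloc

# Hypothetical stand-in for dbt's catalog construction: one small dict
# per column, mimicking the per-column metadata dbt serializes.
def build_catalog(num_columns: int) -> list:
    return [
        {
            "table": f"table_{i // 50}",   # ~50 columns per table
            "column": f"col_{i}",
            "dtype": "text",
            "index": i,
        }
        for i in range(num_columns)
    ]

tracemalloc.start()
catalog = build_catalog(100_000)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"current: {current / 1e6:.1f} MB, peak: {peak / 1e6:.1f} MB")
```

Running something like this against the real code paths (or wrapping the dbt entry point itself) would show whether the catalog dicts, the database response, or an intermediate copy dominates the peak.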
Describe alternatives you've considered
Download more RAM
Additional context
This doesn't appear to be database-specific. The absolute numbers aren't so useful here, but I have seen a report of a catalog query that returns 140k records consuming over 600 MB of memory. Some of the Agate operations that happen on the table returned by the database may be the culprits here.
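As a rough illustration of why per-row representations could matter at this scale (this is an assumption about where the memory goes, not a measured finding about Agate), the sketch below compares dict-shaped rows against tuple-shaped rows for a result set the size of the reported catalog query:

```python
import tracemalloc

NUM_ROWS = 140_000  # matches the reported catalog query size


def measure_peak(build) -> int:
    """Return peak traced allocation (bytes) while running build()."""
    tracemalloc.start()
    data = build()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del data
    return peak


# Rich per-row containers, loosely analogous to row objects with named access.
dict_peak = measure_peak(
    lambda: [{"index": i, "column": f"col_{i}", "dtype": "text"} for i in range(NUM_ROWS)]
)

# Bare tuples carrying the same values.
tuple_peak = measure_peak(
    lambda: [(i, f"col_{i}", "text") for i in range(NUM_ROWS)]
)

print(f"dict rows:  {dict_peak / 1e6:.1f} MB peak")
print(f"tuple rows: {tuple_peak / 1e6:.1f} MB peak")
```

If the real numbers show a similar spread, switching hot paths away from per-row dicts (or streaming rows instead of materializing the whole result) would be a straightforward place to look.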