Reword or remove "Acquiring new ... connection" logline #3137

Closed
jtcohen6 opened this issue Mar 2, 2021 · 4 comments · Fixed by #4062
Labels
enhancement (New feature or request), performance

Comments

@jtcohen6
Contributor

jtcohen6 commented Mar 2, 2021

Describe the feature

The log line emitted by this code confuses a lot of folks when they see it in logs/dbt.log:

https://github.com/fishtown-analytics/dbt/blob/ec0f3d22e792127e9c2c709f369e1faf8eb13c8d/core/dbt/adapters/base/connections.py#L138-L139

As worded, it's pretty misleading! I'm also not convinced it's accurate: it's logged by commands that don't strictly need database connections (e.g. dbt ls, dbt parse). This is relevant to our thinking around formally separating project parsing from compilation/execution, as a step that requires minimal adapter-specific logic and does not require a database (or Internet) connection.

Why fix?

This leads folks to think that:

  • dbt is opening dozens/hundreds of concurrent connections against their database
  • The slowness of project parsing is due to opening and closing these connections

This has cropped up in #2948, #3135, and several Slack threads, to name a few.
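To make the first misreading concrete, here is a minimal sketch of how an "Acquiring new connection" message can be logged without any database connection being opened. It assumes the handle is opened lazily on first use; the `LazyConnection` and `ConnectionManager` classes are hypothetical stand-ins for illustration, not dbt's actual classes.

```python
import logging

logger = logging.getLogger("sketch")


class LazyConnection:
    """Hypothetical connection whose real handle is opened only on first use."""

    def __init__(self, name):
        self.name = name
        self.opened = False

    def execute(self, sql):
        if not self.opened:
            # A real driver would connect to the database at this point.
            self.opened = True
        return f"ran {sql!r} on {self.name}"


class ConnectionManager:
    """Hypothetical manager that logs whenever a named connection is acquired."""

    def __init__(self):
        self.acquired = []

    def acquire(self, name):
        # This records bookkeeping only; nothing is opened here.
        logger.debug('Acquiring new connection "%s".', name)
        conn = LazyConnection(name)
        self.acquired.append(conn)
        return conn


manager = ConnectionManager()

# "Parsing" 100 nodes logs 100 lines but opens no connections.
conns = [manager.acquire(f"model_{i}") for i in range(100)]
opened_after_parse = sum(c.opened for c in conns)

# Only actually running a query opens a handle.
conns[0].execute("select 1")
opened_after_query = sum(c.opened for c in conns)
```

Under that assumption, the number of log lines says nothing about the number of open connections, which is why the current wording misleads.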

History

As best I can tell, the line is there because we may need an adapter connection during project parsing, though I can't think of when or why we would:

https://github.com/fishtown-analytics/dbt/blob/344a14416d22f0cfbeb56b9904092c8a4f38b1fc/core/dbt/parser/base.py#L279-L289

Looking through older commits, it seems like we envisioned a future where this wouldn't be necessary. I'm not sure why it is necessary today.

https://github.com/fishtown-analytics/dbt/blob/b8febddad5f16e1cd02be8d8a00b3e8effb0e105/core/dbt/context/providers.py#L686-L699

@github-actions
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

github-actions bot added the "stale" label (Issues that have gone stale) on Oct 14, 2021
jtcohen6 removed the "stale" label on Oct 14, 2021
@jtcohen6
Contributor Author

I'd still like to fix this one. I think it could be a very quick quality-of-life improvement.

@leahwicz
Contributor

@jtcohen6 would it make sense to do this with the structured logging work?

@jtcohen6
Contributor Author

@leahwicz I think the issue here is that we're actually calling a whole connection method we simply don't need at parse time. The logging is a symptom of that.

There are many other times later on that we call the same method, with the same logging, and it makes sense to keep those around.

I took a stab at removing that `with` call entirely in #4062, and tests seem to be passing...
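As a rough illustration of the kind of change described here (not the actual diff in #4062), the sketch below wraps parse-time rendering in a connection context manager and then drops that wrapper, while keeping it for execution. All names (`connection_named`, `render`, `parse_before`, `parse_after`, `execute`) are hypothetical, and `log_lines` stands in for logs/dbt.log.

```python
from contextlib import contextmanager

log_lines = []  # hypothetical stand-in for logs/dbt.log


@contextmanager
def connection_named(name):
    """Hypothetical connection context manager that logs on entry."""
    log_lines.append(f'Acquiring new connection "{name}".')
    yield name


def render(node):
    # Placeholder for rendering a node; it needs no database in this sketch.
    return node.upper()


def parse_before(nodes):
    # Parse-time rendering wrapped in the connection context manager.
    rendered = []
    for node in nodes:
        with connection_named(node):
            rendered.append(render(node))
    return rendered


def parse_after(nodes):
    # Same rendering with the `with` block removed.
    return [render(node) for node in nodes]


def execute(nodes):
    # Execution keeps the context manager, so the log line still appears here.
    results = []
    for node in nodes:
        with connection_named(node):
            results.append(f"executed {render(node)}")
    return results


nodes = ["model_a", "model_b"]

before = parse_before(nodes)
lines_from_old_parse = len(log_lines)

after = parse_after(nodes)
lines_from_new_parse = len(log_lines) - lines_from_old_parse

execute(nodes)
lines_from_execute = len(log_lines) - lines_from_old_parse - lines_from_new_parse
```

In this sketch the parse output is identical either way; only the log noise changes, and the message remains in place where a connection is plausibly needed.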

@jtcohen6 jtcohen6 removed their assignment Nov 14, 2022