Propagate docs to the database #1031
Comments
Since the BQ description field is just a giant text blob, I'd suggest putting the raw markdown description in there. I'd also love to see the raw (uncompiled) query appended after the description in the same field. Then, folks would have some documentation even if there were no description provided for the table. Plus, devs tend to document their SQL in the .sql file itself. If we included both, we'd be getting decent documentation in the native BQ UI, with minimal documentation effort. |
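The suggestion above (markdown description first, raw query appended) could be sketched roughly as follows in Python. `build_description` and the `Raw query:` label are hypothetical names for illustration, not part of dbt; a real implementation might also need to truncate the result if the database caps description length.

```python
def build_description(markdown_doc, raw_sql, separator="\n\n---\n\n"):
    """Combine a model's markdown docs and its raw (uncompiled) SQL.

    Hypothetical helper: either part may be empty, so a model without a
    written description still gets its query as documentation.
    """
    parts = []
    if markdown_doc and markdown_doc.strip():
        parts.append(markdown_doc.strip())
    if raw_sql and raw_sql.strip():
        parts.append("Raw query:\n" + raw_sql.strip())
    return separator.join(parts)


if __name__ == "__main__":
    print(build_description("One row per order.", "select * from {{ ref('raw_orders') }}"))
```

With no markdown description, the result is just the labelled query, which matches the "some documentation even if there were no description" goal.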
Got it, thanks for the report @jakebiesinger! The big challenge here is that we need to make this work for BigQuery, Redshift, Postgres, and Snowflake. Table and column comments are supported for all of these databases, so I don't view that as a problem, but it's a decent amount of work! When do you see these descriptions being persisted? Is it when the models are built? Or via some other process, like |
It'd be awesome if this happened as part of the normal
|
This feature would be very helpful. I'd be more than happy to contribute to this if we can decide on a solution 👍 |
Cool idea @darrenhaken. To summarize, there's a matrix of comments that dbt should support to the extent supported by the underlying databases:
Doing this correctly will require knowledge of and access to each of these databases. For the more SQL compatible ones (snowflake, redshift, postgres), we can issue an
And for BigQuery, I think we can supply a value for the object description. I'm not sure how best to surface schema-level descriptions, as these aren't currently supported in dbt's documentation. Let's leave that out for the first cut, but we should consider ways to support them in dbt in the near future. @darrenhaken you mentioned using |
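For the SQL-compatible databases named above (Postgres, Redshift, Snowflake), table and column comments can be set with `COMMENT ON` statements. Here is a minimal Python sketch of rendering them. `sql_literal` and `comment_statements` are hypothetical helpers, not dbt functions, and escaping rules differ between databases, so the quoting here is an assumption rather than something production-ready.

```python
from typing import Dict, List


def sql_literal(text: str) -> str:
    """Quote text as a SQL string literal by doubling single quotes.

    Assumption: this is the only escaping needed. Some databases also
    treat backslashes specially inside string literals.
    """
    return "'" + text.replace("'", "''") + "'"


def comment_statements(
    relation: str,
    description: str,
    column_descriptions: Dict[str, str],
    relation_type: str = "TABLE",
) -> List[str]:
    """Render COMMENT ON statements for a relation and its columns."""
    statements = [
        f"COMMENT ON {relation_type} {relation} IS {sql_literal(description)};"
    ]
    for column, text in column_descriptions.items():
        statements.append(
            f"COMMENT ON COLUMN {relation}.{column} IS {sql_literal(text)};"
        )
    return statements


if __name__ == "__main__":
    for stmt in comment_statements(
        "analytics.orders", "One row per order.", {"id": "Primary key"}
    ):
        print(stmt)
```

Passing `relation_type="VIEW"` covers view materializations; schema-level comments are left out, in line with the first-cut scope discussed above.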
@drewbanin I think dbt's tags could be used for |
Re BigQuery - I would propose focusing on object description as a first iteration of expanding out the matrix. |
cool, i agree, let's start with bigquery table/view descriptions. I think those can be embedded into the In practice, that looks something like:
The macro that generates You can see where dbt actually calls this We do something similar for views and incremental models in the corresponding materializations as well. This materialization code is kind of a mess and in need of a cleanup, but I think this is where all of the magic will likely need to happen. |
@drewbanin the links you shared lead to 404 pages. Is that because you shared your branch and not master? I'd be keen to start development on this if we agree on an approach. Thoughts? |
Thanks @darrenhaken - just updated my comment to use permalinks. Database-specific implementation questions aside, I think you're right that we just need to agree upon a syntax for specifying docs persistence. Let's add a config option to models called
Let's definitely only implement this for I think that for the first cut of this, dbt should only update docs when models are rebuilt. While I do think there's merit to updating documentation without re-running models, that's also a lot more involved. Lmk if you buy all of this. I imagine you'll need some more info (like how to get documentation for a relation inside of a materialization), but very happy to help you with that in turn once we're all set on the implementation. |
@drewbanin trying to get up to speed with all the Could you show a dummy model file containing the Or is the idea to use the |
hey @darrenhaken - I'm picturing something like:
Because dbt's configs are hierarchical, you can also configure all of your models to use this setting in your
I absolutely think dbt should use the |
Thanks for all the info! That all makes complete sense and the approach sounds good to me. Can you point me to the files in the codebase that need to be modified to support it? I assume Python? |
@darrenhaken just check out the links to the code above. I don't know that any Python changes will be required here (at least for the first cut). You can find the description for a model in the materialization context with: You can find the value for a config (like
Give the BigQuery links above a once-over and lmk if you have any other questions! |
I’ll try to take a look at this in more detail this week
I'm picking this up now, sorry for the delay. @drewbanin could you keep an eye on this issue as I'd love some feedback on it? |
@drewbanin what's the correct way to set the
Got an example? |
@drewbanin I worked this out, see PR |
@drewbanin @darrenhaken @jakebiesinger Super psyched about this PR: #1285. BigQuery's DDL also supports:
This means these could be inserted via a macro from the schema.yml as well: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language This looks something like:
I think this is on your radar, but we have a very similar use case to @jakebiesinger and having these docs persist through to BQ would be really great. Thanks, |
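BigQuery's DDL accepts an `OPTIONS(...)` clause on `CREATE TABLE`, which can carry a table `description` and `labels`. As a rough Python sketch of what a macro might render from schema.yml values (the helper names `bq_string`, `table_options`, and `create_table_as` are made up for illustration and are not dbt or BigQuery APIs):

```python
from typing import Dict


def bq_string(value: str) -> str:
    """Quote a Python string as a double-quoted BigQuery string literal."""
    escaped = (
        value.replace("\\", "\\\\")  # escape backslashes first
        .replace('"', '\\"')         # then embedded double quotes
        .replace("\n", "\\n")        # keep the literal on one line
    )
    return f'"{escaped}"'


def table_options(description: str, labels: Dict[str, str]) -> str:
    """Render an OPTIONS(...) clause with a description and optional labels."""
    parts = [f"description={bq_string(description)}"]
    if labels:
        pairs = ", ".join(
            f"({bq_string(key)}, {bq_string(val)})"
            for key, val in sorted(labels.items())
        )
        parts.append(f"labels=[{pairs}]")
    return "OPTIONS(" + ", ".join(parts) + ")"


def create_table_as(
    relation: str, select_sql: str, description: str, labels: Dict[str, str]
) -> str:
    """Render CREATE OR REPLACE TABLE ... OPTIONS(...) AS (...)."""
    return (
        f"CREATE OR REPLACE TABLE `{relation}`\n"
        f"{table_options(description, labels)}\n"
        f"AS (\n{select_sql}\n)"
    )


if __name__ == "__main__":
    print(create_table_as("proj.ds.orders", "select 1 as id", "Orders", {"team": "data"}))
```

Column descriptions would need a column list with per-column `OPTIONS(description=...)`, which this sketch leaves out to stay short.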
Great to hear! Yes, I also wanted the labels/tags and column descriptions. If you have time feel free to contribute beyond this work to add them in! If not I’m going to try and get the time over the next month to do it but help would be ideal. |
@gturetsky thanks for the writeup! Yeah - this should be totally possible. Should be straightforward-ish to extend #1285 for this use case :) |
This was (finally) merged in #1532 :D Thanks for your hard work on the original PR @darrenhaken! |
Hurray!
|
Hey, this seems awesome! I'm having some trouble finding any documentation about this, though, and just writing documentation in a YAML file doesn't seem to work. Does anyone have some pointers? 🙂 |
@Jaxing Check out these docs: https://docs.getdbt.com/reference/resource-configs/persist_docs |
Thanks for the quick reply! It worked 🙂 |
Feature
We should propagate docs to BQ table and column descriptions.
Feature description
Cool that dbt generates a web page and docs for your project. Unfortunately, it's a bit removed from where our org is used to looking for documentation. We use BQ extensively and rely on the table and column descriptions provided in the BQ UI.
Who will this benefit?
Anyone using BQ would benefit.