[CT-1584] [Feature] New top level commands: interactive compile #6358

ChenyuLInx · 2022-12-02T01:34:08Z

Describe the feature (updated Feb 9 after chatting with @aranke)

In dbt-rpc and dbt-server, we support interactive compile by using some kind of manifest object (an in-memory object for dbt-rpc, manifest.msgpack for dbt-server) that is regenerated whenever a dbt project file is modified.
Here are the links to how this is done in dbt-rpc and dbt-server.

For input options

One of the goals of creating such a command is to get rid of the custom code we maintain in lib.py and dbt-server, and to have dbt-server go through a proper interface in dbt-core for this functionality. There are two main options we want to support:

--allow-introspection/--no-allow-introspection (name TBD)
- see this comment for the difference between the two
compile one existing model vs. provide an --in-line (name TBD) option that lets the user compile code that isn't attached to a specific model
- the inline code supplied by the user doesn't belong to a specific node in the current project, so we will have to create a temp node for it, resolve the ref/source/config, then do the compilation (link)
- for inline code, things like {{ this }} will probably be wrong
- for an existing model we don't need to create a separate temp node

For output

All commands should log to stdout and return the compiled code via the result object.
When compiling a single model, we should also update the file in the target dir.

Other non-functional requirements

We should have functional tests that cover the input options.
We should remove the existing code linked above that currently implements this functionality, or create follow-up tickets for removing it if removing it directly would cause issues downstream.

jtcohen6 · 2022-12-02T10:39:05Z

a compile by any other name

What do we think of creating this "interactive compile" as a modification / extension of the existing compile command?

$ dbt compile --code "select * from {{ ref('my_model') }}" --no-allow-introspection

Should the argument to --code be base64 encoded (same as dbt-rpc), to support a full range of characters?
How should dbt-core "return" the compiled code? My instinct here is, a log event that shows up in stdout (for CLI users) and can be programmatically parsed (for dbt-server). But we could find a way to do what dbt-server expects right now, which is returning the RemoteCompileResult as a Python object.
We'll want a flag/argument for allowing/disallowing introspective queries, to achieve this conditional behavior
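
To picture the "log event" option, here is a minimal Python sketch. The event name `compiled_sql_event` and its fields are made up for illustration and are not dbt-core's actual event schema. The idea is that the same JSON line a CLI user sees on stdout can be parsed back by a programmatic caller such as dbt-server.

```python
import io
import json
import sys


def emit_compiled_event(compiled_code, stream=None):
    """Write one JSON log line carrying the compiled SQL (hypothetical event shape)."""
    stream = stream if stream is not None else sys.stdout
    event = {"name": "compiled_sql_event", "data": {"compiled": compiled_code}}
    stream.write(json.dumps(event) + "\n")


def parse_compiled_events(log_text):
    """Recover compiled SQL from log output, skipping lines that are not JSON events."""
    results = []
    for line in log_text.splitlines():
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # plain-text log line, not a structured event
        if isinstance(event, dict) and event.get("name") == "compiled_sql_event":
            results.append(event["data"]["compiled"])
    return results


# Simulate a run: one ordinary log line, then the structured event.
buffer = io.StringIO()
buffer.write("Running with dbt\n")
emit_compiled_event("select * from analytics.my_model", stream=buffer)
compiled = parse_compiled_events(buffer.getvalue())
```

Returning a Python object directly (as dbt-server expects today) would avoid the parsing step, at the cost of a tighter coupling between caller and dbt-core internals.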

There are a few important ways in which this would be different from the dbt compile command that exists today.
If those differences feel too big/confusing, let's create a totally new command instead (e.g. dbt compile_code).

Differences:

dbt compile --code is mutually exclusive with selection criteria (dbt compile --select/--exclude/--selector). If the --code argument is provided, dbt should skip over node selection entirely.
If dbt compile receives the --code argument, it should not populate the adapter cache. This is a behavior of standard dbt compile, but we don't want the added overhead for interactively compiling exactly this query. (Should this be another optional flag? We used to have a --bypass-cache/--no-use-cache flag before v1.0; we could add it back in.)
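
The mutual-exclusion rule could be checked up front, before any node selection happens. The sketch below is only an illustration: `validate_compile_args` is a hypothetical helper, not a dbt-core function, and the parameter names simply mirror the flags discussed here.

```python
def validate_compile_args(code=None, select=None, exclude=None, selector=None):
    """Return True when node selection should be skipped (inline code was given).

    Hypothetical helper: raises if --code is combined with any selection flag.
    """
    selection_used = any(arg is not None for arg in (select, exclude, selector))
    if code is not None and selection_used:
        raise ValueError(
            "--code cannot be combined with --select/--exclude/--selector"
        )
    return code is not None


skip_selection = validate_compile_args(code="select 1")        # inline: skip selection
use_selection = validate_compile_args(select=["my_model"])     # normal compile
```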

Similarities:

Under the hood, the GenericSqlRunner we're using for this today inherits from CompileRunner and operates mostly the same. The distinctions are around exception handling (seems fine, we could support this), and no-op'ing ephemeral models (? not sure when/how this would come up).
I think dbt compile --code would still want the ability to use --defer, though! Imagine that you've built some models in your dev schema, the rest are in prod, and you want to interactively compile a query that touches both:

select * from {{ ref('changed_model_in_dev') }}
union all
select * from {{ ref('unchanged_model_in_prod') }}

$ dbt compile --defer --state path/to/prod/artifacts/ --code "c2VsZWN0ICogZnJvbSB7eyByZWYoJ2NoYW5nZWRfbW9kZWxfaW5fZGV2JykgfX0KdW5pb24gYWxsCnNlbGVjdCAqIGZyb20ge3sgcmVmKCd1bmNoYW5nZWRfbW9kZWxfaW5fcHJvZCcpIH19"

select * from dbt_jcohen.changed_model_in_dev
union all
select * from analytics.unchanged_model_in_prod
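
For reference, a base64 argument like the one passed to --code above can be produced with the Python standard library. This is a small sketch of the encoding round trip only; how dbt would decode the argument is not shown here.

```python
import base64

# The Jinja-SQL query from the example above, with real newlines.
query = (
    "select * from {{ ref('changed_model_in_dev') }}\n"
    "union all\n"
    "select * from {{ ref('unchanged_model_in_prod') }}"
)

# Encode to an ASCII-safe string suitable for a single CLI argument.
encoded = base64.b64encode(query.encode("utf-8")).decode("ascii")

# Decoding restores the original text, including quotes, braces, and newlines.
decoded = base64.b64decode(encoded).decode("utf-8")
```

Encoding sidesteps shell-quoting problems with the quotes, braces, and newlines in the query.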

re: manifests

IMO, we don't need to support a CLI --manifest flag, as a way to skip over parsing. For local runs, this is already the implicit behavior achieved via partial parsing (using target/partial_parse.msgpack). We should support passing in an existing manifest only when accessing the top-level commands programmatically, via dbt-core's Python API. And, in that sense, we should support it in the exact same way for every single command that uses a manifest.

How do we imagine updating that manifest.msgpack when something changes, but before the user runs another command that would regenerate the file? Should we just load the outdated one? Or should we try something like partial parsing to check the project and determine whether we need a full parse?

IMO this is not in scope for dbt-core. We can operate under the assumption that, if the user is modifying the manifest at the same time they're running commands, it's their responsibility to manage state for those manifests, and pass in the appropriate one to the appropriate command. In that case, the role of dbt-core is to support that concurrency by not mucking with globals, and ensuring that Manifest A being used by Command X won't be incidentally mutated by Command Y which wants to use Manifest B.
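
A rough sketch of what "not mucking with globals" could look like. Plain dicts stand in for dbt's Manifest, `compile_inline` is a hypothetical function, and copying the manifest is just one possible way to keep the temp node private. The point is that each command works only on the manifest it was handed.

```python
import copy


def compile_inline(manifest, sql):
    """Compile against the manifest passed in; no module-level state is touched."""
    working = copy.deepcopy(manifest)  # the temp node lives only in this private copy
    working["nodes"]["inline_query"] = {"raw_code": sql}
    return working


manifest_a = {"nodes": {"model_a": {"raw_code": "select 1"}}}
manifest_b = {"nodes": {"model_b": {"raw_code": "select 2"}}}

result_x = compile_inline(manifest_a, "select * from model_a")  # Command X
result_y = compile_inline(manifest_b, "select * from model_b")  # Command Y
```

After both calls, `manifest_a` and `manifest_b` are unchanged, and neither result contains nodes from the other manifest.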

last but not least

As part of this work, we should also deprecate the existing "sql tasks": https://github.com/dbt-labs/dbt-core/blob/main/core/dbt/task/sql.py

jtcohen6 · 2022-12-05T09:47:29Z

Another consideration, not mentioned above: We should support interactive compilation of a specific model, within the context of that model, while still being able to skip over cache population. That will enable us to avoid this issue: dbt-labs/dbt-rpc#46

Think something like:

$ dbt compile --select specific_model --no-populate-cache --allow-introspection

That model's compiled SQL should then be included in the logs, and possibly also "returned" from the method (?) if called directly / programmatically.

ChenyuLInx · 2023-02-09T07:15:43Z

For --allow-introspection vs. --no-allow-introspection (name TBD), here's how the different runners implement it: link for the runner that doesn't connect to the warehouse, link for the runner that can connect to the warehouse, and link for how we switch between the two today.
The only difference seems to be removing the line

with self.adapter.connection_for(self.node):
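
If that one line really is the only difference, a single runner could make the connection conditional instead of keeping two implementations. The sketch below assumes exactly that. `FakeAdapter` and `compile_node` are hypothetical stand-ins, and only the `connection_for` name is taken from the line above.

```python
from contextlib import contextmanager, nullcontext


class FakeAdapter:
    """Hypothetical stand-in for a dbt adapter; it only records connection use."""

    def __init__(self):
        self.opened = []

    @contextmanager
    def connection_for(self, node):
        self.opened.append(node)  # pretend to open a warehouse connection
        yield


def compile_node(adapter, node, allow_introspection):
    # Open a connection only when introspective queries are allowed;
    # otherwise use a no-op context manager.
    ctx = adapter.connection_for(node) if allow_introspection else nullcontext()
    with ctx:
        return f"compiled:{node}"


adapter = FakeAdapter()
compile_node(adapter, "my_model", allow_introspection=True)
compile_node(adapter, "inline_query", allow_introspection=False)
```

Only the first call touches the adapter, so `adapter.opened` ends up containing just `my_model`.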

ChenyuLInx added the enhancement and triage labels Dec 2, 2022

github-actions bot changed the title ~~[Feature] New top level commands: interactive compile~~ [CT-1584] [Feature] New top level commands: interactive compile Dec 2, 2022

This was referenced Dec 2, 2022

[CT-1581] [Epic] dbt-core as a library: first steps #6356

Closed

[CT-1585] [Feature] New top level commands: interactive preview #6359

Closed

dbeatty10 added the Refinement label and removed the triage label Dec 2, 2022

jtcohen6 added the python_api, cli, and Team:Execution labels and removed the Refinement label Dec 2, 2022

jtcohen6 mentioned this issue Jan 5, 2023

[CT-1751] Config to optionally skip population of relation cache #6526

Closed

leahwicz assigned aranke Jan 31, 2023

This was referenced Feb 19, 2023

[CT-1584] New top level commands: interactive compile #7008

Merged

Allow for local compile of incremental models. duneanalytics/spellbook#2403

Merged

aranke closed this as completed in #7008 Mar 11, 2023

This was referenced Apr 11, 2023

compile inline query doesn't add node #7325

Closed

compile inline query doesn't add node #7292

Closed

compile inline query doesn't add node #7326

Merged
