diff --git a/website/docs/docs/guides/migration-guide/upgrading-to-0-18-0.md b/website/docs/docs/guides/migration-guide/upgrading-to-0-18-0.md new file mode 100644 index 00000000000..463ae668104 --- /dev/null +++ b/website/docs/docs/guides/migration-guide/upgrading-to-0-18-0.md @@ -0,0 +1,31 @@ +--- +title: "Upgrading to 0.18.0 (prerelease)" + +--- + +:::info Prerelease + +dbt v0.18.0 is currently in beta. Please post in the dbt Slack #prereleases channel +if you uncover any bugs or issues. + +::: + +dbt v0.18.0 introduces several new features around model selection. + +## Articles: + + - [Changelog](https://github.com/fishtown-analytics/dbt/blob/dev/marian-anderson/CHANGELOG.md) + +## Breaking changes + +Please be aware of the following changes in v0.18.0. While breaking, we do not expect these to affect the majority of projects. + +### Adapter macros + +* Previously, dbt put macros from all installed plugins into the namespace. This version of dbt will not include adapter plugin macros unless they are from the currently-in-use adapter or one of its dependencies. + +## New and changed documentation + +**Core** +- [model selection syntax](model-selection-syntax) +- [`dbt ls`](commands/list) diff --git a/website/docs/reference/commands/list.md b/website/docs/reference/commands/list.md index abf1dc7f397..3bd7eaf61f0 100644 --- a/website/docs/reference/commands/list.md +++ b/website/docs/reference/commands/list.md @@ -11,43 +11,55 @@ The `dbt ls` command lists resources in your dbt project. 
It accepts selector ar ``` dbt ls [--resource-type {source,analysis,model,snapshot,test,seed,default,all}] - [--select SELECTOR [SELECTOR ...]] + [--select SELECTION_ARG [SELECTION_ARG ...]] [--models SELECTOR [SELECTOR ...]] [--exclude SELECTOR [SELECTOR ...]] + [--selector YML_SELECTOR_NAME [YML_SELECTOR_NAME ...]] [--output {json,name,path,selector}] ``` +See [resource selection syntax](model-selection-syntax) for more information on how to select resources in dbt + **Arguments**: - `--resource-type`: This flag limits the "resource types" that dbt will return in the `dbt ls` command. By default, the following resources are included in the results of `dbt ls`: models, snapshots, seeds, tests, and sources. -- `--select`: This flag specifies one or more "selectors" used to filter the nodes returned by the `dbt ls` command. See the docs on the [resource selection syntax](model-selection-syntax) for more information on selecting resources in dbt +- `--select`: This flag specifies one or more selection-type arguments used to filter the nodes returned by the `dbt ls` command - `--models`: Like the `--select` flag, this flag is used to select nodes. It implies `--resource-type=model`, and will only return models in the results of the `dbt ls` command. - `--exclude`: Specify selectors that should be _excluded_ from the list of returned nodes. +- `--selector`: This flag specifies one or more named selectors, defined in a `selectors.yml` file. - `--output`: This flag controls the format of output from the `dbt ls` command. Note that the `dbt ls` command does not include models which are disabled or schema tests which depend on models which are disabled. All returned resources will have a `config.enabled` value of `true`. 
### Example usage -**Listing models by selector** +**Listing models by package** ``` $ dbt ls --models snowplow.* -model.snowplow.snowplow_base_events -model.snowplow.snowplow_base_web_page_context -model.snowplow.snowplow_id_map -model.snowplow.snowplow_page_views -model.snowplow.snowplow_sessions +snowplow.snowplow_base_events +snowplow.snowplow_base_web_page_context +snowplow.snowplow_id_map +snowplow.snowplow_page_views +snowplow.snowplow_sessions ... ``` **Listing tests by tag name** ``` $ dbt ls --select tag:nightly --resource-type test -model.my_project.orders -model.my_project.order_items -model.my_project.products +my_project.schema_test.not_null_orders_order_id +my_project.schema_test.unique_orders_order_id +my_project.schema_test.not_null_products_product_id +my_project.schema_test.unique_products_product_id ... ``` +**Listing schema tests of incremental models** +``` +$ dbt ls --select config.materialized:incremental,test_type:schema +model.my_project.logs_parsed +model.my_project.events_categorized +``` + **Listing JSON output** ``` $ dbt ls --models snowplow.* --output json diff --git a/website/docs/reference/model-selection-syntax.md b/website/docs/reference/model-selection-syntax.md index 21308ca1ecc..d9393de4fa5 100644 --- a/website/docs/reference/model-selection-syntax.md +++ b/website/docs/reference/model-selection-syntax.md @@ -3,65 +3,76 @@ title: "Model selection syntax" id: "model-selection-syntax" --- -## Overview - dbt's model selection syntax makes it possible to run only specific resources in a given invocation of dbt. 
The model selection syntax is used for the following subcommands: -| command | argument(s) | -| :-------- | ----------------------------------- | -| run | `--models`, `--exclude` | -| test | `--models`, `--exclude` | -| seed | `--select`, `--exclude` | -| snapshot | `--select`, `--exclude` | -| ls (list) | `--select`, `--models`, `--exclude` | -| compile | `--select`, `--exclude` | +| command | argument(s) | +| :-------- | ------------------------------------------------- | +| run | `--models`, `--exclude`, `--selector` | +| test | `--models`, `--exclude`, `--selector` | +| seed | `--select`, `--exclude` | +| snapshot | `--select`, `--exclude` | +| ls (list) | `--select`, `--models`, `--exclude`, `--selector` | +| compile | `--select`, `--exclude` | ## Specifying models to run -By default, `dbt run` will execute _all_ of the models in the dependency graph. During development (and deployment), it is useful to specify only a subset of models to run. Use the `--models` flag with `dbt run` to select a subset of models to run. Note that the following arguments (`--models` and `--exclude`) also apply to `dbt test`! +By default, `dbt run` will execute _all_ of the models in the dependency graph. During development (and deployment), it is useful to specify only a subset of models to run. Use the `--models` flag with `dbt run` to select a subset of models to run. Note that the following arguments (`--models`, `--exclude`, and `--selector`) also apply to `dbt test`! The `--models` flag accepts one or more arguments. Each argument can be one of: 1. a package name 2. a model name 3. a fully-qualified path to a directory of models -4. a tag selector -5. a source selector -6. a path selector +4. 
a selector method (`path:`, `tag:`, `config:`, `test_type:`, `test_name:`) Examples: ```bash -dbt run --models my_dbt_project_name # runs all models in your project -dbt run --models my_dbt_model # runs a specific model -dbt run --models path.to.my.models # runs all models in a specific directory -dbt run --models my_package.some_model # run a specific model in a specific package -dbt run --models tag:nightly # run models with the "nightly" tag -dbt run --models path/to/models # run models contained in path/to/models -dbt run --models path/to/my_model.sql # run a specific model by its path +$ dbt run --models my_dbt_project_name # runs all models in your project +$ dbt run --models my_dbt_model # runs a specific model +$ dbt run --models path.to.my.models # runs all models in a specific directory +$ dbt run --models my_package.some_model # run a specific model in a specific package +$ dbt run --models tag:nightly # run models with the "nightly" tag +$ dbt run --models path/to/models # run models contained in path/to/models +$ dbt run --models path/to/my_model.sql # run a specific model by its path +$ dbt run --models # multiple arguments can be provided to --models -dbt run --models my_first_model my_second_model +$ dbt run --models my_first_model my_second_model # these arguments can be projects, models, directory paths, tags, or sources -dbt run --models tag:nightly my_model finance.base.* +$ dbt run --models tag:nightly my_model finance.base.* + +# use methods and intersections for more complex selectors +$ dbt run --models path:marts/finance,tag:nightly,config.materialized:table ``` -## Model selection shorthand +## Model selection The flags `--models`, `--model`, and `-m` are all equivalent ways to select models in `dbt run` and `dbt test` invocations. +Tests are associated with models; it is possible to select them based on properties -## Model Selectors -dbt supports a shorthand language for selecting models to run. 
This language uses the characters `+`, `@`, and `*`. +## Operators +dbt supports a shorthand language for selecting nodes to run. This language uses the characters `+`, `@`, and `*`. ### The "plus" operator If placed at the front of the model selector, `+` will select all parents of the selected model. If placed at the end of the string, `+` will select all children of the selected model. ```bash -dbt run --models my_model+ # select my_model and all children -dbt run --models +my_model # select my_model and all parents -dbt run --models +my_model+ # select my_model, and all of its parents and children +$ dbt run --models my_model+ # select my_model and all children +$ dbt run --models +my_model # select my_model and all parents +$ dbt run --models +my_model+ # select my_model, and all of its parents and children +``` + +### The ["n-plus"](https://nplusonemag.com/) operator +You can adjust the behavior of the `+` operator by quantifying the number of edges +to step through. + +```bash +$ dbt run --models my_model+1 # select my_model and its first-degree children +$ dbt run --models 2+my_model # select my_model, its first-degree parents, and its second-degree parents ("grandparents") +$ dbt run --models 3+my_model+4 # select my_model, its parents up to the 3rd degree, and its children down to the 4th degree ``` ### The "at" operator @@ -73,30 +84,85 @@ The `@` operator is similar to `+`, but will also include _the parents of the ch The `*` operator matches all models within a package or directory. ```bash -dbt run --models snowplow.* # run all of the models in the snowplow package -dbt run --models finance.base.* # run all of the models in models/finance/base +$ dbt run --models snowplow.* # run all of the models in the snowplow package +$ dbt run --models finance.base.* # run all of the models in models/finance/base ``` -### The "tag:" operator -The `tag:` prefix is used to select models that match a specified [tag](tags) . 
+## Set Operators + +### Unions +Providing multiple space-delimited arguments to the `--models`, `--exclude`, or `--selector` flags selects +the union of them all. If a resource is included in at least one selector, it will be +included in the final set. +### Intersections +New in v0.18.0 +If multiple arguments to `--models`, `--exclude`, or `--select` are comma-separated (with no whitespace in between), +dbt will select only resources which satisfy _all_ arguments. + +Run all the common ancestors of snowplow_sessions and fct_orders: +```bash +$ dbt run --models +snowplow_sessions,+fct_orders +``` + +Run all the common descendants of stg_invoices and stg_accounts: +```bash +$ dbt run --models stg_invoices+,stg_accounts+ ``` -dbt run --models tag:nightly # run all models with the `nightly` tag + +Run models that are in the marts/finance subdirectory *and* tagged nightly: +```bash +$ dbt run --models marts.finance,tag:nightly ``` -### The "source:" operator -The `source:` prefix is used to select models that select from a specified [source](using-sources). Use in conjunction with the `+` operator. +### Excluding models +dbt provides an `--exclude` flag with the same semantics as `--models`. Models specified with the `--exclude` flag will be removed from the set of models selected with `--models`. +```bash +$ dbt run --models my_package.*+ --exclude my_package.a_big_model+ ``` -dbt run --models source:snowplow+ # run all models that select from Snowplow sources + +Exclude a specific resource by its name or lineage: + +```bash +# test +$ dbt test --exclude not_null_orders_order_id +$ dbt test --exclude orders + +# seed +$ dbt seed --exclude account_parent_mappings + +# snapshot +$ dbt snapshot --exclude snap_order_statuses +$ dbt test --exclude orders+ ``` -### The "path:" operator -The `path:` prefix is used to select models located at or under a specific path. -While the `path:` prefix is not explicitly required, it may be used to make -selectors unambiguous. 
+## Methods + +Selector methods return all resources that share a common property, using the +syntax `method:value`. + +### The "tag" method +The `tag` method is used to select models that match a specified [tag](tags) . + +```bash +$ dbt run --models tag:nightly # run all models with the `nightly` tag +``` + +### The "source" method +The `source` method is used to select models that select from a specified [source](using-sources). Use in conjunction with the `+` operator. + +```bash +$ dbt run --models source:snowplow+ # run all models that select from Snowplow sources ``` + +### The "path" method +The `path` method is used to select models located at or under a specific path. +While the `path` prefix is not explicitly required, it may be used to make +selectors unambiguous. + +```bash # These two selectors are equivalent dbt run --models path:models/staging/github dbt run --models models/staging/github @@ -106,45 +172,253 @@ dbt run --models path:models/staging/github/stg_issues.sql dbt run --models models/staging/github/stg_issues.sql ``` +### The "package" method +New in v0.18.0 +The `package` method is used to select models defined within the root project +or an installed dbt package. While the `package:` prefix is not explicitly required, it may be used to make +selectors unambiguous. + +```bash +# These three selectors are equivalent +dbt run --models package:snowplow +dbt run --models snowplow +dbt run --models snowplow.* +``` + +### The "config" method +New in v0.18.0 +The `config` method is used to select models that match a specified [node config](config). 
+ +```bash +$ dbt run --models config.materialized:incremental # run all models that are materialized incrementally +$ dbt run --models config.schema:audit # run all models that are created in the `audit` schema +$ dbt run --models config.cluster_by:geo_country # run all models clustered by `geo_country` +``` + +### The "test_type" method +New in v0.18.0 +The `test_type` method is used to select tests based on their type, `schema` or `data`: -### Putting it all together ```bash +$ dbt test --models test_type:schema # run all schema tests +$ dbt test --models test_type:data # run all data tests +``` -dbt run --models my_package.*+ # select all models in my_package and their children -dbt run --models +some_model+ # select some_model and all parents and children +### The "test_name" method +New in v0.18.0 +The `test_name` method is used to select schema tests based on the name of the `test_` macro +that defines it. For more information about how schema tests are defined, read about +[custom schema tests](custom-schema-tests). 
+ +```bash +$ dbt test --models test_name:unique # run all instances of the `unique` test +$ dbt test --models test_name:equality # run all instances of the `dbt_utils.equality` test +$ dbt test --models test_name:range_min_max # run all instances of a custom schema test defined in the local project, `range_min_max` +``` -dbt run --models tag:nightly+ # select "nightly" models and all children -dbt run --models +tag:nightly+ # select "nightly" models and all parents and children -dbt run --models @source:snowplow # build all models that select from snowplow sources, plus their parents +## Putting it all together +```bash +$ dbt run --models my_package.*+ # select all models in my_package and their children +$ dbt run --models +some_model+ # select some_model and all parents and children + +$ dbt run --models tag:nightly+ # select "nightly" models and all children +$ dbt run --models +tag:nightly+ # select "nightly" models and all parents and children + +$ dbt run --models @source:snowplow # build all models that select from snowplow sources, plus their parents + +$ dbt test --models config.incremental_strategy:insert_overwrite,test_name:unique # execute all `unique` tests that select from models using the `insert_overwrite` incremental strategy ``` -## Excluding models -dbt provides an `--exclude` flag with the same semantics as `--models`. Models specified with the `--exclude` flag will be removed from the set of models selected with `--models` +This can get complex! Let's say I want a nightly run of models that build off snowplow data +and feed exports, while _excluding_ the biggest incremental models (and one other model, to boot). 
```bash -dbt run --models my_package.*+ --exclude my_package.a_big_model+ +$ dbt run --models @source:snowplow,tag:nightly models/export --exclude package:snowplow,config.materialized:incremental export_performance_timing +``` + +This command selects all models that: +* Select from snowplow sources, plus their parents, _and_ are tagged "nightly" +* Are defined in the `export` model subfolder + +Except for models that are: +* Defined in the snowplow package and materialized incrementally +* Named `export_performance_timing` + + +## Selectors +New in v0.18.0 + +Write model selectors in YML, save them with a human-friendly name, and reference them using the `--selector` flag. +Recording selectors in a top-level `selectors.yml` file offers: + +* **Legibility:** complex selection criteria are composed of dictionaries and arrays +* **Version control:** selector definitions are stored in the same git repository as the dbt project +* **Reusability:** selectors can be referenced in multiple job definitions, and their definitions are extensible (via YML anchors) + +Selectors each have a `name` and a `definition`. Each `definition` is composed of +one or more arguments, each of which can be one of the following: +* **CLI-style:** strings, representing CLI-style arguments +* **Key-value:** pairs in the form `method: value` +* **Dictionaries:** `method`, `value`, operator-equivalent keywords, and support for `exclude` + +Use `union` and `intersection` to organize multiple arguments. + +#### CLI-style +```yml +definition: + 'tag:nightly' +``` + +This simple syntax supports use of the `+`, `@`, and `*` operators. It does +not support `exclude`. + +#### Key-value +```yml +definition: + tag: nightly +``` + +This simple syntax does not support any operators or `exclude`. 
+ +#### Dictionaries +```yml +definition: + method: tag + value: nightly +``` + +Optional keywords map to the `+` and `@` operators: +```yml + children: true | false + parents: true | false + + children_depth: 1 # if children: true, degrees to include + parents_depth: 1 # if parents: true, degrees to include + + childrens_parents: true | false # @ operator +``` + +The `*` operator to select all nodes can be written as: +```yml +definition: + method: fqn + value: "*" +``` + +The `exclude` keyword may be passed as an argument to each dictionary, or as +an item in a `union`. The following are equivalent: + +```yml +- method: tag + value: nightly + exclude: + - "@tag:daily" +``` + +```yml +- union: + - method: tag + value: nightly + - exclude: + - method: tag + value: daily +``` + +Here is the same example from above, written two different ways: + + + + + + +```yml +selectors: + - name: nightly_diet_snowplow + definition: + union: + - intersection: + - '@source:snowplow' + - 'tag:nightly' + - 'models/export' + - exclude: + - intersection: + - 'package:snowplow' + - 'config.materialized:incremental' + - export_performance_timing +``` + + + + + + +```yml +selectors: + - name: nightly_diet_snowplow + definition: + union: + - intersection: + - method: source + value: snowplow + childrens_parents: true + - method: tag + value: nightly + - method: path + value: models/export + - exclude: + - intersection: + - method: package + value: snowplow + - method: config.materialized + value: incremental + - method: fqn + value: export_performance_timing ``` + + + + +Then in our job definition: +```bash +$ dbt run --selector nightly_diet_snowplow +``` ## Test selection examples -The test selection syntax grew out of the model selection syntax. 
As such, the syntax will look familiar if you wish to: * run tests on a particular model * run tests on models in a sub directory * run tests on all models upstream / downstream of a model, etc. -However, things start to get a little unfamiliar when you want to test things other than models, so we've included lots of examples below. In the future, we plan to make this syntax more intuitive. +Tests have their own properties _and_ inherit the properties of the nodes they select from. This means you can: +* select tests based on the file path of the models being tested, rather than the file paths of the `.yml` files that configure the tests +* use selector methods that check config properties of the resources being tested + +Things start to get a little unfamiliar when you want to test things other than models, so we've included lots of examples below. In the future, we plan to make this syntax more intuitive. ### Run schema tests only ```shell -$ dbt test --schema +$ dbt test --models test_type:schema + +# before v0.18.0: +$ dbt test --schema # technically this runs all schema tests, tests tagged 'schema', and tests of models tagged 'schema' ``` ### Run data tests only ```shell -$ dbt test --data +$ dbt test --models test_type:data + +# before v0.18.0: +$ dbt test --data # technically this runs all data tests, tests tagged 'data', and tests of models tagged 'data' ``` ### Run tests on a particular model @@ -179,6 +453,9 @@ $ dbt tests --models +stg_customers # Run tests on all models with a particular tag $ dbt test --models tag:my_model_tag +# Run tests on all models with a particular materialization +$ dbt test --models config.materialized:table + ``` ### Run tests on all sources @@ -269,9 +546,3 @@ models: ```shell $ dbt test --models tag:my_test_tag ``` - - diff --git a/website/sidebars.js b/website/sidebars.js index 6b94e87008a..345827eba20 100755 --- a/website/sidebars.js +++ b/website/sidebars.js @@ -75,6 +75,7 @@ module.exports = { 
"docs/guides/migration-guide/upgrading-to-0-15-0", "docs/guides/migration-guide/upgrading-to-0-16-0", "docs/guides/migration-guide/upgrading-to-0-17-0", + "docs/guides/migration-guide/upgrading-to-0-18-0", ], }, "docs/guides/videos",