---
subcategory: "Compute"
---
Use `databricks_pipeline` to deploy Delta Live Tables.
```hcl
resource "databricks_notebook" "dlt_demo" {
  # ...
}

resource "databricks_pipeline" "this" {
  name    = "Pipeline Name"
  storage = "/test/first-pipeline"
  configuration = {
    key1 = "value1"
    key2 = "value2"
  }

  cluster {
    label       = "default"
    num_workers = 2
    custom_tags = {
      cluster_type = "default"
    }
  }

  cluster {
    label       = "maintenance"
    num_workers = 1
    custom_tags = {
      cluster_type = "maintenance"
    }
  }

  library {
    notebook {
      path = databricks_notebook.dlt_demo.id
    }
  }

  continuous = false
}
```
The following arguments are supported:

- `name` - A user-friendly name for this pipeline. The name can be used to identify pipeline jobs in the UI.
- `storage` - A location on DBFS or cloud storage where output data and metadata required for pipeline execution are stored. By default, tables are stored in a subdirectory of this location. Changing this parameter forces recreation of the pipeline.
- `configuration` - An optional list of values to apply to the entire pipeline. Elements must be formatted as key:value pairs.
- `library` blocks - Specifies pipeline code and required artifacts. The syntax resembles the library configuration block, with the addition of a special `notebook` type of library that should have the `path` attribute. Right now only the `notebook` type is supported.
- `cluster` blocks - Clusters to run the pipeline. If none is specified, pipelines will automatically select a default cluster configuration for the pipeline. Please note that DLT pipeline clusters support only a subset of attributes, as described in the documentation. Also note that the `autoscale` block is extended with the `mode` parameter, which controls the autoscaling algorithm (possible values are `ENHANCED` for the new, enhanced autoscaling algorithm, or `LEGACY` for the old algorithm).
- `continuous` - A flag indicating whether to run the pipeline continuously. The default value is `false`.
- `development` - A flag indicating whether to run the pipeline in development mode. The default value is `true`.
- `photon` - A flag indicating whether to use the Photon engine. The default value is `false`.
- `target` - The name of a database for persisting pipeline output data. Configuring the target setting allows you to view and query the pipeline output data from the Databricks UI.
- `edition` - Optional name of the product edition. Supported values are: `core`, `pro`, `advanced` (default).
- `channel` - Optional name of the release channel for the Spark version used by the DLT pipeline. Supported values are: `current` (default) and `preview`.
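For instance, the `autoscale` block with its `mode` parameter can be combined with the optional `edition`, `channel`, `photon`, and `target` arguments. The sketch below is illustrative: the resource name, storage path, worker counts, and target database are placeholders, and it reuses the `databricks_notebook.dlt_demo` resource from the example above.

```hcl
resource "databricks_pipeline" "autoscaled" {
  name    = "Autoscaled Pipeline"      # illustrative name
  storage = "/test/autoscaled-pipeline"
  edition = "advanced"                 # default product edition
  channel = "current"                  # default release channel
  photon  = false
  target  = "pipeline_output"          # illustrative database name

  cluster {
    label = "default"
    autoscale {
      min_workers = 1
      max_workers = 4
      mode        = "ENHANCED"         # enhanced autoscaling algorithm
    }
  }

  library {
    notebook {
      path = databricks_notebook.dlt_demo.id
    }
  }
}
```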
The resource pipeline can be imported using the id of the pipeline:

```bash
$ terraform import databricks_pipeline.this <pipeline-id>
```
The following resources are often used in the same context:
- End to end workspace management guide.
- `databricks_cluster` to create Databricks Clusters.
- `databricks_job` to manage Databricks Jobs to run non-interactive code in a `databricks_cluster`.
- `databricks_notebook` to manage Databricks Notebooks.