diff --git a/docs/technical/general_usage.md b/docs/technical/general_usage.md index 336c892c66..2df903b062 100644 --- a/docs/technical/general_usage.md +++ b/docs/technical/general_usage.md @@ -47,7 +47,7 @@ Pipeline can be explicitly created and configured via `dlt.pipeline()` that retu 4. dataset_name - name of the dataset where the data goes (see later the default names) 5. import_schema_path - default is None 6. export_schema_path - default is None -7. full_refresh - if set to True the pipeline working dir will be erased and the dataset name will get the unique suffix (current timestamp). ie the `my_data` becomes `my_data_20221107164856`. +7. dev_mode - if set to True the pipeline working dir will be erased and the dataset name will get a unique suffix (current timestamp), i.e. `my_data` becomes `my_data_20221107164856`. > **Achtung** as per `secrets_and_config.md` the arguments passed to `dlt.pipeline` are configurable and if skipped will be injected by the config providers. **the values provided explicitly in the code have a full precedence over all config providers** @@ -101,7 +101,7 @@ In case **there are more schemas in the pipeline**, the data will be loaded into 1. `spotify` tables and `labels` will load into `spotify_data_1` 2. `mel` resource will load into `spotify_data_1_echonest` -The `full_refresh` option: dataset name receives a prefix with the current timestamp: ie the `my_data` becomes `my_data_20221107164856`. This allows a non destructive full refresh. Nothing is being deleted/dropped from the destination. +The `dev_mode` option: the dataset name receives a suffix with the current timestamp, i.e. `my_data` becomes `my_data_20221107164856`. This allows a non-destructive full refresh. Nothing is being deleted/dropped from the destination. ## pipeline working directory and state Another fundamental concept is the pipeline working directory. 
This directory keeps the following information: @@ -117,7 +117,7 @@ The `restore_from_destination` argument to `dlt.pipeline` let's the user restore The state is being stored in the destination together with other data. So only when all pipeline stages are completed the state is available for restoration. -The pipeline cannot be restored if `full_refresh` flag is set. +The pipeline cannot be restored if the `dev_mode` flag is set. The other way to trigger full refresh is to drop destination dataset. `dlt` detects that and resets the pipeline local working folder. @@ -155,8 +155,8 @@ The default json normalizer will convert json documents into tables. All the key ❗ [more here](working_with_schemas.md) -### Full refresh mode -If `full_refresh` flag is passed to `dlt.pipeline` then +### Dev mode +If the `dev_mode` flag is passed to `dlt.pipeline` then 1. the pipeline working dir is fully wiped out (state, schemas, temp files) 2. dataset name receives a prefix with the current timestamp: ie the `my_data` becomes `my_data_20221107164856`. 3. pipeline will not be restored from the destination diff --git a/docs/website/docs/dlt-ecosystem/verified-sources/sql_database.md b/docs/website/docs/dlt-ecosystem/verified-sources/sql_database.md index eeb717515a..c89a63a524 100644 --- a/docs/website/docs/dlt-ecosystem/verified-sources/sql_database.md +++ b/docs/website/docs/dlt-ecosystem/verified-sources/sql_database.md @@ -652,6 +652,6 @@ resource. Below we show you an example on how to pseudonymize the data before it print(info) ``` -1. Remember to keep the pipeline name and destination dataset name consistent. The pipeline name is crucial for retrieving the [state](https://dlthub.com/docs/general-usage/state) from the last run, which is essential for incremental loading. 
Altering these names could initiate a "[full_refresh](https://dlthub.com/docs/general-usage/pipeline#do-experiments-with-full-refresh)", interfering with the metadata tracking necessary for [incremental loads](https://dlthub.com/docs/general-usage/incremental-loading). +1. Remember to keep the pipeline name and destination dataset name consistent. The pipeline name is crucial for retrieving the [state](https://dlthub.com/docs/general-usage/state) from the last run, which is essential for incremental loading. Altering these names could initiate a "[dev_mode](https://dlthub.com/docs/general-usage/pipeline#do-experiments-with-dev-mode)", interfering with the metadata tracking necessary for [incremental loads](https://dlthub.com/docs/general-usage/incremental-loading). diff --git a/docs/website/docs/dlt-ecosystem/verified-sources/stripe.md b/docs/website/docs/dlt-ecosystem/verified-sources/stripe.md index 8c39a5090e..fdbefeddf1 100644 --- a/docs/website/docs/dlt-ecosystem/verified-sources/stripe.md +++ b/docs/website/docs/dlt-ecosystem/verified-sources/stripe.md @@ -232,6 +232,6 @@ verified source. load_info = pipeline.run(data=[source_single, source_incremental]) print(load_info) ``` - > To load data, maintain the pipeline name and destination dataset name. The pipeline name is vital for accessing the last run's [state](../../general-usage/state), which determines the incremental data load's end date. Altering these names can trigger a [“full_refresh”](../../general-usage/pipeline#do-experiments-with-full-refresh), disrupting the metadata (state) tracking for [incremental data loading](../../general-usage/incremental-loading). + > To load data, maintain the pipeline name and destination dataset name. The pipeline name is vital for accessing the last run's [state](../../general-usage/state), which determines the incremental data load's end date. 
Altering these names can trigger a [“dev_mode”](../../general-usage/pipeline#do-experiments-with-dev-mode), disrupting the metadata (state) tracking for [incremental data loading](../../general-usage/incremental-loading). diff --git a/docs/website/docs/dlt-ecosystem/verified-sources/workable.md b/docs/website/docs/dlt-ecosystem/verified-sources/workable.md index 472f48a28f..9229ddca7e 100644 --- a/docs/website/docs/dlt-ecosystem/verified-sources/workable.md +++ b/docs/website/docs/dlt-ecosystem/verified-sources/workable.md @@ -272,7 +272,7 @@ To create your data pipeline using single loading and destination dataset names. The pipeline name helps retrieve the [state](https://dlthub.com/docs/general-usage/state) of the last run, essential for incremental data loading. Changing these names might trigger a - [“full_refresh”](https://dlthub.com/docs/general-usage/pipeline#do-experiments-with-full-refresh), + [“dev_mode”](https://dlthub.com/docs/general-usage/pipeline#do-experiments-with-dev-mode), disrupting metadata tracking for [incremental data loading](https://dlthub.com/docs/general-usage/incremental-loading).
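Note (not part of the patch): the renamed `dev_mode` flag timestamps the dataset name so each dev run writes to a fresh dataset instead of overwriting the previous one. A minimal sketch of how that suffix is formed, assuming the `%Y%m%d%H%M%S` format implied by the `my_data_20221107164856` example in the hunks above (`dev_mode_dataset_name` is a hypothetical helper for illustration, not a dlt API):

```python
from datetime import datetime


def dev_mode_dataset_name(base_name: str, now: datetime) -> str:
    # Hypothetical helper: append the run timestamp so the refresh is
    # non-destructive -- the previous dataset is left untouched.
    return f"{base_name}_{now.strftime('%Y%m%d%H%M%S')}"


# Reproduces the example used throughout the docs above:
print(dev_mode_dataset_name("my_data", datetime(2022, 11, 7, 16, 48, 56)))
# → my_data_20221107164856
```

This also shows why altering the pipeline or dataset name between runs breaks incremental loading: the state from the last run is looked up under the old names and is never found.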