Feature description
We request a feature toggle in the dlt (data load tool) library that lets users prevent the deletion of files once they have been loaded into the final data warehouse. This would create a log of all loaded files and make testing easier.
Are you a dlt user?
Yes, I'm already a dlt user.
Use case
Our primary design goal for most of our data ingestion requirements includes the following stages:
Extract (dlt): Extract data from third-party sources and store it in blob storage in various unstructured formats (JSON, Parquet, CSV).
Load (dlt): Load the extracted data from blob storage into the input schema of our data warehouse. Move unstructured data to an archive path to prevent vendor lock-in and support full loads.
Transform (dbt): Apply tests and business transformations to the tables created by dlt using dbt.
Currently, the staging storage that dlt uses during loading is not a true log or archive: its files are deleted after the load operation. I would like to propose a feature toggle that lets users disable the deletion of files that have been loaded into the final data warehouse.
Proposed solution
Any classes implementing
    class SupportsStagingDestination:
        """Adds capability to support a staging destination for the load"""

        def should_load_data_to_staging_dataset_on_staging_destination(
            self, table: TTableSchema
        ) -> bool:
            return False

        def should_truncate_table_before_load_on_staging_destination(self, table: TTableSchema) -> bool:
            # the default is to truncate the tables on the staging destination...
            return True
should have an option to override the default behavior of should_truncate_table_before_load_on_staging_destination (the key is yet to be determined). The option should of course default to False, meaning the current truncation behavior is kept, so that existing clients do not break.
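To make the proposal concrete, here is a minimal sketch of how such a toggle could feed into the method. It is not dlt's actual implementation: the option name `keep_staged_files`, the `StagingConfig` dataclass, and the `ConfigurableStagingClient` class are all hypothetical, and `TTableSchema` is replaced by a plain dict alias so the snippet runs without dlt installed.

```python
from dataclasses import dataclass
from typing import Any, Dict

# Stand-in for dlt's TTableSchema so the sketch is self-contained.
TTableSchema = Dict[str, Any]


@dataclass
class StagingConfig:
    # Hypothetical option name; the issue leaves the real key undecided.
    # False keeps today's behavior (staging tables are truncated).
    keep_staged_files: bool = False


class SupportsStagingDestination:
    """Simplified copy of the interface quoted above."""

    def should_load_data_to_staging_dataset_on_staging_destination(
        self, table: TTableSchema
    ) -> bool:
        return False

    def should_truncate_table_before_load_on_staging_destination(
        self, table: TTableSchema
    ) -> bool:
        return True


class ConfigurableStagingClient(SupportsStagingDestination):
    """Hypothetical client that lets configuration override truncation."""

    def __init__(self, config: StagingConfig) -> None:
        self.config = config

    def should_truncate_table_before_load_on_staging_destination(
        self, table: TTableSchema
    ) -> bool:
        # Truncate unless the user explicitly asked to keep staged files.
        return not self.config.keep_staged_files
```

With the default configuration the client still reports that tables should be truncated, so existing pipelines would see no change; only users who set the hypothetical `keep_staged_files=True` would retain their staged files.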
Related issues
No response