From b305d9b00ad1535ccbdeea9b9f20f75ab4aad8fb Mon Sep 17 00:00:00 2001
From: Willi
Date: Tue, 3 Sep 2024 17:26:15 +0530
Subject: [PATCH] updates docs on terminal exceptions on failed jobs

---
 .../docs/running-in-production/running.md     | 30 +++++++++----------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/docs/website/docs/running-in-production/running.md b/docs/website/docs/running-in-production/running.md
index cc089a1393..72a5bd463a 100644
--- a/docs/website/docs/running-in-production/running.md
+++ b/docs/website/docs/running-in-production/running.md
@@ -259,9 +259,21 @@ def check(ex: Exception):
 
 ### Failed jobs
 
-If any job in the package **fail terminally** it will be moved to `failed_jobs` folder and assigned
-such status. By default **no exception is raised** and other jobs will be processed and completed.
-You may inspect if the failed jobs are present by checking the load info as follows:
+If any job in the package **fails terminally**, it will be moved to the `failed_jobs` folder and assigned
+such status.
+By default, **an exception is raised** on the first failed job, and the load package is aborted with `LoadClientJobFailed` (terminal exception).
+Such a package will be completed, but its load id is not added to the `_dlt_loads` table.
+All the jobs that were running in parallel are completed before raising. The dlt state, if present, will not be visible to `dlt`.
+Here is an example `config.toml` to disable this behavior:
+
+```toml
+# you should really load just one job at a time to get the deterministic behavior
+load.workers=1
+# I hope you know what you are doing by setting this to false
+load.raise_on_failed_jobs=false
+```
+
+If you prefer that dlt not raise a terminal exception on failed jobs, you can check for failed jobs manually and raise an exception through the load info as follows:
 
 ```py
 # returns True if there are failed jobs in any of the load packages
@@ -270,18 +282,6 @@ print(load_info.has_failed_jobs)
 load_info.raise_on_failed_jobs()
 ```
 
-You may also abort the load package with `LoadClientJobFailed` (terminal exception) on a first
-failed job. Such package is will be completed but its load id is not added to the
-`_dlt_loads` table. All the jobs that were running in parallel are completed before raising. The dlt
-state, if present, will not be visible to `dlt`. Here's example `config.toml` to enable this option:
-
-```toml
-# you should really load just one job at a time to get the deterministic behavior
-load.workers=1
-# I hope you know what you are doing by setting this to true
-load.raise_on_failed_jobs=true
-```
-
 :::caution
 Note that certain write dispositions will irreversibly modify your data
 1. `replace` write disposition with the default `truncate-and-insert` [strategy](../general-usage/full-loading.md) will truncate tables before loading.