From eafe08b7d9c4bb28695d27d9933ff144cd657714 Mon Sep 17 00:00:00 2001
From: Robert Bradshaw
Date: Fri, 18 Oct 2024 14:44:57 -0700
Subject: [PATCH] Update docs on error handling output schema.

---
 .../www/site/content/en/documentation/sdks/yaml-errors.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/website/www/site/content/en/documentation/sdks/yaml-errors.md b/website/www/site/content/en/documentation/sdks/yaml-errors.md
index 8c0d9f06ade3..903e18d6b3c7 100644
--- a/website/www/site/content/en/documentation/sdks/yaml-errors.md
+++ b/website/www/site/content/en/documentation/sdks/yaml-errors.md
@@ -37,7 +37,8 @@ The `output` parameter is a name that must referenced as an input to another
 transform that will process the errors (e.g. by writing them out).
 For example,
 the following code will write all "good" processed records to one file and
-any "bad" records to a separate file.
+any "bad" records, along with metadata about what error was encountered,
+to a separate file.
 
 ```
 pipeline:
@@ -77,6 +78,8 @@ for a robust pipeline).
 Note also that the exact format of the error outputs is still being finalized.
 They can be safely printed and written to outputs, but their precise schema may
 change in a future version of Beam and should not yet be depended on.
+Currently it has, at the very least, an `element` field which holds the element
+that caused the error.
 
 Some transforms allow for extra arguments in their error_handling config, e.g.
 for Python functions one can give a `threshold` which limits the relative number