Create documentation page for Python SDK unrecoverable errors (#28702)
* Create documentation page for Python SDK unrecoverable errors

* Trailing whitespace

* Add link to page on Python SDK page

* Add to sidebar

* Ditch table for text

* Address comments

* Update website/www/site/content/en/documentation/sdks/python-unrecoverable-errors.md

Co-authored-by: tvalentyn <[email protected]>

* anchor links + edit

* last anchor

* remove anchor links (didn't render in the sidebar correctly)

---------

Co-authored-by: tvalentyn <[email protected]>
jrmccluskey and tvalentyn authored Sep 29, 2023
1 parent e7ec5fe commit c5b75ed
Showing 3 changed files with 66 additions and 0 deletions.
@@ -0,0 +1,61 @@
---
type: languages
title: "Unrecoverable Errors in Beam Python"
---
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Unrecoverable Errors in Beam Python

## What is an Unrecoverable Error?

An unrecoverable error is an issue at job start-up time that will
prevent a job from ever running successfully, usually due to some kind
of misconfiguration. Solving these issues when they occur is key to
successfully running a Beam Python pipeline.

## Common Unrecoverable Errors

### Job Submission/Runtime Python Version Mismatch

If the Python version used for job submission does not match the
Python version used to build the worker container, the job will not
execute. Ensure that the Python version being used for job submission
and the container Python version match.
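
As a minimal sketch of such a check, the snippet below compares the
interpreter used for submission against the container's Python version
before a job is launched. It assumes that agreement on the major and
minor version is the relevant comparison; `CONTAINER_PYTHON` and
`same_minor_version` are hypothetical names for illustration and are not
part of the Beam SDK.

```python
import sys


def same_minor_version(submission, container):
    """Return True when both version tuples share the same major.minor."""
    return tuple(submission[:2]) == tuple(container[:2])


# Hypothetical value: the Python version built into the worker container.
CONTAINER_PYTHON = (3, 10)

if not same_minor_version(sys.version_info, CONTAINER_PYTHON):
    print(
        "Warning: submitting with Python "
        f"{sys.version_info.major}.{sys.version_info.minor}, but the "
        f"container uses {CONTAINER_PYTHON[0]}.{CONTAINER_PYTHON[1]}"
    )
```

Running a check like this in the launch script surfaces the mismatch
locally instead of at worker start-up.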

### PIP Dependency Resolution Failures

During worker start-up, dependencies are checked and installed in
the worker container before the worker accepts work. If a pipeline
requires additional dependencies not already present in the runtime
environment, they are installed at this point. If there’s an issue
during this process (e.g. a dependency version cannot be found, or a
worker cannot connect to PyPI), the worker will fail and, depending on
the runner, may try to restart. Ensure that the dependency versions
listed in your `requirements.txt` file exist and can be installed
locally before submitting jobs.
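
One way to run that local check is sketched below. It assumes a pip
release that supports `pip install --dry-run` (added in pip 22.2), which
resolves the requirements without installing them. The helper names
`pip_dry_run_command` and `requirements_resolve` are hypothetical and
not part of Beam.

```python
import subprocess
import sys


def pip_dry_run_command(requirements_path):
    """Build a pip command that resolves requirements without installing."""
    return [
        sys.executable, "-m", "pip", "install",
        "--dry-run", "-r", requirements_path,
    ]


def requirements_resolve(requirements_path):
    """Run the dry-run resolution; return (succeeded, stderr output)."""
    result = subprocess.run(
        pip_dry_run_command(requirements_path),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stderr
```

Calling `requirements_resolve("requirements.txt")` before submission
reports a missing version or an unreachable index locally. A local
success does not guarantee that the worker can reach PyPI.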

### Dependency Version Mismatches

When additional dependencies like `torch`, `transformers`, etc. are
neither specified via a `requirements_file` nor preinstalled in a custom
container, the worker might fail to deserialize (unpickle) the user code.
This can result in `ModuleNotFoundError` errors. If the dependencies are
installed but their versions don't match the versions in the submission
environment, the pipeline might fail with `AttributeError` messages.

Ensure that the same dependencies, at the same versions, are present at
runtime and in the submission environment. For better visibility,
starting in Beam 2.52.0, debug logs list the dependencies at both stages.
For more information, see: https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/#control-dependencies
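
The comparison described above can also be done by hand. The sketch
below records installed versions with the standard library's
`importlib.metadata` and reports packages whose versions differ between
two environments. `installed_versions` and `find_mismatches` are
hypothetical helpers, not Beam APIs, and collecting the runtime-side
mapping (for example, by running the same function inside the container)
is left as an assumption.

```python
from importlib import metadata


def installed_versions(package_names):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def find_mismatches(submission, runtime):
    """Return {name: (submission_version, runtime_version)} where they differ."""
    mismatches = {}
    for name in sorted(set(submission) | set(runtime)):
        sub_version = submission.get(name)
        run_version = runtime.get(name)
        if sub_version != run_version:
            mismatches[name] = (sub_version, run_version)
    return mismatches
```

An empty result from `find_mismatches` means the two environments agree
on every package that was checked.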
4 changes: 4 additions & 0 deletions website/www/site/content/en/documentation/sdks/python.md
@@ -59,3 +59,7 @@ see [Machine Learning](/documentation/sdks/python-machine-learning).
## Python multi-language pipelines quickstart

Apache Beam lets you combine transforms written in any supported SDK language and use them in one multi-language pipeline. To learn how to create a multi-language pipeline using the Python SDK, see the [Python multi-language pipelines quickstart](/documentation/sdks/python-multi-language-pipelines).

## Unrecoverable Errors in Beam Python

Some common errors can occur during worker start-up and prevent jobs from starting. To learn about these errors and how to troubleshoot them in the Python SDK, see [Unrecoverable Errors in Beam Python](/documentation/sdks/python-unrecoverable-errors).
@@ -43,6 +43,7 @@
<li><a href="/documentation/sdks/python-machine-learning/">Machine Learning</a></li>
<li><a href="/documentation/sdks/python-pipeline-dependencies/">Managing pipeline dependencies</a></li>
<li><a href="/documentation/sdks/python-multi-language-pipelines/">Python multi-language pipelines quickstart</a></li>
<li><a href="/documentation/sdks/python-unrecoverable-errors/">Python Unrecoverable Errors</a></li>
</ul>
</li>
