Create documentation page for Python SDK unrecoverable errors #28702
Changes from 5 commits
7a53da0
c8da589
ad145aa
d3fd914
940c6bc
c4259bd
cb3b309
2d8ca07
77d0f90
5fee528
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
--- | ||
type: languages | ||
title: "Unrecoverable Errors in Beam Python" | ||
--- | ||
<!-- | ||
Licensed under the Apache License, Version 2.0 (the "License"); | ||
you may not use this file except in compliance with the License. | ||
You may obtain a copy of the License at | ||
|
||
http://www.apache.org/licenses/LICENSE-2.0 | ||
|
||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
--> | ||
|
||
# Unrecoverable Errors in Beam Python | ||
|
||
## What is an Unrecoverable Error? | ||
|
||
An unrecoverable error is an issue at job start-up time that will | ||
prevent a job from ever running successfully, usually due to some kind | ||
of misconfiguration. Solving these issues when they occur is key to | ||
successfully running a Beam Python pipeline. | ||
|
||
## Common Unrecoverable Errors | ||
|
||
### Job Submission/Runtime Python Version Mismatch | ||
|
||
If the Python version used for job submission does not match the | ||
Python version used to build the worker container, the job will not | ||
execute. Ensure that the Python version being used for job submission | ||
and the container Python version match. | ||
|
||
### pip Dependency Resolution Failures | ||
|
||
During worker start-up, dependencies are checked and installed in | ||
the worker container before accepting work. If there’s an issue during | ||
this process (e.g. a dependency version cannot be found), the worker | ||
will restart and try again up to four times before failing outright. | ||
Ensure that dependency versions provided in your `requirements.txt` file | ||
exist and can be installed locally before submitting jobs. | ||
|
||
### Dependency Version Mismatches | ||
|
||
When additional dependencies such as torch or transformers are not | ||
specified via `requirements_file` or preinstalled in a custom container, | ||
the worker may enter a restart loop, retrying the dependency installation | ||
up to four times before finally failing. In this case, the debug logs | ||
show a `ModuleNotFoundError`. A similar outcome is observed when there is | ||
a dependency version mismatch, which often logs an `AttributeError` in | ||
debug mode. Ensure that the dependencies required at runtime and in the | ||
submission environment are the same, including their versions. For better | ||
visibility, starting in Beam 2.52.0, debug logs list the dependencies | ||
at both stages. | ||
Review comment: Mismatch happens after installation, when the worker has already started; at this point we won't attempt more installations.
Review comment: Possible suggestion: When additional dependencies like If dependencies are installed but their versions don't match the versions in submission environment, pipeline might have
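The checks recommended in the sections above (matching Python versions, and matching dependency versions between the submission and runtime environments) can be run locally before submitting a job. The following is a minimal sketch, not part of Beam: the function names `python_versions_match`, `parse_pins`, and `find_mismatches` are hypothetical, the container's Python version is assumed to be known and passed in by the caller, and only simple `name==version` lines in a requirements file are handled.

```python
import sys
from importlib import metadata


def python_versions_match(container_version, local=None):
    """Compare major.minor of the container's Python with the local one.

    `container_version` is a string such as "3.10" or "3.10.12" that the
    caller must supply; this sketch does not inspect the container itself.
    """
    local = local or sys.version_info[:2]
    major, minor = container_version.split(".")[:2]
    return (int(major), int(minor)) == tuple(local[:2])


def parse_pins(lines):
    """Return {package: version} for simple 'name==version' lines only."""
    pins = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # ranges, URLs, and options are ignored in this sketch
        name, version = line.split("==", 1)
        name = name.split("[", 1)[0].strip().lower()  # drop extras like [gcp]
        pins[name] = version.split(";", 1)[0].strip()  # drop env markers
    return pins


def _installed_version(name):
    """Version of `name` in the current environment, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, installed_version=None):
    """List (package, pinned, installed) where the environment differs."""
    lookup = installed_version or _installed_version
    return [
        (name, pinned, lookup(name))
        for name, pinned in pins.items()
        if lookup(name) != pinned
    ]
```

A possible use is to read `requirements.txt`, call `find_mismatches(parse_pins(lines))` in the submission environment, and refuse to submit if the result is non-empty or if `python_versions_match` returns `False`. A package reported with `None` as its installed version is missing locally, which corresponds to the `ModuleNotFoundError` case described above.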
Review comment: If you plan to reference these errors in logs, let's add markdown anchors for better linkability, as titles might change but we can keep the same anchors, so links will be preserved.