[Bug]: apache beam python SDK hangs and crashes with segmentation fault errors with orjson 3.9.4 #28318

Closed
1 task done
dankuchler opened this issue Sep 5, 2023 · 6 comments
Labels
bug done & done Issue has been reviewed after it was closed for verification, followups, etc. P2 python

Comments

@dankuchler

dankuchler commented Sep 5, 2023

What happened?

A bug introduced in the orjson dependency (ijl/orjson#415) can cause Beam Python pipelines to crash with a segmentation fault or hang. Beam uses orjson in BigQuery IO, so users of this IO might be affected.

Mitigation

Until Beam 2.51.0 is released, consider any of the following workarounds:

Original report

In the latest deployment of our Apache Beam pipeline, orjson (a dependency of the Apache Beam Python SDK) was upgraded from 3.9.2 to 3.9.4.

The Apache Beam SDK declares a dependency on orjson < 4.0 here:

https://github.com/apache/beam/blob/master/sdks/python/setup.py#L233

Since this upgrade of orjson from 3.9.2 to 3.9.4, we periodically see the Apache Beam SDK hang or the workers crash with segmentation fault errors, which we believe are related to this issue in the orjson project:

ijl/orjson#415

After reverting from orjson 3.9.4 to 3.9.2, the issues appear to be resolved.

The Apache Beam Python SDK may want to limit orjson to 3.9.2 or below until orjson issue 415 is resolved.

Issue Priority

Priority: 2

Issue Components

  • Component: Python SDK
@tvalentyn
Contributor

Thanks for reporting. Did you see any stack traces around the segmentation fault, by chance?
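For anyone trying to capture such traces, here is a minimal sketch using Python's standard `faulthandler` module, which dumps the Python-level tracebacks of all threads when the process receives a fatal signal such as SIGSEGV. The helper name `enable_crash_tracebacks` and the log path are hypothetical, and where to call it in a Beam worker (for example, at module import time) is an assumption, not something confirmed in this thread.

```python
import faulthandler


def enable_crash_tracebacks(log_path):
    """Write Python tracebacks for all threads to log_path on a fatal signal.

    Hypothetical helper for illustration; not part of Beam or orjson.
    """
    log_file = open(log_path, "w")
    # faulthandler writes directly to the file descriptor, so the file
    # object must stay open for as long as the handler is enabled.
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


if __name__ == "__main__":
    # Hypothetical path; keep a reference so the file is not closed.
    crash_log = enable_crash_tracebacks("/tmp/beam_worker_crash.log")
    print("faulthandler enabled:", faulthandler.is_enabled())
```

Note that `faulthandler` only reports Python frames. It shows which Python call was active when the crash happened, but not the native frames inside the extension module.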

@tvalentyn tvalentyn added this to the 2.51.0 Release milestone Sep 5, 2023
@tvalentyn
Contributor

Only Beam 2.50.0 has orjson==3.9.4 in Beam containers.

@dankuchler
Author

The segmentation fault messages that we are seeing in the system log seem to indicate that the crash is coming from orjson:

python[1925]: segfault at 7ff9f1ff4000 ip 00007ffa3c72df53 sp 00007ffa391bd000 error 6 in orjson.cpython-310-x86_64-linux-gnu.so[7ffa3c716000+2f000]
python[82009]: segfault at 70 ip 00007f91e601bb09 sp 00007f9189bfa080 error 4 in orjson.cpython-310-x86_64-linux-gnu.so[7f91e600f000+2f000]
python[1915]: segfault at 50 ip 00007f0431827b09 sp 00007f0407ffc1c0 error 4 in orjson.cpython-310-x86_64-linux-gnu.so[7f043181b000+2f000]
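To compare crash sites across workers, the log lines above can be reduced to a module name plus the offset of the instruction pointer inside that module (`ip` minus the mapping base in the brackets). The following is a rough sketch; the regular expression is an assumption based on the three lines quoted here, and the function name `faulting_offset` is made up for illustration.

```python
import re

# Assumed layout, taken from the log lines quoted in this thread:
#   segfault at <addr> ip <ip> sp <sp> error <code> in <module>[<base>+<size>]
SEGFAULT_RE = re.compile(
    r"segfault at [0-9a-f]+ ip (?P<ip>[0-9a-f]+) sp [0-9a-f]+ "
    r"error [0-9a-f]+ in (?P<module>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+[0-9a-f]+\]"
)


def faulting_offset(line):
    """Return (module, offset of ip within the module), or None if no match."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    offset = int(match["ip"], 16) - int(match["base"], 16)
    return match["module"], offset


if __name__ == "__main__":
    lines = [
        "python[1925]: segfault at 7ff9f1ff4000 ip 00007ffa3c72df53 sp 00007ffa391bd000 error 6 in orjson.cpython-310-x86_64-linux-gnu.so[7ffa3c716000+2f000]",
        "python[82009]: segfault at 70 ip 00007f91e601bb09 sp 00007f9189bfa080 error 4 in orjson.cpython-310-x86_64-linux-gnu.so[7f91e600f000+2f000]",
        "python[1915]: segfault at 50 ip 00007f0431827b09 sp 00007f0407ffc1c0 error 4 in orjson.cpython-310-x86_64-linux-gnu.so[7f043181b000+2f000]",
    ]
    for line in lines:
        module, offset = faulting_offset(line)
        print(module, hex(offset))
```

For the three lines above this gives offsets `0x17f53`, `0xcb09`, and `0xcb09`, so the second and third crashes hit the same instruction inside the orjson shared object.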

@dankuchler
Author

Regarding the mitigation section: I believe 3.9.2 is fine. We have been running on orjson 3.9.2, and people in the orjson issue mentioned mitigating the problem by reverting to version 3.9.2.

@dankuchler
Author

The corresponding orjson issue ijl/orjson#415 now indicates that it is closed and that the underlying issue should be resolved in orjson 3.9.7
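To check whether an environment is likely exposed, here is a small sketch that compares the installed orjson version against the versions discussed in this thread. It treats every release after 3.9.2 and before 3.9.7 as suspect. That range is an assumption: only 3.9.4 was confirmed as affected here, 3.9.2 was reported as fine, and 3.9.7 is reported as fixed. The function names are hypothetical.

```python
import re
from importlib import metadata

# Assumption: releases after 3.9.2 and before 3.9.7 are treated as suspect.
# Only 3.9.4 is confirmed as affected in this thread.
FIRST_SUSPECT = (3, 9, 3)
FIRST_FIXED = (3, 9, 7)


def parse_version(text):
    """Parse the leading 'X.Y.Z' of a version string into a tuple of ints."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = re.match(r"\d+", piece)
        parts.append(int(digits.group()) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_suspect(version_text):
    """True if the version falls in the assumed affected range."""
    return FIRST_SUSPECT <= parse_version(version_text) < FIRST_FIXED


def installed_orjson_version():
    """Return the installed orjson version string, or None if not installed."""
    try:
        return metadata.version("orjson")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_orjson_version()
    if version is None:
        print("orjson is not installed")
    elif is_suspect(version):
        print(f"orjson {version} is in the suspect range; consider 3.9.2 or 3.9.7+")
    else:
        print(f"orjson {version} is outside the suspect range")
```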

@tvalentyn tvalentyn mentioned this issue Sep 12, 2023
3 tasks
@damccorm
Contributor

@tvalentyn can this one be closed?

@jrmccluskey jrmccluskey added the done & done Issue has been reviewed after it was closed for verification, followups, etc. label Oct 17, 2023