GH-43536: [Python][CI] Add a Crossbow job with the free-threaded build #43671

Merged
merged 10 commits into from
Sep 9, 2024

Conversation

lysnikolaou
Contributor

@lysnikolaou lysnikolaou commented Aug 13, 2024

Rationale for this change

Testing with the free-threaded build is required for adding support for it. (see #43536)

What changes are included in this PR?

  • Add a Docker build with the CPython free-threaded build from deadsnakes.
  • Add a Crossbow job to run said Docker build with Python 3.13t

Are there any user-facing changes?

No.

@jorisvandenbossche
Member

@github-actions crossbow submit test-ubuntu-22.04-python-313-freethreading

Revision: 1cb315b

Submitted crossbow builds: ursacomputing/crossbow @ actions-5e60b16dcf

Task Status
test-ubuntu-22.04-python-313-freethreading GitHub Actions

@lysnikolaou
Contributor Author

@github-actions crossbow submit test-ubuntu-22.04-python-313-freethreading

Only contributors can submit requests to this bot. Please ask someone from the community for help with getting the first commit in.
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/10370420938

@jorisvandenbossche
Member

@github-actions crossbow submit test-ubuntu-22.04-python-313-freethreading

Revision: d16c04c

Submitted crossbow builds: ursacomputing/crossbow @ actions-37df7dbfe0

Task Status
test-ubuntu-22.04-python-313-freethreading GitHub Actions

@lysnikolaou
Contributor Author

Hmm, we got much further this time. Build was successful, but tests failed on import with this error:

+ pytest -r s --pyargs pyarrow
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError
============================= test session starts ==============================
platform linux -- Python 3.13.0rc1, pytest-8.3.2, pluggy-1.5.0
rootdir: /
plugins: hypothesis-6.111.0
collected 0 items / 1 error

==================================== ERRORS ====================================
_______ ERROR collecting arrow-dev/lib/python3.13t/site-packages/pyarrow _______
usr/lib/python3.13/importlib/__init__.py:88: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1387: in _gcd_import
    ???
<frozen importlib._bootstrap>:1360: in _find_and_load
    ???
<frozen importlib._bootstrap>:1310: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:488: in _call_with_frames_removed
    ???
<frozen importlib._bootstrap>:1387: in _gcd_import
    ???
<frozen importlib._bootstrap>:1360: in _find_and_load
    ???
<frozen importlib._bootstrap>:1331: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:935: in _load_unlocked
    ???
<frozen importlib._bootstrap_external>:1022: in exec_module
    ???
<frozen importlib._bootstrap>:488: in _call_with_frames_removed
    ???
arrow-dev/lib/python3.13t/site-packages/pyarrow/__init__.py:65: in <module>
    import pyarrow.lib as _lib
E   ImportError: Module was compiled with a non-freethreading build of Python but imported into a freethreading build.

That's odd, since the wheel seems to have been compiled with the free-threaded build (notice the "t" in "cp313t"):

Created wheel for pyarrow: filename=pyarrow-18.0.0.dev139-cp313-cp313t-linux_x86_64.whl size=28707111 sha256=13191763e70d010111830b51e95c8a71d582531bb471faa365d00d3daaef330f
  Stored in directory: /tmp/pip-ephem-wheel-cache-h1qj9ap8/wheels/4d/35/fe/ddc412558b8012a6576905791192edd07e7e3d9c86e74f006c
Successfully built pyarrow
Installing collected packages: pyarrow

Successfully installed pyarrow-18.0.0.dev139
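
For anyone reproducing this, a quick way to check which kind of build an interpreter is, independently of the wheel tag, is sketched below. It uses only the standard library: `sysconfig.get_config_var("Py_GIL_DISABLED")` reports how the interpreter was compiled, and `sys.abiflags` should contain "t" on free-threaded builds. The helper names are made up for illustration.

```python
import sys
import sysconfig


def is_free_threaded_build() -> bool:
    """Return True if this interpreter was compiled with Py_GIL_DISABLED."""
    # The config variable is 1 on free-threaded builds and 0 or None otherwise.
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def abi_flags() -> str:
    """Return the interpreter's ABI flags, or "" where they are not exposed."""
    # sys.abiflags is not available on every platform, hence the fallback.
    return getattr(sys, "abiflags", "")


if __name__ == "__main__":
    print("free-threaded build:", is_free_threaded_build())
    print("ABI flags:", repr(abi_flags()))
```

Note that this only describes the interpreter itself. It says nothing about which headers an extension module was compiled against, which is what the ImportError above is complaining about.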

@jorisvandenbossche
Member

jorisvandenbossche commented Aug 13, 2024

Isn't that failure expected until #43606 is merged?

EDIT: Ah, but the error message is indeed pointing to something being wrong: "Module was compiled with a non-freethreading build of Python"

@lysnikolaou
Contributor Author

Isn't that failure expected until #43606 is merged?

No, I don't think so. The only side-effect of not having -Xfreethreading_compatible is that the GIL is re-enabled. The error here actually says that the C extension modules were built using a non-free-threaded build, which I don't think is the case.

@jorisvandenbossche
Member

We do pass -DPYTHON_EXECUTABLE=/arrow-dev/bin/python3 to cmake when building pyarrow. Does the executable binary change when having a free-threaded python? (it's not that this should be python3t or so?)

@jorisvandenbossche
Member

pip prints Using pip 24.2 from /arrow-dev/lib/python3.13t/site-packages/pip (python 3.13), so that looks fine.

@lysnikolaou
Contributor Author

We do pass -DPYTHON_EXECUTABLE=/arrow-dev/bin/python3 to cmake when building pyarrow. Does the executable binary change when having a free-threaded python? (it's not that this should be python3t or so?)

Since the virtual environment is created using python3.13t, the python and python3 symlinks both point to the free-threaded build.
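
To double-check this on a given machine, the symlink chain can be resolved with the standard library. This is only a sketch; `resolve_interpreter` is a hypothetical helper name, and the path you pass would be whatever is given to `-DPYTHON_EXECUTABLE`.

```python
import os
import sys


def resolve_interpreter(path: str) -> str:
    """Follow all symlinks and return the real path of an interpreter binary."""
    return os.path.realpath(path)


if __name__ == "__main__":
    # sys.executable is the interpreter running this script; inside a venv it
    # is typically a symlink to the interpreter the venv was created from.
    print(sys.executable, "->", resolve_interpreter(sys.executable))
```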

@lysnikolaou
Contributor Author

lysnikolaou commented Aug 13, 2024

It looks like this comes from Cython. I'll have a deeper look. Maybe CMake does not use the correct Cython?

@kou
Member

kou commented Aug 13, 2024

Could you create a new issue for this instead of using GH-43536 like #43606?
See also: #43536 (comment)

@github-actions github-actions bot added awaiting changes Awaiting changes awaiting change review Awaiting change review and removed awaiting review Awaiting review awaiting changes Awaiting changes labels Aug 13, 2024
@@ -21,6 +21,8 @@
cmake_minimum_required(VERSION 3.16)
project(pyarrow)

set(CMAKE_NO_SYSTEM_FROM_IMPORTED ON)
Contributor Author

@lysnikolaou lysnikolaou Aug 15, 2024

Okay, this was a hard one.

CMake used to add Python include directories with -isystem, which caused some Python-internal includes to resolve to the normal 3.13 headers instead of the 3.13 free-threading ones (because -isystem directories are searched after all -I directories), which in turn meant that Py_GIL_DISABLED was not set.

Setting this flag uses -I instead. I verified manually that the only include directories here are Python-specific (the Python and NumPy include directories), so this shouldn't change too much. I couldn't find a way to change this in a more granular way. If someone knows how, help would be really appreciated!
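
To make the ordering effect concrete, here is a toy model of a GCC-style header search: -I directories first, then -isystem directories, then the compiler's standard directories. The directory names and the `TREE` layout are hypothetical and are not the actual layout of the CI image; the model also ignores the special handling of directories that are passed more than once.

```python
def search_order(i_dirs, isystem_dirs, standard_dirs):
    """Return include directories in the order they would be searched."""
    # -I directories are searched before -isystem directories, which are
    # searched before the compiler's standard system directories.
    return list(i_dirs) + list(isystem_dirs) + list(standard_dirs)


def resolve(header, order, tree):
    """Return the first directory in `order` that provides `header`."""
    for directory in order:
        if header in tree.get(directory, ()):
            return directory
    return None


# Hypothetical layout: two directories both provide pyconfig.h.
TREE = {
    "/opt/py/include/python3.13t": {"Python.h", "pyconfig.h"},
    "/opt/other/include": {"pyconfig.h"},  # header from a regular build
}

if __name__ == "__main__":
    # Free-threaded headers passed with -isystem lose to any -I directory.
    order = search_order(["/opt/other/include"], ["/opt/py/include/python3.13t"], [])
    print(resolve("pyconfig.h", order, TREE))

    # Passed with -I (and listed first), the free-threaded headers win.
    order = search_order(["/opt/py/include/python3.13t", "/opt/other/include"], [], [])
    print(resolve("pyconfig.h", order, TREE))
```

Whichever directory wins decides which pyconfig.h is seen, and therefore whether Py_GIL_DISABLED ends up defined for the extension module.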

Member

@kou Does this change look ok to you?

Member

Can we use the SYSTEM target property https://cmake.org/cmake/help/latest/prop_tgt/SYSTEM.html instead of this to limit the impact?

diff --git a/python/CMakeLists.txt b/python/CMakeLists.txt
index 5d5eeaf815..c19820c074 100644
--- a/python/CMakeLists.txt
+++ b/python/CMakeLists.txt
@@ -258,6 +258,8 @@ set(EXECUTABLE_OUTPUT_PATH "${BUILD_OUTPUT_ROOT_DIRECTORY}")
 find_package(Python3Alt REQUIRED)
 message(STATUS "Found NumPy version: ${Python3_NumPy_VERSION}")
 message(STATUS "NumPy include dir: ${NUMPY_INCLUDE_DIRS}")
+# TODO: Describe why we need this
+set_target_properties(Python3::Python PROPERTIES SYSTEM FALSE)
 
 include(UseCython)
 message(STATUS "Found Cython version: ${CYTHON_VERSION}")

Contributor Author

Can we use the SYSTEM target property https://cmake.org/cmake/help/latest/prop_tgt/SYSTEM.html instead of this to limit the impact?

What my change does is the exact opposite though. It signals that SYSTEM should not be used anywhere, which I guess is what FindPython does.

Member

Can we use the SYSTEM target property https://cmake.org/cmake/help/latest/prop_tgt/SYSTEM.html instead of this to limit the impact?

What my change does is the exact opposite though. It signals that SYSTEM should not be used anywhere, which I guess is what FindPython does.

Unless I am mistaken, @kou is suggesting to set SYSTEM to FALSE for Python only and leave it as is for the rest of the dependencies, instead of changing it globally.

Member

Like @lysnikolaou I tried several variations on this and I could not make it work.

(such as set_target_properties(Python3::Module PROPERTIES SYSTEM FALSE))

Member

The CMake docs are actually quite cryptic about this as several properties may be involved: SYSTEM, NO_SYSTEM_FROM_IMPORTED and EXPORT_NO_SYSTEM.

Member

@kou We'll have to live with this, unless you want to diagnose the issue yourself. Understanding CMake's intricacies is no fun.

Contributor

I can take a look at this. The system behavior is more compiler-defined and exposed by CMake: -I always comes before any -isystem, so if you want to be certain a directory is searched before system headers you need to force it. That is coupled with the other behavior that -isystem silences compiler warnings from system headers. It does seem a shame to change it globally. I see this was already merged, but I will see if I can spot where it falls down when being more surgical about it. In general -isystem is desirable for imported targets, but it gets tricky when you have multiple versions of the same package floating around!

Member

OK. I'll try it later by myself too.

@lysnikolaou
Contributor Author

@github-actions crossbow submit test-ubuntu-22.04-python-313-freethreading

Only contributors can submit requests to this bot. Please ask someone from the community for help with getting the first commit in.
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/10402069621

@pitrou pitrou force-pushed the add-freethreading-test-build branch from 3c453c5 to dc04e0d Compare September 9, 2024 13:30
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Sep 9, 2024
@pitrou
Member

pitrou commented Sep 9, 2024

@github-actions crossbow submit -g wheel

github-actions bot commented Sep 9, 2024

Revision: dc04e0d

Submitted crossbow builds: ursacomputing/crossbow @ actions-b07c305e46

Task Status
python-sdist GitHub Actions
wheel-macos-monterey-cp310-amd64 GitHub Actions
wheel-macos-monterey-cp310-arm64 GitHub Actions
wheel-macos-monterey-cp311-amd64 GitHub Actions
wheel-macos-monterey-cp311-arm64 GitHub Actions
wheel-macos-monterey-cp312-amd64 GitHub Actions
wheel-macos-monterey-cp312-arm64 GitHub Actions
wheel-macos-monterey-cp313-amd64 GitHub Actions
wheel-macos-monterey-cp313-arm64 GitHub Actions
wheel-macos-monterey-cp38-amd64 GitHub Actions
wheel-macos-monterey-cp38-arm64 GitHub Actions
wheel-macos-monterey-cp39-amd64 GitHub Actions
wheel-macos-monterey-cp39-arm64 GitHub Actions
wheel-manylinux-2-28-cp310-amd64 GitHub Actions
wheel-manylinux-2-28-cp310-arm64 GitHub Actions
wheel-manylinux-2-28-cp311-amd64 GitHub Actions
wheel-manylinux-2-28-cp311-arm64 GitHub Actions
wheel-manylinux-2-28-cp312-amd64 GitHub Actions
wheel-manylinux-2-28-cp312-arm64 GitHub Actions
wheel-manylinux-2-28-cp313-amd64 GitHub Actions
wheel-manylinux-2-28-cp313-arm64 GitHub Actions
wheel-manylinux-2-28-cp38-amd64 GitHub Actions
wheel-manylinux-2-28-cp38-arm64 GitHub Actions
wheel-manylinux-2-28-cp39-amd64 GitHub Actions
wheel-manylinux-2-28-cp39-arm64 GitHub Actions
wheel-manylinux-2014-cp310-amd64 GitHub Actions
wheel-manylinux-2014-cp310-arm64 GitHub Actions
wheel-manylinux-2014-cp311-amd64 GitHub Actions
wheel-manylinux-2014-cp311-arm64 GitHub Actions
wheel-manylinux-2014-cp312-amd64 GitHub Actions
wheel-manylinux-2014-cp312-arm64 GitHub Actions
wheel-manylinux-2014-cp313-amd64 GitHub Actions
wheel-manylinux-2014-cp313-arm64 GitHub Actions
wheel-manylinux-2014-cp38-amd64 GitHub Actions
wheel-manylinux-2014-cp38-arm64 GitHub Actions
wheel-manylinux-2014-cp39-amd64 GitHub Actions
wheel-manylinux-2014-cp39-arm64 GitHub Actions
wheel-windows-cp310-amd64 GitHub Actions
wheel-windows-cp311-amd64 GitHub Actions
wheel-windows-cp312-amd64 GitHub Actions
wheel-windows-cp313-amd64 GitHub Actions
wheel-windows-cp38-amd64 GitHub Actions
wheel-windows-cp39-amd64 GitHub Actions

@pitrou
Member

pitrou commented Sep 9, 2024

@github-actions crossbow submit -g python

github-actions bot commented Sep 9, 2024

Revision: dc04e0d

Submitted crossbow builds: ursacomputing/crossbow @ actions-4ad45fc702

Task Status
example-python-minimal-build-fedora-conda GitHub Actions
example-python-minimal-build-ubuntu-venv GitHub Actions
test-conda-python-3.10 GitHub Actions
test-conda-python-3.10-cython2 GitHub Actions
test-conda-python-3.10-hdfs-2.9.2 GitHub Actions
test-conda-python-3.10-hdfs-3.2.1 GitHub Actions
test-conda-python-3.10-pandas-latest-numpy-1.26 GitHub Actions
test-conda-python-3.10-pandas-latest-numpy-latest GitHub Actions
test-conda-python-3.10-pandas-nightly-numpy-nightly GitHub Actions
test-conda-python-3.10-substrait GitHub Actions
test-conda-python-3.11 GitHub Actions
test-conda-python-3.11-dask-latest GitHub Actions
test-conda-python-3.11-dask-upstream_devel GitHub Actions
test-conda-python-3.11-hypothesis GitHub Actions
test-conda-python-3.11-pandas-upstream_devel-numpy-nightly GitHub Actions
test-conda-python-3.11-spark-master GitHub Actions
test-conda-python-3.12 GitHub Actions
test-conda-python-3.12-cpython-debug GitHub Actions
test-conda-python-3.8 GitHub Actions
test-conda-python-3.8-pandas-1.0-numpy-1.19 GitHub Actions
test-conda-python-3.9 GitHub Actions
test-conda-python-3.9-pandas-latest-numpy-latest GitHub Actions
test-conda-python-emscripten GitHub Actions
test-cuda-python GitHub Actions
test-debian-12-python-3-amd64 GitHub Actions
test-debian-12-python-3-i386 GitHub Actions
test-fedora-39-python-3 GitHub Actions
test-ubuntu-20.04-python-3 GitHub Actions
test-ubuntu-22.04-python-3 GitHub Actions
test-ubuntu-22.04-python-313-freethreading GitHub Actions

@jorisvandenbossche
Member

FWIW, the minimal-build example builds are failing (I had restarted one, and it's still failing), while I haven't seen that in the nightlies, but I also don't directly see how this PR would be causing that (can take a closer look tomorrow)

(can see if it's still failing after the rebase)

@pitrou
Member

pitrou commented Sep 9, 2024

FWIW, the minimal-build example builds are failing (I had restarted one, and it's still failing), while I haven't seen that in the nightlies, but I also don't directly see how this PR would be causing that (can take a closer look tomorrow)

Yes, it's unrelated. The problem is that the fork from which the PR is submitted doesn't have any git tags:
https://github.com/ursacomputing/crossbow/actions/runs/10774498455/job/29876694202#step:3:3401

@pitrou
Member

pitrou commented Sep 9, 2024

@lysnikolaou I'm surprised, PyArrow is marked nogil-compatible?

>>> import sys
>>> import pyarrow as pa
>>> sys._is_gil_enabled()
False
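
The same check can be wrapped so that it also runs on interpreters older than 3.13, where `sys._is_gil_enabled()` does not exist. This is a small sketch; `gil_enabled` is a hypothetical helper name, and the `pyarrow` import is left out so the snippet stays self-contained (import the extension module under test first to see whether it re-enables the GIL).

```python
import sys


def gil_enabled() -> bool:
    """Report whether the GIL is currently enabled in this process."""
    # sys._is_gil_enabled() was added in Python 3.13. Older interpreters
    # always run with the GIL, so treat a missing function as "enabled".
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else bool(check())


if __name__ == "__main__":
    print("GIL enabled:", gil_enabled())
```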

@lysnikolaou
Contributor Author

@lysnikolaou I'm surprised, PyArrow is marked nogil-compatible?

Yes, this was done in #43606.

@pitrou
Member

pitrou commented Sep 9, 2024

But did we actually audit the code to make sure nothing relies on the GIL? Otherwise we're just asking for trouble.

@lysnikolaou
Contributor Author

lysnikolaou commented Sep 9, 2024

But did we actually audit the code to make sure nothing relies on the GIL? Otherwise we're just asking for trouble.

Since a big chunk of PyArrow relies on Cython, a lot of the free-threading heavy lifting will be performed by Cython.

Regarding the C/C++ code, I had a look around the codebase, trying to manually search for global state, but I didn't find any instances that might create problems. I also talked a bit with @jorisvandenbossche about that during a call, and adding threading-related tests was something he wanted to do as well.

As a note, being aggressive with declaring support for free-threading is probably the way to go, since bugs will always be there, and they will be much harder to surface if widely-used packages do not declare support. Remember that the GIL is enabled for the whole process, even if just one package hasn't declared support.
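
As a rough idea of what such threading-related tests could look like, here is a minimal sketch of a harness that starts several threads at the same moment and compares the results against a serial run. `run_concurrently` and `build_and_sum` are hypothetical names, and the workload is a pure-Python stand-in; a real test would call into PyArrow instead.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(func, n_threads=8):
    """Call func(i) from n_threads threads that are released together."""
    barrier = threading.Barrier(n_threads)

    def worker(i):
        # Wait until every thread is ready, to maximize overlap.
        barrier.wait()
        return func(i)

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # map() returns results in input order, regardless of finish order.
        return list(pool.map(worker, range(n_threads)))


def build_and_sum(i):
    """Stand-in workload: build a list and reduce it."""
    values = list(range(i * 1000, (i + 1) * 1000))
    return sum(values)


if __name__ == "__main__":
    concurrent = run_concurrently(build_and_sum)
    serial = [build_and_sum(i) for i in range(8)]
    assert concurrent == serial
    print("concurrent and serial results match")
```

A test like this only surfaces problems probabilistically, so it complements an audit rather than replacing one.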

@pitrou
Member

pitrou commented Sep 9, 2024

Regarding the C/C++ code, I had a look around the codebase, trying to manually search for global state, but I didn't find any instances that might create problems.

Besides explicitly global state, there may also be cases of relying on the GIL to ensure that e.g. a Python dict/list isn't mutated, or a borrowed reference remains valid.

As a note, being aggressive with declaring support for free-threading is probably the way to go, since bugs will always be there, and they will be much harder to surface if widely-used packages do not declare support.

That is true.

@lysnikolaou
Contributor Author

there may also be cases of relying on the GIL to ensure that e.g. a Python dict/list isn't mutated

If you could point me to such instances, that'd be really helpful, and I can work on fixing all of those.

or a borrowed reference remains valid

We got rid of borrowed references APIs in #43540.

@pitrou
Member

pitrou commented Sep 9, 2024

If you could point me to such instances, that'd be really helpful, and I can work on fixing all of those.

One example is VisitSequenceGeneric (

  if (PySequence_Check(obj)) {
    if (PyList_Check(obj) || PyTuple_Check(obj)) {
      // Use fast item access
      const Py_ssize_t size = PySequence_Fast_GET_SIZE(obj);
      for (Py_ssize_t i = offset; keep_going && i < size; ++i) {
        PyObject* value = PySequence_Fast_GET_ITEM(obj, i);
        RETURN_NOT_OK(func(value, static_cast<int64_t>(i), &keep_going));
      }
    } else {
      // Regular sequence: avoid making a potentially large copy
      const Py_ssize_t size = PySequence_Size(obj);
      RETURN_IF_PYERROR();
      for (Py_ssize_t i = offset; keep_going && i < size; ++i) {
        OwnedRef value_ref(PySequence_ITEM(obj, i));
        RETURN_IF_PYERROR();
        RETURN_NOT_OK(func(value_ref.obj(), static_cast<int64_t>(i), &keep_going));
      }
    }
), where the sequence's length is assumed to be constant, and where we use the borrowed ref returned by PySequence_Fast_GET_ITEM. Solving this one efficiently will be tricky, as CPython doesn't provide any APIs for fast and safe GIL-less access to sequences.

A similar issue lies in

  if (PyArray_DESCR(arr_obj)->type_num == NPY_OBJECT) {
    // It's an array object, we can fetch object pointers directly
    const Ndarray1DIndexer<PyObject*> objects(arr_obj);
    for (int64_t i = offset; keep_going && i < objects.size(); ++i) {
      RETURN_NOT_OK(func(objects[i], i, &keep_going));
    }
    return Status::OK();
  }

I haven't looked in other places.
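
The length-caching hazard can be mimicked in pure Python without any threads, by letting the callback mutate the sequence while it is being visited. This is only an analogy under stated assumptions: the function names are hypothetical, Python raises IndexError where the C macros would read invalid memory, and the snapshot variant pays for exactly the copy that the C++ comment says it wants to avoid.

```python
def visit_cached_size(seq, func):
    """Index-based loop that reads the length once, like the C++ fast path."""
    size = len(seq)
    for i in range(size):
        # Raises IndexError if seq shrank after len() was read.
        func(seq[i], i)


def visit_snapshot(seq, func):
    """Iterate over a private copy so later mutations cannot affect the loop."""
    for i, value in enumerate(tuple(seq)):
        func(value, i)


if __name__ == "__main__":
    data = [1, 2, 3, 4]

    def shrink(value, i):
        # Stand-in for another thread mutating the list mid-iteration.
        if i == 0:
            data.pop()

    try:
        visit_cached_size(data, shrink)
    except IndexError:
        print("cached length went stale")

    data = [1, 2, 3, 4]
    seen = []
    visit_snapshot(data, lambda value, i: (seen.append(value), shrink(value, i)))
    print("snapshot visited", len(seen), "items")
```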

@lysnikolaou
Contributor Author

Thanks for this! I'll have a look.

@pitrou
Member

pitrou commented Sep 9, 2024

Let's just let CI run a last time and merge if no further issue surfaces.

@pitrou pitrou merged commit d28e542 into apache:main Sep 9, 2024
61 of 62 checks passed
@pitrou pitrou removed the awaiting change review Awaiting change review label Sep 9, 2024
@github-actions github-actions bot added the awaiting committer review Awaiting committer review label Sep 9, 2024
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit d28e542.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 39 possible false positives for unstable benchmarks that are known to sometimes produce them.

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels Sep 10, 2024
khwilson pushed a commit to khwilson/arrow that referenced this pull request Sep 14, 2024
…d build (apache#43671)

* GitHub Issue: apache#43536

Lead-authored-by: Lysandros Nikolaou <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>