-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-43536: [Python][CI] Add a Crossbow job with the free-threaded build #43671
Conversation
@github-actions crossbow submit test-ubuntu-22.04-python-313-freethreading |
Revision: 1cb315b Submitted crossbow builds: ursacomputing/crossbow @ actions-5e60b16dcf
|
@github-actions crossbow submit test-ubuntu-22.04-python-313-freethreading |
|
@github-actions crossbow submit test-ubuntu-22.04-python-313-freethreading |
Revision: d16c04c Submitted crossbow builds: ursacomputing/crossbow @ actions-37df7dbfe0
|
Hmm, we got much further this time. Build was successful, but tests failed on import with this error:
That's odd, since the wheel seems to have been compiled with the free-threaded build (notice the "t" in "cp313t"):
|
Isn't that failure expected until #43606 is merged? EDIT: Ah, but the message of the error is indeed pointing to something being wrong: 'Module was compiled with a non-freethreading build of Python" |
No, I don't think so. The only side-effect of not having |
We do pass |
|
Since the virtual environment is created using |
It looks like this comes from Cython. I'll have a deeper look. Maybe CMake does not use the correct Cython? |
Could you create a new issue for this instead of using GH-43536 like #43606? |
@@ -21,6 +21,8 @@ | |||
cmake_minimum_required(VERSION 3.16) | |||
project(pyarrow) | |||
|
|||
set(CMAKE_NO_SYSTEM_FROM_IMPORTED ON) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, this was a hard one.
CMake used to add Python include directories with -isystem
, which led to some Python-internal includes to resolve to normal 3.13 includes (cause -isystem
includes are search after system directories), instead of 3.13-free-threading, which in turn meants that Py_GIL_DISABLED
was not set.
Setting this flag uses -I
instead. I verified manually that the only include directories here are python-specific (Python & NumPy include directories), so this shouldn't change too much. I couldn't find how I can change this in a more granular way. If someone knows that, help would be really appreaciated!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kou Does this change look ok to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use the SYSTEM
target property https://cmake.org/cmake/help/latest/prop_tgt/SYSTEM.html instead of this to limit the impact?
diff --git a/python/CMakeLists.txt b/python/CMakeLists.txt
index 5d5eeaf815..c19820c074 100644
--- a/python/CMakeLists.txt
+++ b/python/CMakeLists.txt
@@ -258,6 +258,8 @@ set(EXECUTABLE_OUTPUT_PATH "${BUILD_OUTPUT_ROOT_DIRECTORY}")
find_package(Python3Alt REQUIRED)
message(STATUS "Found NumPy version: ${Python3_NumPy_VERSION}")
message(STATUS "NumPy include dir: ${NUMPY_INCLUDE_DIRS}")
+# TODO: Describe why we need this
+set_target_properties(Python3::Python PROPERTIES SYSTEM FALSE)
include(UseCython)
message(STATUS "Found Cython version: ${CYTHON_VERSION}")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use the
SYSTEM
target property https://cmake.org/cmake/help/latest/prop_tgt/SYSTEM.html instead of this to limit the impact?
What my change is the exact opposite though. It's signifying to not use SYSTEM
anywhere, which I guess is what FindPython
does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use the
SYSTEM
target property https://cmake.org/cmake/help/latest/prop_tgt/SYSTEM.html instead of this to limit the impact?What my change is the exact opposite though. It's signifying to not use
SYSTEM
anywhere, which I guess is whatFindPython
does.
Unless I am mistaken @kou is suggesting, to set SYSTEM
to FALSE
for Python but leave it as is for the rest of dependencies instead of changing it globally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like @lysnikolaou I tried several variations on this and I could not make it work.
(such as set_target_properties(Python3::Module PROPERTIES SYSTEM FALSE)
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CMake docs are actually quite cryptic about this as several properties may be involved: SYSTEM
, NO_SYSTEM_FROM_IMPORTED
and EXPORT_NO_SYSTEM
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kou We'll have to live with this, unless you want to diagnose the issue yourself. Understanding CMake's intricacies is no fun.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can take a look at this, the system behavior is more compiler defined and exposed by CMake, -I always comes before any -isystem, so if you want to be certain you get it before system headers you need to force it. That is coupled with the other behavior that -isystem silences compiler warnings from system headers. It does seem a shame to change it globally, I see this was already merged but I will see if I can spot where it falls down being more surgical about it. In general -isystem is desirable for imported targets but it gets tricky when you have multiple versions of the same package floating around!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I'll try it later by myself too.
@github-actions crossbow submit test-ubuntu-22.04-python-313-freethreading |
|
3c453c5
to
dc04e0d
Compare
@github-actions crossbow submit -g wheel |
Revision: dc04e0d Submitted crossbow builds: ursacomputing/crossbow @ actions-b07c305e46 |
@github-actions crossbow submit -g python |
Revision: dc04e0d Submitted crossbow builds: ursacomputing/crossbow @ actions-4ad45fc702 |
FWIW, the minimal-build example builds are failing (I had restarted one, and it's still failing), while I haven't seen that in the nightlies, but I also don't directly see how this PR would be causing that (can take a closer look tomorrow) (can see if it's still failing after the rebase) |
Yes, it's unrelated. The problem is that the fork from which the PR is submitted doesn't have any git tags: |
@lysnikolaou I'm surprised, PyArrow is marked nogil-compatible? >>> import sys
>>> import pyarrow as pa
>>> sys._is_gil_enabled()
False |
Yes, this was done in #43606. |
But did we actually audit the code to make sure nothing relies on the GIL? Otherwise we're just asking for trouble. |
Since a big chuck of PyArrow relies on Cython, that means that a lot of the free-threading heavy-lifting will be performed by Cython. Regarding the C/C++ code, I had a look around the codebase, trying to manually search for global state, but I didn't find any instances that might create problems. I also talked a bit with @jorisvandenbossche about that during a call, and adding threading-related tests was something he wanted to do as well. As a note, being agressive with declaring support for free-threading is probably the way to go, since bugs will always be there, and they will be much harder to surface if widely-used packages do not declare support. Remember that the GIL is enabled for the whole process, even if just one package hasn't declared support. |
Besides explicitly global state, there may also be cases of relying on the GIL to ensure that e.g. a Python dict/list isn't mutated, or a borrowed reference remains valid.
That is true. |
If you could point me to such instances, that'd be really helpful, and I can work on fixing all of those.
We got rid of borrowed references APIs in #43540. |
One example is arrow/python/pyarrow/src/arrow/python/iterators.h Lines 69 to 86 in cd50c32
PySequence_Fast_GET_ITEM . Solving this one efficiently will be tricky, as CPython doesn't provide any APIs for fast and safe GIL-less access to sequences.
A similar issue lies in arrow/python/pyarrow/src/arrow/python/iterators.h Lines 54 to 61 in cd50c32
I haven't looked in other places. |
Thank for this! I'll have a look. |
Let's just let CI run a last time and merge if no further issue surfaces. |
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit d28e542. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 39 possible false positives for unstable benchmarks that are known to sometimes produce them. |
…d build (apache#43671) ### Rationale for this change Testing with the free-threaded build is required for adding support for it. (see apache#43536) ### What changes are included in this PR? - Add a Docker build with the CPython free-threaded build from deadsnakes. - Add a Crossbow job to run said Docker build with Python 3.13t ### Are there any user-facing changes? No. * GitHub Issue: apache#43536 Lead-authored-by: Lysandros Nikolaou <[email protected]> Co-authored-by: Antoine Pitrou <[email protected]> Co-authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
Rationale for this change
Testing with the free-threaded build is required for adding support for it. (see #43536)
What changes are included in this PR?
Are there any user-facing changes?
No.