ARROW-17487: [Python][Packaging][CI] Add support for Python 3.11 #14499

Merged
kou merged 19 commits into apache:master on Nov 4, 2022

Conversation

raulcd
Member

@raulcd raulcd commented Oct 25, 2022

This PR adds jobs to build pyarrow wheels for Python 3.11.

@raulcd
Member Author

raulcd commented Oct 25, 2022

@github-actions crossbow submit cp311

@github-actions

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@github-actions

Revision: 8e2613f

Submitted crossbow builds: ursacomputing/crossbow @ actions-fd1ab80f49

Task Status
wheel-macos-big-sur-cp311-arm64 Github Actions
wheel-macos-big-sur-cp311-universal2 Github Actions
wheel-macos-mojave-cp311-amd64 Github Actions
wheel-manylinux2014-cp311-amd64 Github Actions
wheel-manylinux2014-cp311-arm64 Travis CI
wheel-windows-cp311-amd64 Github Actions

@raulcd
Member Author

raulcd commented Oct 25, 2022

@github-actions crossbow submit cp311

@github-actions

Unable to match any tasks for `cp311`
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/3321608980

@raulcd
Member Author

raulcd commented Oct 25, 2022

@github-actions crossbow submit cp311

@github-actions

Revision: 5dc9f31

Submitted crossbow builds: ursacomputing/crossbow @ actions-3375acd0f2

Task Status
wheel-macos-big-sur-cp311-arm64 Github Actions
wheel-macos-big-sur-cp311-universal2 Github Actions
wheel-macos-mojave-cp311-amd64 Github Actions
wheel-manylinux2014-cp311-amd64 Github Actions
wheel-manylinux2014-cp311-arm64 Travis CI
wheel-windows-cp311-amd64 Github Actions

@raulcd
Member Author

raulcd commented Oct 26, 2022

@github-actions crossbow submit cp311

@github-actions

Revision: 936164b

Submitted crossbow builds: ursacomputing/crossbow @ actions-2255056a88

Task Status
wheel-macos-big-sur-cp311-arm64 Github Actions
wheel-macos-big-sur-cp311-universal2 Github Actions
wheel-macos-mojave-cp311-amd64 Github Actions
wheel-manylinux2014-cp311-amd64 Github Actions
wheel-manylinux2014-cp311-arm64 Travis CI
wheel-windows-cp311-amd64 Github Actions

@potiuk
Member

potiuk commented Oct 26, 2022

We are looking forward to this one being merged in Apache Airflow: Pyarrow is one of the blocking factors for making Airflow work on Python 3.11. I am trying to organise a concerted effort with all the OSS projects that we consider friends :) to get Python 3.11 support working, since Python 3.11 mainly brings huge performance improvements that our users are eager to start using!

We track it in apache/airflow#27264

If any help is needed, I am happy to help, including by talking to some of your dependencies (which are likely also Airflow dependencies). Good luck with it :)

@pitrou
Member

pitrou commented Oct 26, 2022

@raulcd Perhaps try applying this patch?

diff --git a/python/pyproject.toml b/python/pyproject.toml
index edbc4ade6..a799dc761 100644
--- a/python/pyproject.toml
+++ b/python/pyproject.toml
@@ -18,7 +18,7 @@
 [build-system]
 requires = [
     "cython >= 0.29.22",
-    "oldest-supported-numpy>=0.14",
+    "oldest-supported-numpy>=2022.8.16",
     "setuptools_scm",
     "setuptools >= 40.1.0",
     "wheel"
diff --git a/python/requirements-build.txt b/python/requirements-build.txt
index 46eb288c5..927c50d73 100644
--- a/python/requirements-build.txt
+++ b/python/requirements-build.txt
@@ -1,4 +1,4 @@
 cython>=0.29
-oldest-supported-numpy>=0.14
+oldest-supported-numpy>=2022.8.16
 setuptools_scm
 setuptools>=38.6.0
diff --git a/python/requirements-wheel-build.txt b/python/requirements-wheel-build.txt
index 856164f09..a48b30d35 100644
--- a/python/requirements-wheel-build.txt
+++ b/python/requirements-wheel-build.txt
@@ -1,5 +1,5 @@
 cython>=0.29.11
-oldest-supported-numpy>=0.14
+oldest-supported-numpy>=2022.8.16
 setuptools_scm
 setuptools>=58
 wheel
diff --git a/python/requirements-wheel-test.txt b/python/requirements-wheel-test.txt
index 1644b2f8b..665b2ce77 100644
--- a/python/requirements-wheel-test.txt
+++ b/python/requirements-wheel-test.txt
@@ -2,26 +2,8 @@ cffi
 cython
 hypothesis
 pickle5; platform_system != "Windows" and python_version < "3.8"
+oldest-supported-numpy>=2022.8.16
 pytest
 pytest-lazy-fixture
 pytz
 tzdata; sys_platform == 'win32'
-
-numpy==1.19.5; platform_system == "Linux"   and platform_machine == "aarch64" and python_version <  "3.7"
-numpy==1.21.3; platform_system == "Linux"   and platform_machine == "aarch64" and python_version >= "3.7"
-numpy==1.19.5; platform_system == "Linux"   and platform_machine != "aarch64" and python_version <  "3.9"
-numpy==1.21.3; platform_system == "Linux"   and platform_machine != "aarch64" and python_version >= "3.9"
-numpy==1.21.3; platform_system == "Darwin"  and platform_machine == "arm64"
-numpy==1.19.5; platform_system == "Darwin"  and platform_machine != "arm64"   and python_version <  "3.9"
-numpy==1.21.3; platform_system == "Darwin"  and platform_machine != "arm64"   and python_version >= "3.9"
-numpy==1.19.5; platform_system == "Windows"                                   and python_version <  "3.9"
-numpy==1.21.3; platform_system == "Windows"                                   and python_version >= "3.9"
-
-pandas<1.1.0;  platform_system == "Linux"   and platform_machine != "aarch64" and python_version <  "3.8"
-pandas;        platform_system == "Linux"   and platform_machine != "aarch64" and python_version >= "3.8"
-pandas;        platform_system == "Linux"   and platform_machine == "aarch64"
-pandas<1.1.0;  platform_system == "Darwin"  and platform_machine != "arm64"   and python_version <  "3.8"
-pandas;        platform_system == "Darwin"  and platform_machine != "arm64"   and python_version >= "3.8"
-pandas;        platform_system == "Darwin"  and platform_machine == "arm64"
-pandas<1.1.0;  platform_system == "Windows"                                   and python_version <  "3.8"
-pandas;        platform_system == "Windows"                                   and python_version >= "3.8"
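For readers who want to see what the removed marker lines encoded, here is a minimal sketch that restates the deleted numpy pins as a plain function. The function name `removed_numpy_pin` and its arguments are hypothetical and exist only for illustration; the version numbers and cutoffs are copied from the lines removed in the patch above. It does not model how `oldest-supported-numpy` picks a version.

```python
import platform
import sys


def removed_numpy_pin(system, machine, py):
    """Return the numpy version the removed marker lines would have selected.

    Hypothetical helper: `system` and `machine` mirror the platform_system and
    platform_machine markers, `py` is a (major, minor) tuple.
    """
    old, new = "1.19.5", "1.21.3"
    if system == "Linux":
        # aarch64 switched to the newer pin from Python 3.7, other machines from 3.9
        cutoff = (3, 7) if machine == "aarch64" else (3, 9)
    elif system == "Darwin":
        if machine == "arm64":
            return new  # arm64 macOS always used the newer pin
        cutoff = (3, 9)
    elif system == "Windows":
        cutoff = (3, 9)
    else:
        return None  # no marker line covered other systems
    return new if py >= cutoff else old


# What the old pins would have selected for the running interpreter.
current_pin = removed_numpy_pin(
    platform.system(), platform.machine(), sys.version_info[:2]
)
print(current_pin)
```

Nine marker lines collapse into three cutoffs here, which is the bookkeeping the patch hands over to `oldest-supported-numpy`.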

@raulcd
Member Author

raulcd commented Oct 26, 2022

> @raulcd Perhaps try applying this patch?

I tested the patch locally and, while the image builds succeed, I got a lot of test failures:

640 failed, 3430 passed, 348 skipped, 15 xfailed, 2 xpassed, 5 warnings, 8 errors in 103.69s (0:01:43)

This is how I reproduce locally:

# generate wheel
PYTHON=3.11 docker-compose build --no-cache --progress plain python-wheel-manylinux-2014
PYTHON=3.11 docker-compose run --rm python-wheel-manylinux-2014
# test wheel
PYTHON=3.11 docker-compose build --no-cache python-wheel-manylinux-test-unittests
PYTHON=3.11 docker-compose run --rm python-wheel-manylinux-test-unittests
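
The same four steps could be scripted so that the Python version is set in one place. This is only a sketch: the `reproduce` function and its `dry_run` flag are hypothetical, and the commands are copied from the shell lines above. It assumes `docker-compose` is on the PATH and that it runs from the root of an Arrow checkout.

```python
import os
import subprocess

# Commands copied from the shell reproduction steps above.
STEPS = [
    ["docker-compose", "build", "--no-cache", "--progress", "plain",
     "python-wheel-manylinux-2014"],
    ["docker-compose", "run", "--rm", "python-wheel-manylinux-2014"],
    ["docker-compose", "build", "--no-cache",
     "python-wheel-manylinux-test-unittests"],
    ["docker-compose", "run", "--rm", "python-wheel-manylinux-test-unittests"],
]


def reproduce(python_version="3.11", dry_run=True):
    """Print each step and, unless dry_run is set, execute it.

    Hypothetical wrapper: PYTHON is passed through the environment, the same
    way the shell lines prefix every command with PYTHON=3.11.
    """
    env = {**os.environ, "PYTHON": python_version}
    for cmd in STEPS:
        print(f"PYTHON={python_version}", " ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing step
            subprocess.run(cmd, env=env, check=True)
    return env


env = reproduce()  # dry run: only prints the commands
```

Calling `reproduce(dry_run=False)` would actually build and test the wheel, which takes a while because of `--no-cache`.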

@raulcd
Member Author

raulcd commented Oct 26, 2022

Wheels are built successfully at the moment. I am going to trigger the job again to validate the macOS ones, but the jobs are failing because 7 tests fail due to the change in the string representation of the FileType enum in Python 3.11, see: python/cpython#94763
Thanks @jorisvandenbossche
We can probably fix those in a follow-up PR

@raulcd
Member Author

raulcd commented Oct 26, 2022

@github-actions crossbow submit cp311

@pitrou
Member

pitrou commented Oct 26, 2022

This patch should help fix the 3.11 enum issue:

diff --git a/python/pyarrow/_fs.pyx b/python/pyarrow/_fs.pyx
index e7b028a07..557c08149 100644
--- a/python/pyarrow/_fs.pyx
+++ b/python/pyarrow/_fs.pyx
@@ -78,6 +78,12 @@ cdef CFileType _unwrap_file_type(FileType ty) except *:
     assert 0
 
 
+def _file_type_to_string(ty):
+    # Python 3.11 changed str(IntEnum) to return the string representation
+    # of the integer value: https://github.com/python/cpython/issues/94763
+    return f"{ty.__class__.__name__}.{ty._name_}"
+
+
 cdef class FileInfo(_Weakrefable):
     """
     FileSystem entry info.
@@ -185,9 +191,10 @@ cdef class FileInfo(_Weakrefable):
             except ValueError:
                 return ''
 
-        s = '<FileInfo for {!r}: type={}'.format(self.path, str(self.type))
+        s = (f'<FileInfo for {self.path!r}: '
+             f'type={_file_type_to_string(self.type)}')
         if self.is_file:
-            s += ', size={}'.format(self.size)
+            s += f', size={self.size}'
         s += '>'
         return s
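
To see the behaviour the patch works around outside of pyarrow, here is a minimal standalone sketch. The `FileType` class below is a stand-in defined only for this example (its member values are illustrative), and `file_type_to_string` mirrors the helper in the patch but uses the public `name` attribute.

```python
import enum


class FileType(enum.IntEnum):
    # Stand-in for the pyarrow enum, defined here only for illustration.
    NotFound = 0
    Unknown = 1
    File = 2
    Directory = 3


def file_type_to_string(ty):
    # Build "ClassName.MemberName" explicitly instead of relying on str(),
    # whose output for IntEnum members differs between Python versions.
    return f"{ty.__class__.__name__}.{ty.name}"


label = file_type_to_string(FileType.File)
print(label)                # identical on every Python version
print(str(FileType.File))   # "FileType.File" before 3.11, "2" from 3.11 on
```

Because the helper never calls `str()` on the member, a repr built from it stays stable across interpreter versions, which is what the failing tests compare against.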
 

@github-actions

Revision: d5adbac

Submitted crossbow builds: ursacomputing/crossbow @ actions-f88a7ca39e

Task Status
wheel-macos-big-sur-cp311-arm64 Github Actions
wheel-macos-big-sur-cp311-universal2 Github Actions
wheel-macos-mojave-cp311-amd64 Github Actions
wheel-manylinux2014-cp311-amd64 Github Actions
wheel-manylinux2014-cp311-arm64 Travis CI
wheel-windows-cp311-amd64 Github Actions

potiuk added a commit to apache/airflow that referenced this pull request Oct 26, 2022
Python 3.11 has been released as scheduled on October 25, 2022 and
this is the first attempt to see how far Airflow (mostly dependencies)
are from being ready to officially support 3.11.

So far we had to exclude the following dependencies:

- [ ] Pyarrow dependency: apache/arrow#14499
- [ ] Google Provider: #27292
  and googleapis/python-bigquery#1386
- [ ] Databricks Provider:
  databricks/databricks-sql-python#59
- [ ] Papermill Provider: nteract/papermill#700
- [ ] Azure Provider: Azure/azure-uamqp-python#334
  and Azure/azure-sdk-for-python#27066
- [ ] Apache Beam Provider: apache/beam#23848
- [ ] Snowflake Provider:
  snowflakedb/snowflake-connector-python#1294
- [ ] JDBC Provider: jpype-project/jpype#1087
- [ ] Hive Provider: cloudera/python-sasl#30

We might decide to release Airflow in 3.11 with those providers
disabled in case they are lagging behind eventually, but for the
moment we want to work with all the projects in concert to be
able to release all providers (Google Provider requires quite
a lot of work and likely Google Team stepping up and community helping
with migration to latest Google cloud libraries)
@pitrou
Member

pitrou commented Oct 26, 2022

The tests are trying to compile grpcio, can we avoid that?
https://github.com/ursacomputing/crossbow/actions/runs/3330588690/jobs/5509267855#step:11:116

Either install the GCS testbench on a different Python (with binary wheels), or don't test GCS at all on 3.11.

@kou
Member

kou commented Nov 4, 2022

@github-actions crossbow submit -g verify-rc-wheels

@github-actions

This comment was marked as outdated.

@kou
Member

kou commented Nov 4, 2022

Ah, it seems that verify-rc-wheels jobs don't support local verification yet.

@kou kou merged commit e21d5aa into apache:master Nov 4, 2022
@Fokko
Contributor

Fokko commented Nov 4, 2022

This is awesome, thanks @raulcd for adding support for 3.11 👏🏻

@ursabot

ursabot commented Nov 4, 2022

Benchmark runs are scheduled for baseline = 8e3a1e1 and contender = e21d5aa. e21d5aa is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.03% ⬆️0.0%] test-mac-arm
[Finished ⬇️0.27% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.21% ⬆️0.18%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] e21d5aab ec2-t3-xlarge-us-east-2
[Failed] e21d5aab test-mac-arm
[Finished] e21d5aab ursa-i9-9960x
[Finished] e21d5aab ursa-thinkcentre-m75q
[Finished] 8e3a1e1b ec2-t3-xlarge-us-east-2
[Finished] 8e3a1e1b test-mac-arm
[Finished] 8e3a1e1b ursa-i9-9960x
[Finished] 8e3a1e1b ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@ursabot

ursabot commented Nov 4, 2022

['Python', 'R'] benchmarks have high level of regressions.
ursa-i9-9960x

@noamcohen97

When will an official release become available?

@raulcd
Member Author

raulcd commented Nov 7, 2022

> When will an official release become available?

Hi @noamcohen97, wheels are part of the official release of Apache Arrow. I have sent an email to the developers mailing list to ask the rest of the community whether a new minor release of Apache Arrow is required. You can join the developers mailing list or follow the thread here:
https://lists.apache.org/thread/xrlztoz8no289rt6kr6qz52b8yjr3mob

@joekohlsdorf

Are the wheels which were built during the test runs here downloadable?
They don't have to be on PyPI, we can install them manually from the filesystem to get tests of projects which depend on arrow to run in the meantime.

Unfortunately this project is almost impossible to build on your own, if it was easier we wouldn't be sitting here waiting on wheels.

@robinlandstrom

@joekohlsdorf you can find the wheels in the @github-actions links above. For Linux on amd64, the latest build is here:

https://github.com/ursacomputing/crossbow/releases/download/actions-e03f7d03d3-github-wheel-manylinux2014-cp311-amd64/pyarrow-11.0.0.dev52-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

I have not tested it yet, but it is at least possible to install it in the latest python:3.11 Docker container 😃

# pip install pyarrow-11.0.0.dev52-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Processing /pyarrow-11.0.0.dev52-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Collecting numpy>=1.16.6
  Downloading numpy-1.23.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 17.1/17.1 MB 73.6 MB/s eta 0:00:00
Installing collected packages: numpy, pyarrow
Successfully installed numpy-1.23.4 pyarrow-11.0.0.dev52
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

[notice] A new release of pip available: 22.3 -> 22.3.1
[notice] To update, run: pip install --upgrade pip
root@98b070b9e41f:/# python
Python 3.11.0 (main, Oct 25 2022, 05:00:36) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyarrow
>>>
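
Before installing a downloaded wheel it can help to confirm that its tags match the running interpreter. Below is a minimal sketch using only the standard library. The helper names `wheel_tags`, `interpreter_tag` and `installed_version` are hypothetical, the filename is the one from the transcript above, and `interpreter_tag` assumes CPython.

```python
import importlib
import importlib.util
import sys


def wheel_tags(filename):
    # Wheel filenames end in "-{python tag}-{abi tag}-{platform tag}.whl".
    stem = filename[: -len(".whl")]
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def interpreter_tag():
    # Assumes CPython, where 3.11 maps to "cp311".
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


def installed_version(module="pyarrow"):
    # Return None instead of raising when the module is missing or unimportable.
    if importlib.util.find_spec(module) is None:
        return None
    try:
        return getattr(importlib.import_module(module), "__version__", None)
    except ImportError:
        return None


WHEEL = ("pyarrow-11.0.0.dev52-cp311-cp311-"
         "manylinux_2_17_x86_64.manylinux2014_x86_64.whl")
python_tag, abi_tag, platform_tag = wheel_tags(WHEEL)
print("wheel targets:", python_tag, "| running:", interpreter_tag())
print("installed pyarrow:", installed_version())
```

If the two tags printed on the first line differ, pip would refuse the wheel as unsupported on that interpreter.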

@raulcd
Member Author

raulcd commented Nov 7, 2022

> Are the wheels which were built during the test runs here downloadable? They don't have to be on PyPI, we can install them manually from the filesystem to get tests of projects which depend on arrow to run in the meantime.

We do publish nightly development versions of pyarrow as seen on the Python development docs:
https://arrow.apache.org/docs/developers/python.html#installing-nightly-packages
As the documentation says, these are not official releases; they are development builds that can be used to test the CI of downstream libraries, and they will contain any other changes under development.

vibhatha pushed a commit to vibhatha/arrow that referenced this pull request Nov 8, 2022
…che#14499)

This PR adds jobs to build pyarrow wheels for Python 3.11.

Authored-by: Raúl Cumplido <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
kou pushed a commit that referenced this pull request Nov 15, 2022
)

This PR adds jobs to build pyarrow wheels for Python 3.11.

Authored-by: Raúl Cumplido <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
potiuk added a commit to apache/airflow that referenced this pull request Nov 24, 2022
potiuk added a commit to potiuk/airflow that referenced this pull request Jan 19, 2023
victor-paltz pushed a commit to criteo/autofaiss that referenced this pull request Jun 5, 2023