GH-36730: [Python] Add support for Cython 3.0.0 #36745
This may fix the compilation error but there are other issues during tests (see macOS CI?). |
I've fixed CI failures. |
@kou there were also some runtime errors in the minimal build that are caused by cython 3 (but it's strange that this is only failing in a single build). Based on your diff, I don't think that will already be fixed here, but let me run them to be sure. |
@github-actions crossbow submit example-python-minimal-build-fedora-conda |
Revision: dba6bb4 Submitted crossbow builds: ursacomputing/crossbow @ actions-f3bbf75ac8
Can this PR also undo the pinning on various Python builds? |
@github-actions crossbow submit example-python-minimal-build-fedora-conda |
Revision: 81493b1 Submitted crossbow builds: ursacomputing/crossbow @ actions-492c6d5c7d
I've fixed the generator related problem. |
Hmm. Can we detect Cython version in |
I want to write something like the following:
diff --git a/python/pyarrow/scalar.pxi b/python/pyarrow/scalar.pxi
index f438c8847..61de38203 100644
--- a/python/pyarrow/scalar.pxi
+++ b/python/pyarrow/scalar.pxi
@@ -793,7 +793,10 @@ cdef class MapScalar(ListScalar):
         """
         arr = self.values
         if array is None:
-            raise StopIteration
+            if Cython.__version__ < '3.0.0':
+                raise StopIteration
+            else:
+                return
         for k, v in zip(arr.field('key'), arr.field('value')):
             yield (k.as_py(), v.as_py())
There should not be any need to raise |
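For context, a minimal pure-Python sketch of why a bare return avoids the need for a version check: since PEP 479 (the default behavior from Python 3.7), a StopIteration raised inside a generator body is converted to RuntimeError, while return ends the generator cleanly on every version. That Cython 3 applies the same rule to generators in .pxi files is an assumption here, and the function names below are hypothetical, not taken from pyarrow.

```python
def items_with_raise(pairs):
    """Generator that tries to stop by raising StopIteration (breaks under PEP 479)."""
    if pairs is None:
        raise StopIteration
    for k, v in pairs:
        yield (k, v)


def items_with_return(pairs):
    """Generator that stops with a bare return (same behavior on all versions)."""
    if pairs is None:
        return
    for k, v in pairs:
        yield (k, v)


def outcome(gen_func, pairs):
    """Consume the generator; report the items, or the error type name on failure."""
    try:
        return list(gen_func(pairs))
    except RuntimeError as exc:
        return type(exc).__name__
```

With this sketch, outcome(items_with_return, None) yields an empty list, while outcome(items_with_raise, None) reports a RuntimeError on Python 3.7 and later.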
python/pyarrow/ipc.pxi
Outdated
     def __iter__(self):
         while True:
-            yield self.read_next_message()
+            try:
+                yield self.read_next_message()
+            except StopIteration:
+                # For Cython >= 3.0.0
+                return
I don't know whether this is the code that is hard to write in a way that works for both Cython 0.29 and 3, but here is a potential alternative way of writing it:
def __iter__(self):
    return self
def __next__(self):
    return self.read_next_message()
Thanks! I'll try it.
         arr = self.values
-        if array is None:
+        if arr is None:
             raise StopIteration
         for k, v in zip(arr.field('key'), arr.field('value')):
             yield (k.as_py(), v.as_py())
Do you actually test these changes before pushing them?
You can't use yield in a __next__ function.
Either you write __iter__ as a generator (using yield), or you write a pair of __iter__ and __next__ functions (without yield). You can't mix the two styles.
And FWIW, for this case, I would leave the __iter__ as is (in the other cases such as RecordBatchReader, we already have a public "next" method that can just be called from __next__)
Sorry. I didn't test locally before pushing the commit because I didn't have enough time yesterday.
I reverted the __iter__ + __next__ change.
Flight changes LGTM, thanks!
@jorisvandenbossche Do you have any idea why the following failure happened? https://github.com/apache/arrow/actions/runs/5615711782/job/15216554355?pr=36745#step:6:6074
No immediate idea. We use the same pattern in many other places. Can you try this patch:
--- a/python/pyarrow/_dataset_parquet.pyx
+++ b/python/pyarrow/_dataset_parquet.pyx
@@ -741,7 +741,7 @@ cdef class ParquetFragmentScanOptions(FragmentScanOptions):
             thrift_string_size_limit=self.thrift_string_size_limit,
             thrift_container_size_limit=self.thrift_container_size_limit,
         )
-        return type(self)._reconstruct, (kwargs,)
+        return ParquetFragmentScanOptions._reconstruct, (kwargs,)
(In theory it shouldn't matter for Python, but maybe Cython 3 has a bug that treats this differently.)
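As a rough illustration of the pattern this patch touches, here is a pure-Python sketch of a __reduce__ that returns a static reconstruct function plus its arguments. ScanOptions, size_limit, and rebuild are hypothetical names, and the sketch says nothing about whether Cython 3 really treats type(self) differently; in plain Python both spellings resolve to the same function for a non-subclassed instance.

```python
class ScanOptions:
    """Hypothetical pure-Python class mirroring the __reduce__ pattern in the patch."""

    def __init__(self, size_limit=None):
        self.size_limit = size_limit

    @staticmethod
    def _reconstruct(kwargs):
        # Rebuild an instance from the keyword arguments captured in __reduce__.
        return ScanOptions(**kwargs)

    def __reduce__(self):
        kwargs = dict(size_limit=self.size_limit)
        # Name the class explicitly instead of going through type(self).
        return ScanOptions._reconstruct, (kwargs,)


def rebuild(obj):
    """Apply the (callable, args) pair from __reduce__, as pickle does on load."""
    func, args = obj.__reduce__()
    return func(*args)
```

Calling rebuild(ScanOptions(size_limit=10)) returns a new ScanOptions with the same size_limit, which is the round trip the failing pickle test exercises.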
Thanks! I've pushed your patch. |
If there are problems with compatibility between 0.29 and 3.0 (aside from the few intentional breaking changes), the Cython devs are usually quick to respond if you open an issue |
The failing pickle test is only when using the
Well, first, this PR should be rebased, because there are conflicts. |
I did rebase/squash to test things out here: #37097 I wanted to experiment without polluting this PR. I can merge any fixes I find back into this PR. |
Can we use #37097 instead of this? |
Sure! |
I'm closing this in favor of #37097. |
Rationale for this change
Cython 3.0.0 is the latest release. PyArrow should work with Cython 3.0.0.
What changes are included in this PR?
vector[XXX]&&
postincrement
C4551 warning (function call missing argument list) with MSVC
const to CLocation's static methods.
StopIteration to stop generator
Are these changes tested?
Yes.
Are there any user-facing changes?
Yes.