pandas/io/feather_format.py should call use_threads instead of nthreads to prevent breakage in pyarrow 0.11.0 #23053
Comments
AFAICT, pyarrow doesn't have nightly builds we can test against, so I'm not sure what the best way to test this is. Will probably just have to be manual until 0.11 is released. |
|
From https://anaconda.org/twosigma/pyarrow/files?sort=time&sort_order=desc,
it looks like nightlies stopped being uploaded a few months ago.
Is @cpcloud the right person to ping here?
…On Tue, Oct 9, 2018 at 9:11 AM Uwe L. Korn ***@***.***> wrote:
pyarrow has nightly conda packages on the twosigma channel.
pyarrow==0.11 is released so you can use it to test against it. We're
missing Python 3.7 wheels at the moment but at least for Linux, these will
appear in the next days.
|
We are seeing the same issue as tests were broken since pyarrow 0.11.0. Using |
@johnolos thanks. AFAIK, no one has submitted a PR updating pandas. If you could do so we'll include it in the next pandas release. |
I provided a PR addressing the issue; however, it is not clear to me whether I should also change the CI deps to require pyarrow 0.11.0.
Now this is a hard error when used against pyarrow 0.11.0.
A work-around that might be useful to some people:
import feather
frame = feather.read_dataframe('filename.feather')
The nthreads argument is no longer supported since pyarrow 0.11.0 and was replaced with use_threads. Hence we deprecate the argument now as well so we can remove it in the future. This commit also:
- removes feather-format as a dependency and replaces it with usage of pyarrow directly.
- sets CI dependencies to respect the changes above.
We test backwards compatibility with pyarrow 0.9.0 as conda does not provide a pyarrow 0.10.0 and the conda-forge version has compatibility issues with the rest of the installed packages. Resolves #23053. Resolves #21639.
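The deprecation described in the commit message above could look roughly like the following. This is a minimal sketch, not the code from the actual commit: the helper name resolve_use_threads, the choice of FutureWarning, and the rule that more than one thread means "use threads" are all assumptions made for illustration.

```python
import warnings


def resolve_use_threads(use_threads=True, nthreads=None):
    """Return the effective use_threads flag (hypothetical helper).

    If the deprecated ``nthreads`` keyword is supplied, warn and translate it.
    """
    if nthreads is not None:
        # Tell callers that still pass the old keyword to migrate.
        warnings.warn(
            "the 'nthreads' keyword is deprecated, use 'use_threads' instead",
            FutureWarning,
            stacklevel=2,
        )
        # Assumption: asking for more than one thread means "use threads".
        return nthreads > 1
    return bool(use_threads)
```

A reader function would call this helper first and then pass only use_threads on to pyarrow, so the old keyword keeps working for one deprecation cycle.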
#23112 So the fix will show up in the next release of pandas/pyarrow? |
Next version of pandas. Aiming to have it out by the end of the year.
…On Tue, Dec 4, 2018 at 12:21 PM Richard Anderson ***@***.***> wrote:
#23112 So the fix will show up in the next release of pandas/pyarrow?
|
Code Sample
Problem description
Pandas introduced nthreads for reading feather files in issue 16359.
With PyArrow 0.10.0 a deprecation warning is shown from this source: "nthreads argument is deprecated, pass use_threads instead".
With PyArrow 0.11.0, Python errors with: TypeError: read_feather() got an unexpected keyword argument 'nthreads'.
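The TypeError above is what Python raises whenever a keyword is passed that the callee's signature no longer lists. One way to detect this ahead of time is to inspect the signature instead of calling blindly. The sketch below uses two stand-in functions (old_reader and new_reader are hypothetical and do not come from pyarrow) to show the idea with the standard library's inspect module.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if ``func`` can be called with keyword argument ``name``."""
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    for param in inspect.signature(func).parameters.values():
        # A **kwargs parameter accepts any keyword.
        if param.kind is inspect.Parameter.VAR_KEYWORD:
            return True
        if param.name == name and param.kind in keyword_kinds:
            return True
    return False


# Hypothetical stand-ins for a reader before and after the keyword rename.
def old_reader(path, columns=None, nthreads=1):
    return ("old", path, nthreads)


def new_reader(path, columns=None, use_threads=True):
    return ("new", path, use_threads)
```

With these stand-ins, accepts_keyword(new_reader, "nthreads") is False, which matches the situation reported here: the caller still passes nthreads while the callee only knows use_threads.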
I've searched with 'pyarrow' and 'nthreads' keywords and didn't see this issue posted.
Specifically, feather_format.py line 112 should be changed to
return feather.read_dataframe(path, use_threads=True)
or the method signature should be changed to allow overriding use_threads:
return feather.read_dataframe(path, use_threads=use_threads)
I will submit a PR if the only barrier to fix is code effort.
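If older PyArrow releases have to keep working alongside 0.11.0, the keyword can be chosen from the installed version. The following is a sketch under stated assumptions, not the fix that was merged: feather_thread_kwargs and its nthreads_when_enabled parameter are hypothetical names, and the version string is assumed to start with numeric major and minor parts such as "0.11.0".

```python
def feather_thread_kwargs(pyarrow_version, use_threads=True, nthreads_when_enabled=2):
    """Pick the threading keyword for a given pyarrow version string.

    Hypothetical helper: returns a dict to splat into the reader call.
    """
    # Compare only the numeric major/minor parts, e.g. "0.11.0" -> (0, 11).
    major, minor = (int(part) for part in pyarrow_version.split(".")[:2])
    if (major, minor) >= (0, 11):
        # 0.11.0 and later only understand use_threads.
        return {"use_threads": bool(use_threads)}
    # Assumption: older releases take an integer thread count of at least 1.
    return {"nthreads": nthreads_when_enabled if use_threads else 1}
```

The result would be passed through as keyword arguments, for example reader(path, **feather_thread_kwargs(installed_version)), where reader and installed_version stand for whatever read function and version string the caller has at hand.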
Expected Output
I expect no error output upon running pandas.read_feather() with PyArrow 0.11.0
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 79 Stepping 1, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.23.4
pytest: None
pip: 18.1
setuptools: 40.3.0
Cython: None
numpy: 1.15.1
scipy: 1.1.0
pyarrow: 0.10.0
xarray: None
IPython: 6.5.0
sphinx: None
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: 0.4.0
matplotlib: 2.2.2
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: 1.2.11
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None