[REVIEW] Upgrade arrow & pyarrow to 6.0.1 #9686

Merged: 46 commits, Feb 9, 2022
Showing changes from 43 of 46 commits

Commits
846dee9
update arrow to 6.0.0
galipremsagar Nov 15, 2021
04abaa7
temp ci commands'
galipremsagar Nov 15, 2021
f1af72a
Merge branch 'rapidsai:branch-22.02' into 9645
galipremsagar Nov 22, 2021
309a888
Merge branch 'rapidsai:branch-22.02' into 9645
galipremsagar Nov 23, 2021
bab085e
Merge branch 'rapidsai:branch-22.02' into 9645
galipremsagar Dec 2, 2021
38c7980
Merge remote-tracking branch 'upstream/branch-22.02' into 9645
galipremsagar Dec 6, 2021
02a1aea
bump arrow version to 6.0.1
galipremsagar Dec 6, 2021
17fdb88
Merge branch '9645' of https://github.com/galipremsagar/cudf into 9645
galipremsagar Dec 6, 2021
26e3e6f
merge
galipremsagar Jan 5, 2022
46399bc
Merge branch 'branch-22.02' into 9645
galipremsagar Jan 11, 2022
553e246
Update ci/cpu/build.sh
galipremsagar Jan 11, 2022
1dab73f
Update ci/cpu/build.sh
galipremsagar Jan 11, 2022
89967c8
Update build.sh
galipremsagar Jan 11, 2022
56eb57b
Update build.sh
galipremsagar Jan 11, 2022
e3ca27a
Update build.sh
galipremsagar Jan 12, 2022
f75bbc6
Update build.sh
galipremsagar Jan 12, 2022
8cd944a
Update build.sh
galipremsagar Jan 12, 2022
71f6278
Update build.sh
galipremsagar Jan 12, 2022
3bcba33
Update build.sh
galipremsagar Jan 12, 2022
1115931
Update java.sh
galipremsagar Jan 12, 2022
e64edd9
Update build.sh
galipremsagar Jan 12, 2022
5ac103d
Update build.sh
galipremsagar Jan 12, 2022
d6b9133
Update build.sh
galipremsagar Jan 12, 2022
b9a0db0
Update build.sh
galipremsagar Jan 12, 2022
b601e60
Update java.sh
galipremsagar Jan 12, 2022
77b6884
Update java.sh
galipremsagar Jan 13, 2022
5c0ad90
Merge
galipremsagar Jan 15, 2022
ac9b078
Merge branch 'rapidsai:branch-22.04' into 9645
galipremsagar Jan 21, 2022
9542ef9
Merge branch 'rapidsai:branch-22.04' into 9645
galipremsagar Jan 24, 2022
c76a755
Merge remote-tracking branch 'upstream/branch-22.04' into 9645
galipremsagar Jan 24, 2022
c9eb9e4
fix docs
galipremsagar Jan 24, 2022
c5d2b4a
merge
galipremsagar Jan 24, 2022
b1547d3
Merge remote-tracking branch 'upstream/branch-22.04' into 9645
galipremsagar Feb 5, 2022
26c3ec6
remove todo
galipremsagar Feb 5, 2022
fef1514
Remove temp commits - 1
galipremsagar Feb 8, 2022
9a80391
Update build.sh
galipremsagar Feb 8, 2022
579fd89
Update java.sh
galipremsagar Feb 8, 2022
7bb21f6
Update cudf_dev_cuda11.5.yml
galipremsagar Feb 8, 2022
2a6f249
Update meta.yaml
galipremsagar Feb 8, 2022
f8eae6c
Update meta.yaml
galipremsagar Feb 8, 2022
c2c7962
Update get_arrow.cmake
galipremsagar Feb 8, 2022
742934b
Update frame.py
galipremsagar Feb 8, 2022
6ffc181
Update build.sh
galipremsagar Feb 8, 2022
00b7e14
Merge branch 'rapidsai:branch-22.04' into 9645
galipremsagar Feb 8, 2022
da5170f
Merge branch 'rapidsai:branch-22.04' into 9645
galipremsagar Feb 8, 2022
31d3e6e
Merge branch 'rapidsai:branch-22.04' into 9645
galipremsagar Feb 9, 2022
6 changes: 3 additions & 3 deletions conda/environments/cudf_dev_cuda11.5.yml
@@ -1,4 +1,4 @@
-# Copyright (c) 2021, NVIDIA CORPORATION.
+# Copyright (c) 2021-2022, NVIDIA CORPORATION.

 name: cudf_dev
 channels:
@@ -17,7 +17,7 @@ dependencies:
   - numba>=0.54
   - numpy
   - pandas>=1.0,<1.4.0dev0
-  - pyarrow=5.0.0=*cuda
+  - pyarrow=6.0.1=*cuda
   - fastavro>=0.22.9
   - python-snappy>=0.6.0
   - notebook>=0.5.0
@@ -45,7 +45,7 @@ dependencies:
   - dask>=2021.11.1,<=2022.01.0
   - distributed>=2021.11.1,<=2022.01.0
   - streamz
-  - arrow-cpp=5.0.0
+  - arrow-cpp=6.0.1
   - dlpack>=0.5,<0.6.0a0
   - arrow-cpp-proc * cuda
   - double-conversion
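The environment file pins packages as `name=version=build` (for example `pyarrow=6.0.1=*cuda`), while the conda recipes in this PR write the same constraint with spaces (`pyarrow 6.0.1 *cuda`). The sketch below shows how the two spellings line up. `split_pin`, `build_matches`, and the build strings in the usage note are hypothetical names made up for illustration; they are not part of conda or cudf, and the helper deliberately handles only the simple exact pins seen in this diff, not conda's full match-spec grammar.

```python
import fnmatch
from typing import NamedTuple, Optional


class Pin(NamedTuple):
    name: str
    version: Optional[str]
    build: Optional[str]


def split_pin(spec: str) -> Pin:
    """Hypothetical helper: split a simple pin into (name, version, build).

    Only exact pins such as 'pyarrow=6.0.1=*cuda' or 'pyarrow 6.0.1 *cuda'
    are supported; range specs like 'numba>=0.54' are rejected.
    """
    spec = spec.strip()
    if any(ch in spec for ch in "<>!,"):
        raise ValueError(f"range specs are out of scope for this sketch: {spec!r}")
    # Recipe style separates fields with spaces, environment style with '='.
    parts = spec.split() if " " in spec else spec.split("=")
    if not 1 <= len(parts) <= 3 or not all(parts):
        raise ValueError(f"unsupported pin: {spec!r}")
    parts += [None] * (3 - len(parts))  # pad missing version/build fields
    return Pin(*parts)


def build_matches(pin: Pin, build_string: str) -> bool:
    """Treat the build field as a glob; a missing build field matches anything."""
    return pin.build is None or fnmatch.fnmatchcase(build_string, pin.build)
```

With these assumptions, `split_pin("pyarrow=6.0.1=*cuda")` and `split_pin("pyarrow 6.0.1 *cuda")` return the same `Pin`, and `build_matches` accepts a made-up build string such as `"example_build_cuda"` while rejecting `"example_build_cpu"`.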
4 changes: 2 additions & 2 deletions conda/recipes/cudf/meta.yaml
@@ -1,4 +1,4 @@
-# Copyright (c) 2018-2021, NVIDIA CORPORATION.
+# Copyright (c) 2018-2022, NVIDIA CORPORATION.

 {% set version = environ.get('GIT_DESCRIBE_TAG', '0.0.0.dev').lstrip('v') + environ.get('VERSION_SUFFIX', '') %}
 {% set minor_version = version.split('.')[0] + '.' + version.split('.')[1] %}
@@ -31,7 +31,7 @@ requirements:
     - setuptools
     - numba >=0.54
     - dlpack>=0.5,<0.6.0a0
-    - pyarrow 5.0.0 *cuda
+    - pyarrow 6.0.1 *cuda
     - libcudf {{ version }}
     - rmm {{ minor_version }}
     - cudatoolkit {{ cuda_version }}
4 changes: 2 additions & 2 deletions conda/recipes/libcudf/meta.yaml
@@ -1,4 +1,4 @@
-# Copyright (c) 2018-2021, NVIDIA CORPORATION.
+# Copyright (c) 2018-2022, NVIDIA CORPORATION.

 {% set version = environ.get('GIT_DESCRIBE_TAG', '0.0.0.dev').lstrip('v') + environ.get('VERSION_SUFFIX', '') %}
 {% set minor_version = version.split('.')[0] + '.' + version.split('.')[1] %}
@@ -40,7 +40,7 @@ requirements:
   host:
     - librmm {{ minor_version }}.*
     - cudatoolkit {{ cuda_version }}.*
-    - arrow-cpp 5.0.0 *cuda
+    - arrow-cpp 6.0.1 *cuda
     - arrow-cpp-proc * cuda
     - dlpack>=0.5,<0.6.0a0
   run:
4 changes: 2 additions & 2 deletions cpp/cmake/thirdparty/get_arrow.cmake
@@ -1,5 +1,5 @@
 # =============================================================================
-# Copyright (c) 2020-2021, NVIDIA CORPORATION.
+# Copyright (c) 2020-2022, NVIDIA CORPORATION.
 #
 # Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
 # in compliance with the License. You may obtain a copy of the License at
@@ -308,7 +308,7 @@ function(find_and_configure_arrow VERSION BUILD_STATIC ENABLE_S3 ENABLE_ORC ENAB

 endfunction()

-set(CUDF_VERSION_Arrow 5.0.0)
+set(CUDF_VERSION_Arrow 6.0.1)

 find_and_configure_arrow(
   ${CUDF_VERSION_Arrow} ${CUDF_USE_ARROW_STATIC} ${CUDF_ENABLE_ARROW_S3} ${CUDF_ENABLE_ARROW_ORC}
18 changes: 5 additions & 13 deletions python/cudf/cudf/core/column/column.py
@@ -2093,24 +2093,16 @@ def as_column(
                     dtype = "bool"
                 np_type = np.dtype(dtype).type
                 pa_type = np_to_pa_dtype(np.dtype(dtype))
-                # TODO: A warning is emitted from pyarrow 5.0.0's function
-                # pyarrow.lib._sequence_to_array:
-                # "DeprecationWarning: an integer is required (got type float).
-                # Implicit conversion to integers using __int__ is deprecated,
-                # and may be removed in a future version of Python."
-                # This warning does not appear in pyarrow 6.0.1 and will be
-                # resolved by https://github.com/rapidsai/cudf/pull/9686/.
-                with warnings.catch_warnings():
-                    warnings.simplefilter("ignore", DeprecationWarning)
-                    pa_array = pa.array(
+                data = as_column(
+                    pa.array(
                         arbitrary,
                         type=pa_type,
                         from_pandas=True
                         if nan_as_null is None
                         else nan_as_null,
-                    )
-                data = as_column(
-                    pa_array, dtype=dtype, nan_as_null=nan_as_null,
+                    ),
+                    dtype=dtype,
+                    nan_as_null=nan_as_null,
                 )
             except (pa.ArrowInvalid, pa.ArrowTypeError, TypeError):
                 if is_categorical_dtype(dtype):
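The block removed from `as_column` used `warnings.catch_warnings()` with `warnings.simplefilter("ignore", DeprecationWarning)` to silence a warning raised inside `pa.array`. As a rough illustration of that pattern, the sketch below uses only the standard library. `noisy_convert` is a hypothetical stand-in for the library call and is not pyarrow code; `convert_quietly` and `count_deprecations` are likewise illustrative names.

```python
import warnings


def noisy_convert(values):
    """Hypothetical stand-in for a library call that emits a DeprecationWarning."""
    warnings.warn(
        "implicit conversion is deprecated", DeprecationWarning, stacklevel=2
    )
    return [int(v) for v in values]


def convert_quietly(values):
    """Same suppression pattern as the removed block: the filter is scoped
    to the `with` body and the previous filters are restored on exit."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        return noisy_convert(values)


def count_deprecations(func, values):
    """Run func(values) and report how many DeprecationWarnings escaped it."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = func(values)
    escaped = sum(issubclass(w.category, DeprecationWarning) for w in caught)
    return result, escaped
```

Under these assumptions, `count_deprecations(noisy_convert, [1.0, 2.5])` reports one escaped warning and `count_deprecations(convert_quietly, [1.0, 2.5])` reports none. Once the underlying call stops warning, as the removed TODO says is the case for pyarrow 6.0.1, the wrapper serves no purpose, so the call can be inlined as this hunk does.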
7 changes: 7 additions & 0 deletions python/cudf/cudf/core/dataframe.py
@@ -4525,10 +4525,17 @@ def to_arrow(self, preserve_index=True):
         a: int64
         b: int64
         index: int64
+        ----
+        a: [[1,2,3]]
+        b: [[4,5,6]]
+        index: [[1,2,3]]
         >>> df.to_arrow(preserve_index=False)
         pyarrow.Table
         a: int64
         b: int64
+        ----
+        a: [[1,2,3]]
+        b: [[4,5,6]]
         """

         data = self.copy(deep=False)
6 changes: 5 additions & 1 deletion python/cudf/cudf/core/frame.py
@@ -1,4 +1,4 @@
-# Copyright (c) 2020-2021, NVIDIA CORPORATION.
+# Copyright (c) 2020-2022, NVIDIA CORPORATION.

 from __future__ import annotations

@@ -1916,6 +1916,10 @@ def to_arrow(self):
         a: int64
         b: int64
         index: int64
+        ----
+        a: [[1,2,3]]
+        b: [[4,5,6]]
+        index: [[1,2,3]]
         """
         return pa.Table.from_pydict(
             {name: col.to_arrow() for name, col in self._data.items()}