[BUG] Incorrect dtype when iterating over dtypes in cudf.pandas #17165

wphicks opened this issue Oct 24, 2024 · 2 comments · Fixed by #17251

bug Something isn't working


wphicks commented Oct 24, 2024

Describe the bug
When using cudf.pandas and iterating over the dtypes of a dataframe, categorical dtype objects are reported as cudf.CategoricalDtype and not pandas.CategoricalDtype, causing isinstance checks to fail unexpectedly.

Steps/Code to reproduce bug
Run the following script with python -m cudf.pandas and compare its output with that of plain python (without cudf.pandas):

import pandas as pd

df = pd.DataFrame({"A": ["a", "b", "c", "a"]})
df["A"] = df["A"].astype('category')

print("In for loop: ", [isinstance(t, pd.CategoricalDtype) for t in df.dtypes][0])
print("With iloc: ", isinstance(df.dtypes.iloc[0], pd.CategoricalDtype))
$ python
In for loop:  True
With iloc:  True

$ python -m cudf.pandas
In for loop:  False
With iloc:  True

Expected behavior
The isinstance checks should give the same output with and without cudf.pandas, regardless of whether we iterate over the dtypes or select them by index.
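The expectation above could be turned into a small regression check. The following is only a sketch: the helper name categorical_checks is hypothetical (it is not part of pandas or cudf), and the items() path is an extra access route that the original report did not test. Under plain pandas all three routes should agree. Running the same script with python -m cudf.pandas would show whether the iteration path still diverges.

```python
import pandas as pd


def categorical_checks(df):
    """Hypothetical helper: isinstance results for each way of reaching the dtype objects."""
    dtypes = df.dtypes
    return {
        # Iterating directly over the Series of dtypes (the failing path in the report)
        "iteration": [isinstance(t, pd.CategoricalDtype) for t in dtypes],
        # Selecting each dtype by position (the path that already works in the report)
        "iloc": [
            isinstance(dtypes.iloc[i], pd.CategoricalDtype)
            for i in range(len(dtypes))
        ],
        # Iterating over (column, dtype) pairs; not covered by the original report
        "items": [isinstance(t, pd.CategoricalDtype) for _, t in dtypes.items()],
    }


df = pd.DataFrame({"A": ["a", "b", "c", "a"]})
df["A"] = df["A"].astype("category")

results = categorical_checks(df)
# All access routes should report the same thing for every column.
consistent = results["iteration"] == results["iloc"] == results["items"]
print(results)
print("Consistent across access paths:", consistent)
```

If consistent is False under cudf.pandas, the per-path lists in results show which access route returns the unexpected dtype object.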

Environment details (please complete the following information):

  • Environment location: GCP g2-standard-8 instance
  • Linux Distro/Architecture: Debian 11 Bullseye amd64
  • GPU Model/Driver: L4 / 550.90.07
  • CUDA: 12.4
  • Method of cuDF & cuML install: conda (RAPIDS 24.10)

conda list Output:

# packages in environment at /opt/conda/envs/rapids-24.10:
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
aiohappyeyeballs          2.4.3              pyhd8ed1ab_0    conda-forge
aiohttp                   3.10.10         py312h178313f_0    conda-forge
aiosignal                 1.3.1              pyhd8ed1ab_0    conda-forge
anyio                     4.6.2.post1        pyhd8ed1ab_0    conda-forge
aom                       3.9.1                hac33072_0    conda-forge
argon2-cffi               23.1.0             pyhd8ed1ab_0    conda-forge
argon2-cffi-bindings      21.2.0          py312h66e93f0_5    conda-forge
arrow                     1.3.0              pyhd8ed1ab_0    conda-forge
asttokens                 2.4.1              pyhd8ed1ab_0    conda-forge
async-lru                 2.0.4              pyhd8ed1ab_0    conda-forge
attrs                     24.2.0             pyh71513ae_0    conda-forge
aws-c-auth                0.7.31               hd5d0ea3_3    conda-forge
aws-c-cal                 0.7.4                hae4d56a_2    conda-forge
aws-c-common              0.9.29               hb9d3cd8_0    conda-forge
aws-c-compression         0.2.19               h2bff981_2    conda-forge
aws-c-event-stream        0.4.3                h6c1f5b1_5    conda-forge
aws-c-http                0.8.10               hf2c527e_3    conda-forge
aws-c-io                  0.14.20              hc9e6898_0    conda-forge
aws-c-mqtt                0.10.7               hfbb250a_3    conda-forge
aws-c-s3                  0.6.7                h7f2cdf9_1    conda-forge
aws-c-sdkutils            0.1.19               h2bff981_4    conda-forge
aws-checksums             0.1.20               h2bff981_1    conda-forge
aws-crt-cpp               0.28.5               h5cd2d59_1    conda-forge
aws-sdk-cpp               1.11.407             h9eeb5ce_2    conda-forge
azure-core-cpp            1.14.0               h5cfcd09_0    conda-forge
azure-identity-cpp        1.10.0               h113e628_0    conda-forge
azure-storage-blobs-cpp   12.13.0              h3cf044e_1    conda-forge
azure-storage-common-cpp  12.8.0               h736e048_1    conda-forge
azure-storage-files-datalake-cpp 12.12.0              ha633028_1    conda-forge
babel                     2.14.0             pyhd8ed1ab_0    conda-forge
beautifulsoup4            4.12.3             pyha770c72_0    conda-forge
bleach                    6.1.0              pyhd8ed1ab_0    conda-forge
blosc                     1.21.6               hef167b5_0    conda-forge
bokeh                     3.6.0              pyhd8ed1ab_0    conda-forge
branca                    0.7.2              pyhd8ed1ab_0    conda-forge
brotli                    1.1.0                hb9d3cd8_2    conda-forge
brotli-bin                1.1.0                hb9d3cd8_2    conda-forge
brotli-python             1.1.0           py312h2ec8cdc_2    conda-forge
brunsli                   0.1                  h9c3ff4c_0    conda-forge
bzip2                     1.0.8                h4bc722e_7    conda-forge
c-ares                    1.34.2               heb4867d_0    conda-forge
c-blosc2                  2.15.1               hc57e6cf_0    conda-forge
ca-certificates           2024.8.30            hbcca054_0    conda-forge
cached-property           1.5.2                hd8ed1ab_1    conda-forge
cached_property           1.5.2              pyha770c72_1    conda-forge
cachetools                5.5.0              pyhd8ed1ab_0    conda-forge
certifi                   2024.8.30          pyhd8ed1ab_0    conda-forge
cffi                      1.17.1          py312h06ac9bb_0    conda-forge
charls                    2.4.2                h59595ed_0    conda-forge
charset-normalizer        3.4.0              pyhd8ed1ab_0    conda-forge
click                     8.1.7           unix_pyh707e725_0    conda-forge
cloudpickle               3.1.0              pyhd8ed1ab_1    conda-forge
colorama                  0.4.6              pyhd8ed1ab_0    conda-forge
colorcet                  3.1.0              pyhd8ed1ab_0    conda-forge
comm                      0.2.2              pyhd8ed1ab_0    conda-forge
contourpy                 1.3.0           py312h68727a3_2    conda-forge
cucim                     24.10.00        cuda12_py312_241009_gf08280c_0    rapidsai
cuda-cccl_linux-64        12.5.39              ha770c72_0    conda-forge
cuda-crt-dev_linux-64     12.5.82              ha770c72_0    conda-forge
cuda-crt-tools            12.5.82              ha770c72_0    conda-forge
cuda-cudart               12.5.82              he02047a_0    conda-forge
cuda-cudart-dev           12.5.82              he02047a_0    conda-forge
cuda-cudart-dev_linux-64  12.5.82              h85509e4_0    conda-forge
cuda-cudart-static        12.5.82              he02047a_0    conda-forge
cuda-cudart-static_linux-64 12.5.82              h85509e4_0    conda-forge
cuda-cudart_linux-64      12.5.82              h85509e4_0    conda-forge
cuda-nvcc-dev_linux-64    12.5.82              ha770c72_0    conda-forge
cuda-nvcc-impl            12.5.82              hd3aeb46_0    conda-forge
cuda-nvcc-tools           12.5.82              hd3aeb46_0    conda-forge
cuda-nvrtc                12.5.82              he02047a_0    conda-forge
cuda-nvvm-dev_linux-64    12.5.82              ha770c72_0    conda-forge
cuda-nvvm-impl            12.5.82              h59595ed_0    conda-forge
cuda-nvvm-tools           12.5.82              h59595ed_0    conda-forge
cuda-profiler-api         12.5.39              ha770c72_0    conda-forge
cuda-python               12.6.0          py312he9d8a76_1    conda-forge
cuda-version              12.5                 hd4f0392_3    conda-forge
cudf                      24.10.01        cuda12_py312_241009_g7b0adfa253_0    rapidsai
cudf_kafka                24.10.01        cuda12_py312_241009_g7b0adfa253_0    rapidsai
cugraph                   24.10.00        cuda12_py312_241009_g6f2510afa_0    rapidsai
cuml                      24.10.00        cuda12_py312_241009_gba7e3ab9c_0    rapidsai
cuproj                    24.10.00        cuda12_py312_241009_g73184efb_0    rapidsai
cupy                      13.3.0          py312h7d319b9_2    conda-forge
cupy-core                 13.3.0          py312h1acd1a8_2    conda-forge
cuspatial                 24.10.00        cuda12_py312_241009_g73184efb_0    rapidsai
custreamz                 24.10.01        cuda12_py312_241009_g7b0adfa253_0    rapidsai
cuvs                      24.10.00        cuda12_py312_241009_g7de3a05_0    rapidsai
cuxfilter                 24.10.00        cuda12_py312_241009_g8c10c79_0    rapidsai
cycler                    0.12.1             pyhd8ed1ab_0    conda-forge
cyrus-sasl                2.1.27               h54b06d7_7    conda-forge
cytoolz                   1.0.0           py312h66e93f0_1    conda-forge
dask                      2024.9.0           pyhd8ed1ab_0    conda-forge
dask-core                 2024.9.0           pyhd8ed1ab_0    conda-forge
dask-cuda                 24.10.00        py312_241009_g4e45758_0    rapidsai
dask-cudf                 24.10.01        cuda12_py312_241009_g7b0adfa253_0    rapidsai
dask-expr                 1.1.14             pyhd8ed1ab_0    conda-forge
datashader                0.16.3             pyhd8ed1ab_0    conda-forge
dav1d                     1.2.1                hd590300_0    conda-forge
debugpy                   1.8.7           py312h2ec8cdc_0    conda-forge
decorator                 5.1.1              pyhd8ed1ab_0    conda-forge
defusedxml                0.7.1              pyhd8ed1ab_0    conda-forge
distributed               2024.9.0           pyhd8ed1ab_0    conda-forge
distributed-ucxx          0.40.00         py3.12_241009_g152901c_0    rapidsai
dlpack                    0.8                  h59595ed_3    conda-forge
entrypoints               0.4                pyhd8ed1ab_0    conda-forge
exceptiongroup            1.2.2              pyhd8ed1ab_0    conda-forge
executing                 2.1.0              pyhd8ed1ab_0    conda-forge
fastrlock                 0.8.2           py312h30efb56_2    conda-forge
fmt                       11.0.2               h434a139_0    conda-forge
folium                    0.17.0             pyhd8ed1ab_0    conda-forge
fonttools                 4.54.1          py312h178313f_1    conda-forge
fqdn                      1.5.1              pyhd8ed1ab_0    conda-forge
freetype                  2.12.1               h267a509_2    conda-forge
freexl                    2.0.0                h743c826_0    conda-forge
frozenlist                1.4.1           py312h66e93f0_1    conda-forge
fsspec                    2024.10.0          pyhff2d567_0    conda-forge
geopandas                 1.0.1              pyhd8ed1ab_1    conda-forge
geopandas-base            1.0.1              pyha770c72_1    conda-forge
geos                      3.13.0               h5888daf_0    conda-forge
geotiff                   1.7.3                h77b800c_3    conda-forge
gflags                    2.2.2             h5888daf_1005    conda-forge
giflib                    5.2.2                hd590300_0    conda-forge
glog                      0.7.1                hbabe93e_0    conda-forge
h11                       0.14.0             pyhd8ed1ab_0    conda-forge
h2                        4.1.0              pyhd8ed1ab_0    conda-forge
holoviews                 1.19.1             pyhd8ed1ab_0    conda-forge
hpack                     4.0.0              pyh9f0ad1d_0    conda-forge
httpcore                  1.0.6              pyhd8ed1ab_0    conda-forge
httpx                     0.27.2             pyhd8ed1ab_0    conda-forge
hyperframe                6.0.1              pyhd8ed1ab_0    conda-forge
icu                       75.1                 he02047a_0    conda-forge
idna                      3.10               pyhd8ed1ab_0    conda-forge
imagecodecs               2024.9.22       py312hf6703b6_0    conda-forge
imageio                   2.36.0             pyh12aca89_1    conda-forge
importlib-metadata        8.5.0              pyha770c72_0    conda-forge
importlib_metadata        8.5.0                hd8ed1ab_0    conda-forge
importlib_resources       6.4.5              pyhd8ed1ab_0    conda-forge
ipykernel                 6.29.5             pyh3099207_0    conda-forge
ipython                   8.28.0             pyh707e725_0    conda-forge
ipywidgets                8.1.5              pyhd8ed1ab_0    conda-forge
isoduration               20.11.0            pyhd8ed1ab_0    conda-forge
jedi                      0.19.1             pyhd8ed1ab_0    conda-forge
jinja2                    3.1.4              pyhd8ed1ab_0    conda-forge
joblib                    1.4.2              pyhd8ed1ab_0    conda-forge
json-c                    0.18                 h6688a6e_0    conda-forge
json5                     0.9.25             pyhd8ed1ab_0    conda-forge
jsonpointer               3.0.0           py312h7900ff3_1    conda-forge
jsonschema                4.23.0             pyhd8ed1ab_0    conda-forge
jsonschema-specifications 2024.10.1          pyhd8ed1ab_0    conda-forge
jsonschema-with-format-nongpl 4.23.0               hd8ed1ab_0    conda-forge
jupyter                   1.1.1              pyhd8ed1ab_0    conda-forge
jupyter-lsp               2.2.5              pyhd8ed1ab_0    conda-forge
jupyter-server-proxy      4.4.0              pyhd8ed1ab_0    conda-forge
jupyter_client            8.6.3              pyhd8ed1ab_0    conda-forge
jupyter_console           6.6.3              pyhd8ed1ab_0    conda-forge
jupyter_core              5.7.2              pyh31011fe_1    conda-forge
jupyter_events            0.10.0             pyhd8ed1ab_0    conda-forge
jupyter_server            2.14.2             pyhd8ed1ab_0    conda-forge
jupyter_server_terminals  0.5.3              pyhd8ed1ab_0    conda-forge
jupyterlab                4.2.5              pyhd8ed1ab_0    conda-forge
jupyterlab_pygments       0.3.0              pyhd8ed1ab_1    conda-forge
jupyterlab_server         2.27.3             pyhd8ed1ab_0    conda-forge
jupyterlab_widgets        3.0.13             pyhd8ed1ab_0    conda-forge
jxrlib                    1.1                  hd590300_3    conda-forge
keyutils                  1.6.1                h166bdaf_0    conda-forge
kiwisolver                1.4.7           py312h68727a3_0    conda-forge
krb5                      1.21.3               h659f571_0    conda-forge
lazy-loader               0.4                pyhd8ed1ab_1    conda-forge
lazy_loader               0.4                pyhd8ed1ab_1    conda-forge
lcms2                     2.16                 hb7c19ff_0    conda-forge
ld_impl_linux-64          2.43                 h712a8e2_1    conda-forge
lerc                      4.0.0                h27087fc_0    conda-forge
libabseil                 20240722.0      cxx17_h5888daf_1    conda-forge
libaec                    1.1.3                h59595ed_0    conda-forge
libarchive                3.7.4                hfca40fe_0    conda-forge
libarrow                  17.0.0          hdb9dd6d_23_cpu    conda-forge
libarrow-acero            17.0.0          h5888daf_23_cpu    conda-forge
libarrow-dataset          17.0.0          h5888daf_23_cpu    conda-forge
libarrow-substrait        17.0.0          he882d9a_23_cpu    conda-forge
libavif16                 1.1.1                h104a339_1    conda-forge
libblas                   3.9.0           24_linux64_openblas    conda-forge
libbrotlicommon           1.1.0                hb9d3cd8_2    conda-forge
libbrotlidec              1.1.0                hb9d3cd8_2    conda-forge
libbrotlienc              1.1.0                hb9d3cd8_2    conda-forge
libcblas                  3.9.0           24_linux64_openblas    conda-forge
libcrc32c                 1.1.2                h9c3ff4c_0    conda-forge
libcublas                    he02047a_0    conda-forge
libcublas-dev                he02047a_0    conda-forge
libcucim                  24.10.00        cuda12_241009_gf08280c_0    rapidsai
libcudf                   24.10.01        cuda12_241009_g7b0adfa253_0    rapidsai
libcudf_kafka             24.10.01        cuda12_241009_g7b0adfa253_0    rapidsai
libcufft                    he02047a_0    conda-forge
libcufile                    he02047a_0    conda-forge
libcufile-dev                he02047a_0    conda-forge
libcugraph                24.10.00        cuda12_241009_g6f2510afa_0    rapidsai
libcugraph_etl            24.10.00        cuda12_241009_g6f2510afa_0    rapidsai
libcugraphops             24.10.00        cuda12_241009_g7057cc73_0    rapidsai
libcuml                   24.10.00        cuda12_241009_gba7e3ab9c_0    rapidsai
libcumlprims              24.10.00        cuda12_241009_g0848871_0    rapidsai
libcurand                   he02047a_0    conda-forge
libcurand-dev               he02047a_0    conda-forge
libcurl                   8.10.1               hbbe4b11_0    conda-forge
libcusolver                 he02047a_0    conda-forge
libcusolver-dev             he02047a_0    conda-forge
libcusparse                  he02047a_0    conda-forge
libcusparse-dev              he02047a_0    conda-forge
libcuspatial              24.10.00        cuda12_241009_g73184efb_0    rapidsai
libcuvs                   24.10.00        cuda12_241009_g7de3a05_0    rapidsai
libdeflate                1.22                 hb9d3cd8_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 hd590300_2    conda-forge
libevent                  2.1.12               hf998b51_1    conda-forge
libexpat                  2.6.3                h5888daf_0    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc                    14.2.0               h77fa898_1    conda-forge
libgcc-ng                 14.2.0               h69a702a_1    conda-forge
libgdal-core              3.9.2                hd5b9bfb_7    conda-forge
libgfortran               14.2.0               h69a702a_1    conda-forge
libgfortran-ng            14.2.0               h69a702a_1    conda-forge
libgfortran5              14.2.0               hd5240d6_1    conda-forge
libgomp                   14.2.0               h77fa898_1    conda-forge
libgoogle-cloud           2.30.0               h438788a_0    conda-forge
libgoogle-cloud-storage   2.30.0               h0121fbd_0    conda-forge
libgrpc                   1.65.5               hf5c653b_0    conda-forge
libhwy                    1.1.0                h00ab1b0_0    conda-forge
libiconv                  1.17                 hd590300_2    conda-forge
libjpeg-turbo             3.0.0                hd590300_1    conda-forge
libjxl                    0.11.0               hdb8da77_2    conda-forge
libkml                    1.3.0             hf539b9f_1021    conda-forge
libkvikio                 24.10.00        cuda12_241009_g85a88a2_0    rapidsai
liblapack                 3.9.0           24_linux64_openblas    conda-forge
libllvm14                 14.0.6               hcd5def8_4    conda-forge
libnghttp2                1.64.0               h161d5f1_0    conda-forge
libnl                     3.10.0               h4bc722e_0    conda-forge
libnsl                    2.0.1                hd590300_0    conda-forge
libntlm                   1.4               h7f98852_1002    conda-forge
libnvjitlink              12.5.82              he02047a_0    conda-forge
libnvjpeg                   he02047a_0    conda-forge
libopenblas               0.3.27          pthreads_hac2b453_1    conda-forge
libparquet                17.0.0          h6bd9018_23_cpu    conda-forge
libpng                    1.6.44               hadc24fc_0    conda-forge
libprotobuf               5.27.5               h5b01275_2    conda-forge
libraft                   24.10.00        cuda12_241009_g397042a0_0    rapidsai
libraft-headers           24.10.00        cuda12_241009_g397042a0_0    rapidsai
libraft-headers-only      24.10.00        cuda12_241009_g397042a0_0    rapidsai
librdkafka                2.5.3                h95ba008_0    conda-forge
libre2-11                 2024.07.02           hbbce691_1    conda-forge
librmm                    24.10.00        cuda12_241009_g3223f841_0    rapidsai
librttopo                 1.1.0               h97f6797_17    conda-forge
libsodium                 1.0.20               h4ab18f5_0    conda-forge
libspatialite             5.1.0               h1b4f908_11    conda-forge
libsqlite                 3.47.0               hadc24fc_0    conda-forge
libssh2                   1.11.0               h0841786_0    conda-forge
libstdcxx                 14.2.0               hc0a3c3a_1    conda-forge
libstdcxx-ng              14.2.0               h4852527_1    conda-forge
libthrift                 0.21.0               h0e7cc3e_0    conda-forge
libtiff                   4.7.0                he137b08_1    conda-forge
libucxx                   0.40.00         cuda12_241009_g152901c_0    rapidsai
libutf8proc               2.8.0                h166bdaf_0    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libuv                     1.49.2               hb9d3cd8_0    conda-forge
libwebp-base              1.4.0                hd590300_0    conda-forge
libxcb                    1.17.0               h8a09558_0    conda-forge
libxcrypt                 4.4.36               hd590300_1    conda-forge
libxgboost                2.1.1           rapidsai_h01f03eb_5    rapidsai
libxml2                   2.12.7               he7c6b58_4    conda-forge
libzlib                   1.3.1                hb9d3cd8_2    conda-forge
libzopfli                 1.0.3                h9c3ff4c_0    conda-forge
linkify-it-py             2.0.3              pyhd8ed1ab_0    conda-forge
llvmlite                  0.43.0          py312h374181b_1    conda-forge
locket                    1.0.0              pyhd8ed1ab_0    conda-forge
lz4                       4.3.3           py312hb3f7f12_1    conda-forge
lz4-c                     1.9.4                hcb278e6_0    conda-forge
lzo                       2.10              hd590300_1001    conda-forge
mapclassify               2.8.1              pyhd8ed1ab_0    conda-forge
markdown                  3.6                pyhd8ed1ab_0    conda-forge
markdown-it-py            3.0.0              pyhd8ed1ab_0    conda-forge
markupsafe                3.0.2           py312h178313f_0    conda-forge
matplotlib-base           3.9.2           py312hd3ec401_1    conda-forge
matplotlib-inline         0.1.7              pyhd8ed1ab_0    conda-forge
mdit-py-plugins           0.4.2              pyhd8ed1ab_0    conda-forge
mdurl                     0.1.2              pyhd8ed1ab_0    conda-forge
minizip                   4.0.7                h401b404_0    conda-forge
mistune                   3.0.2              pyhd8ed1ab_0    conda-forge
msgpack-python            1.1.0           py312h68727a3_0    conda-forge
multidict                 6.1.0           py312h178313f_1    conda-forge
multipledispatch          0.6.0              pyhd8ed1ab_1    conda-forge
munkres                   1.1.4              pyh9f0ad1d_0    conda-forge
nbclient                  0.10.0             pyhd8ed1ab_0    conda-forge
nbconvert-core            7.16.4             pyhd8ed1ab_1    conda-forge
nbformat                  5.10.4             pyhd8ed1ab_0    conda-forge
nccl                         h52f6c39_1    conda-forge
ncurses                   6.5                  he02047a_1    conda-forge
nest-asyncio              1.6.0              pyhd8ed1ab_0    conda-forge
networkx                  3.4.2              pyhd8ed1ab_0    conda-forge
nodejs                    22.9.0               hf235a45_0    conda-forge
notebook                  7.2.2              pyhd8ed1ab_0    conda-forge
notebook-shim             0.2.4              pyhd8ed1ab_0    conda-forge
numba                     0.60.0          py312h83e6fd3_0    conda-forge
numpy                     2.0.2           py312h58c1407_0    conda-forge
nvcomp                    4.0.1                hbc370b7_0    conda-forge
nvtx                      0.2.10          py312h66e93f0_2    conda-forge
nx-cugraph                24.10.00        py312_241009_g6f2510afa_0    rapidsai
openjpeg                  2.5.2                h488ebb8_0    conda-forge
openssl                   3.3.2                hb9d3cd8_0    conda-forge
orc                       2.0.2                h690cf93_1    conda-forge
overrides                 7.7.0              pyhd8ed1ab_0    conda-forge
packaging                 24.1               pyhd8ed1ab_0    conda-forge
pandas                    2.2.2           py312h1d6d2e6_1    conda-forge
pandocfilters             1.5.0              pyhd8ed1ab_0    conda-forge
panel                     1.5.2              pyhd8ed1ab_0    conda-forge
param                     2.1.1              pyhff2d567_0    conda-forge
parso                     0.8.4              pyhd8ed1ab_0    conda-forge
partd                     1.4.2              pyhd8ed1ab_0    conda-forge
pcre2                     10.44                hba22ea6_2    conda-forge
pexpect                   4.9.0              pyhd8ed1ab_0    conda-forge
pickleshare               0.7.5                   py_1003    conda-forge
pillow                    11.0.0          py312h7b63e92_0    conda-forge
pip                       24.2               pyh8b19718_1    conda-forge
pkgutil-resolve-name      1.3.10             pyhd8ed1ab_1    conda-forge
platformdirs              4.3.6              pyhd8ed1ab_0    conda-forge
proj                      9.5.0                h12925eb_0    conda-forge
prometheus_client         0.21.0             pyhd8ed1ab_0    conda-forge
prompt-toolkit            3.0.48             pyha770c72_0    conda-forge
prompt_toolkit            3.0.48               hd8ed1ab_0    conda-forge
propcache                 0.2.0           py312h66e93f0_2    conda-forge
psutil                    6.0.0           py312h66e93f0_2    conda-forge
pthread-stubs             0.4               hb9d3cd8_1002    conda-forge
ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
pure_eval                 0.2.3              pyhd8ed1ab_0    conda-forge
py-xgboost                2.1.1           rapidsai_pyh53d8b89_5    rapidsai
pyarrow                   17.0.0          py312h9cebb41_1    conda-forge
pyarrow-core              17.0.0          py312h9cafe31_1_cpu    conda-forge
pycparser                 2.22               pyhd8ed1ab_0    conda-forge
pyct                      0.5.0              pyhd8ed1ab_0    conda-forge
pygments                  2.18.0             pyhd8ed1ab_0    conda-forge
pylibcudf                 24.10.01        cuda12_py312_241009_g7b0adfa253_0    rapidsai
pylibcugraph              24.10.00        cuda12_py312_241009_g6f2510afa_0    rapidsai
pylibraft                 24.10.00        cuda12_py312_241009_g397042a0_0    rapidsai
pynvjitlink               0.3.0           py312hd269673_0    rapidsai
pynvml                    11.4.1             pyhd8ed1ab_0    conda-forge
pyogrio                   0.10.0          py312he8b4914_0    conda-forge
pyparsing                 3.2.0              pyhd8ed1ab_1    conda-forge
pyproj                    3.7.0           py312he630544_0    conda-forge
pysocks                   1.7.1              pyha2e5f31_6    conda-forge
python                    3.12.7          hc5c86c4_0_cpython    conda-forge
python-confluent-kafka    2.5.3           py312h66e93f0_0    conda-forge
python-dateutil           2.9.0              pyhd8ed1ab_0    conda-forge
python-fastjsonschema     2.20.0             pyhd8ed1ab_0    conda-forge
python-json-logger        2.0.7              pyhd8ed1ab_0    conda-forge
python-tzdata             2024.2             pyhd8ed1ab_0    conda-forge
python_abi                3.12                    5_cp312    conda-forge
pytz                      2024.2             pyhd8ed1ab_0    conda-forge
pyviz_comms               3.0.3              pyhd8ed1ab_0    conda-forge
pywavelets                1.7.0           py312hc0a28a1_2    conda-forge
pyyaml                    6.0.2           py312h66e93f0_1    conda-forge
pyzmq                     26.2.0          py312hbf22597_3    conda-forge
qhull                     2020.2               h434a139_5    conda-forge
raft-dask                 24.10.00        cuda12_py312_241009_g397042a0_0    rapidsai
rapids                    24.10.00        cuda12_py312_241009_g19a0c5a_0    rapidsai
rapids-dask-dependency    24.10.00                   py_0    rapidsai
rapids-xgboost            24.10.00        cuda12_py312_241009_g19a0c5a_0    rapidsai
rav1e                     0.6.6                he8a937b_2    conda-forge
rdma-core                 54.0                 h5888daf_0    conda-forge
re2                       2024.07.02           h77b4e00_1    conda-forge
readline                  8.2                  h8228510_1    conda-forge
referencing               0.35.1             pyhd8ed1ab_0    conda-forge
requests                  2.32.3             pyhd8ed1ab_0    conda-forge
rfc3339-validator         0.1.4              pyhd8ed1ab_0    conda-forge
rfc3986-validator         0.1.1              pyh9f0ad1d_0    conda-forge
rich                      13.9.3             pyhd8ed1ab_0    conda-forge
rmm                       24.10.00        cuda12_py312_241009_g3223f841_0    rapidsai
rpds-py                   0.20.0          py312h12e396e_1    conda-forge
s2n                       1.5.5                h3931f03_0    conda-forge
scikit-image              0.24.0          py312h1df14c2_2    conda-forge
scikit-learn              1.5.2           py312h7a48858_1    conda-forge
scipy                     1.14.1          py312h62794b6_1    conda-forge
send2trash                1.8.3              pyh0d859eb_0    conda-forge
setuptools                75.1.0             pyhd8ed1ab_0    conda-forge
shapely                   2.0.6           py312h391bc85_2    conda-forge
simpervisor               1.0.0              pyhd8ed1ab_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
snappy                    1.2.1                ha2e4443_0    conda-forge
sniffio                   1.3.1              pyhd8ed1ab_0    conda-forge
sortedcontainers          2.4.0              pyhd8ed1ab_0    conda-forge
soupsieve                 2.5                pyhd8ed1ab_1    conda-forge
spdlog                    1.14.1               hed91bc2_1    conda-forge
sqlite                    3.47.0               h9eae976_0    conda-forge
stack_data                0.6.2              pyhd8ed1ab_0    conda-forge
streamz                   0.6.4              pyh6c4a22f_0    conda-forge
svt-av1                   2.2.1                h5888daf_0    conda-forge
tblib                     3.0.0              pyhd8ed1ab_0    conda-forge
terminado                 0.18.1             pyh0d859eb_0    conda-forge
threadpoolctl             3.5.0              pyhc1e730c_0    conda-forge
tifffile                  2024.9.20          pyhd8ed1ab_0    conda-forge
tinycss2                  1.3.0              pyhd8ed1ab_0    conda-forge
tk                        8.6.13          noxft_h4845f30_101    conda-forge
tomli                     2.0.2              pyhd8ed1ab_0    conda-forge
toolz                     1.0.0              pyhd8ed1ab_0    conda-forge
tornado                   6.4.1           py312h66e93f0_1    conda-forge
tqdm                      4.66.5             pyhd8ed1ab_0    conda-forge
traitlets                 5.14.3             pyhd8ed1ab_0    conda-forge
treelite                  4.3.0           py312h01abfbf_0    conda-forge
types-python-dateutil     pyhff2d567_0    conda-forge
typing-extensions         4.12.2               hd8ed1ab_0    conda-forge
typing_extensions         4.12.2             pyha770c72_0    conda-forge
typing_utils              0.1.0              pyhd8ed1ab_0    conda-forge
tzdata                    2024b                hc8b5060_0    conda-forge
uc-micro-py               1.0.3              pyhd8ed1ab_0    conda-forge
ucx                       1.17.0               h05e919c_3    conda-forge
ucx-proc                  1.0.0                       gpu    rapidsai
ucx-py                    0.40.00         py312_241009_g773cd1e_0    rapidsai
ucxx                      0.40.00         cuda12_py3.12_241009_g152901c_0    rapidsai
unicodedata2              15.1.0          py312h98912ed_0    conda-forge
uri-template              1.3.0              pyhd8ed1ab_0    conda-forge
uriparser                 0.9.8                hac33072_0    conda-forge
urllib3                   2.2.3              pyhd8ed1ab_0    conda-forge
wcwidth                   0.2.13             pyhd8ed1ab_0    conda-forge
webcolors                 24.8.0             pyhd8ed1ab_0    conda-forge
webencodings              0.5.1              pyhd8ed1ab_2    conda-forge
websocket-client          1.8.0              pyhd8ed1ab_0    conda-forge
wheel                     0.44.0             pyhd8ed1ab_0    conda-forge
widgetsnbextension        4.0.13             pyhd8ed1ab_0    conda-forge
xarray                    2024.9.0           pyhd8ed1ab_1    conda-forge
xerces-c                  3.2.5                h988505b_2    conda-forge
xgboost                   2.1.1           rapidsai_pyh9bdd636_5    rapidsai
xorg-libxau               1.0.11               hb9d3cd8_1    conda-forge
xorg-libxdmcp             1.1.5                hb9d3cd8_0    conda-forge
xyzservices               2024.9.0           pyhd8ed1ab_0    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
yaml                      0.2.5                h7f98852_2    conda-forge
yarl                      1.15.5          py312h66e93f0_0    conda-forge
zeromq                    4.3.5                h3b0a872_6    conda-forge
zfp                       1.0.1                h5888daf_2    conda-forge
zict                      3.0.0              pyhd8ed1ab_0    conda-forge
zipp                      3.20.2             pyhd8ed1ab_0    conda-forge
zlib                      1.3.1                hb9d3cd8_2    conda-forge
zlib-ng                   2.2.2                h5888daf_0    conda-forge
zstandard                 0.23.0          py312hef9b889_1    conda-forge
zstd                      1.5.6                ha6fb4c9_0    conda-forge

Additional context
This prevents training an XGBoost model on categorical variables using cudf.pandas if the .plot method of a Series has been called beforehand. See #17166 for information on unexpected behavior from .plot.


Matt711 commented Oct 24, 2024

I think the problem is caused by the custom __iter__ function in our pd.Series proxy type. The loop for t in df.dtypes calls __iter__, which (for our proxy type) always uses the underlying slow object's __iter__ method. I'm not sure why we're using a custom iterator for pd.Series; maybe we shouldn't?
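
To make the suspected mechanism concrete, here is a toy model in plain Python. It is only a sketch: `RealDtype`, `ProxyDtype`, `ProxySeries`, and `item_at` are hypothetical names invented for illustration and are not cudf.pandas internals. It assumes indexed access wraps its result in a proxy, while the custom iterator yields the underlying objects unwrapped.

```python
# Toy model only: every class here is hypothetical, not cudf.pandas code.

class RealDtype:
    """Stands in for the underlying (unwrapped) dtype object."""

    def __init__(self, name):
        self.name = name


class ProxyDtype:
    """Stands in for the proxy type that user code sees as a dtype class."""

    def __init__(self, wrapped):
        self._wrapped = wrapped


class ProxySeries:
    """Stands in for the Series proxy that holds the dtypes."""

    def __init__(self, items):
        self._slow = list(items)  # the underlying "slow" container

    def item_at(self, i):
        # Indexed access takes the normal proxy path and wraps the result.
        return ProxyDtype(self._slow[i])

    def __iter__(self):
        # Custom iterator: delegates to the slow object and yields raw items.
        return iter(self._slow)


dtypes = ProxySeries([RealDtype("category")])

in_loop = [isinstance(t, ProxyDtype) for t in dtypes][0]
with_index = isinstance(dtypes.item_at(0), ProxyDtype)

print("In for loop: ", in_loop)
print("With index: ", with_index)
```

In this model the two access paths disagree for the same reason the repro does: only one of them returns a wrapped object, so the isinstance check against the proxy type fails inside the loop.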


Matt711 commented Oct 24, 2024

Okay, removing the custom iterator made your minimal repro work, but it could break other things (we'll see).

In [1]: %load_ext cudf.pandas

In [2]: import pandas as pd
   ...: df = pd.DataFrame({"A": ["a", "b", "c", "a"]})
   ...: df["A"] = df["A"].astype('category')
   ...: print("In for loop: ", [isinstance(t, pd.CategoricalDtype) for t in df.dtypes][0])
   ...: print("With iloc: ", isinstance(df.dtypes.iloc[0], pd.CategoricalDtype))
In for loop:  True
With iloc:  True

@galipremsagar galipremsagar self-assigned this Nov 7, 2024
rapids-bot bot pushed a commit that referenced this issue Nov 8, 2024
Fixes: #17165 
Fixes: #14481

This PR properly wraps the result of the custom iterator.

In [2]: import pandas as pd

In [3]: s = pd.Series([10, 1, 2, 3, 4, 5]*1000000)

# Without custom_iter:

In [4]: %timeit for i in s: True
6.34 s ± 25.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# This PR:

In [4]: %timeit for i in s: True
6.16 s ± 17.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# On `branch-24.12`:
1.53 s ± 6.27 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

I think `custom_iter` has to exist. Here is why: invoking any sort of iteration on GPU objects raises errors, so in the end we fall back to the CPU anyway. By iterating over the host object directly, we avoid a CPU-to-GPU transfer that would otherwise happen when the object lives only in host memory.
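
As a rough illustration of the idea (keep iterating on the host side, but wrap each element before handing it to the caller), here is a minimal sketch. The names `Proxy`, `wrap`, `ProxySeries`, and `Dtype` are hypothetical and do not reflect the actual cudf.pandas implementation; the rule that plain scalars pass through unwrapped is an assumption made only for this example.

```python
# Hypothetical sketch: names are illustrative, not the cudf.pandas implementation.

class Proxy:
    """Minimal wrapper standing in for a proxy object."""

    def __init__(self, wrapped):
        self._wrapped = wrapped


def wrap(obj):
    # Assumption for this sketch: plain scalars pass through unchanged,
    # everything else is wrapped in a proxy.
    if isinstance(obj, (int, float, str, bool, type(None))):
        return obj
    return Proxy(obj)


class ProxySeries:
    def __init__(self, items):
        self._slow = list(items)  # host-side ("slow") data; no device copy is made

    def __iter__(self):
        # Iterate over the host-side object directly, but wrap every element
        # so callers see the same types as with indexed access.
        for item in self._slow:
            yield wrap(item)


class Dtype:
    """Stands in for a non-scalar element such as a dtype object."""


wrapped_items = list(ProxySeries([Dtype(), 3]))
print(type(wrapped_items[0]).__name__, wrapped_items[1])
```

The generator keeps the host-only iteration path, and the only extra cost is the per-element wrapping.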


  - Matthew Murray (

URL: #17251