Implement basic COALESCE functionality #823

Merged on Dec 1, 2022 (9 commits)

ChrisJar
Collaborator

No description provided.

@randerzander
Collaborator

Will depend on dask/dask#9563 and related dask-cudf work

@ChrisJar marked this pull request as ready for review on November 9, 2022, 16:22
Comment on lines +460 to +469
if aggregation_name == "sum" and isinstance(df._meta, pd.DataFrame):
    aggregation_function = AggregationSpecification(
        dd.Aggregation(
            name="custom_sum",
            chunk=lambda s: s.sum(min_count=1),
            agg=lambda s0: s0.sum(min_count=1),
        )
    )
else:
    aggregation_function = self.AGGREGATION_MAPPING[aggregation_name]
Collaborator

Could we throw in a comment here giving the reasoning for this override (IIUC differences in nullable "object" columns between cuDF and pandas)?
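
For readers following the override above, one plausible reading (not confirmed in this thread) is that it keeps SUM over an all-null column null. pandas' `Series.sum` defaults to `min_count=0`, so an all-NaN column sums to 0 and `COALESCE(SUM(b), ...)` would pick that 0 instead of moving on to its next argument. The sketch below is a plain-Python emulation of a two-stage `chunk`/`agg` reduction with `min_count=1`. The helper names (`is_null`, `sum_min_count`, `two_stage_sum`) are hypothetical and are not part of dask-sql or dask.

```python
import math


def is_null(value):
    """Treat both None and float NaN as SQL NULL (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def sum_min_count(values, min_count=1):
    """Sum the non-null values; return None when fewer than min_count are present."""
    present = [v for v in values if not is_null(v)]
    if len(present) < min_count:
        return None
    return sum(present)


def two_stage_sum(partitions):
    """Emulate chunk (per partition) followed by agg (over the partial sums)."""
    partials = [sum_min_count(part) for part in partitions]  # "chunk" step
    return sum_min_count(partials)  # "agg" step


all_null = two_stage_sum([[None, math.nan], [None]])
mixed = two_stage_sum([[None, 1.0], [2.0, None]])

print(all_null)  # None
print(mixed)     # 3.0
```

With `min_count=0` the same helper returns 0 for an empty or all-null input, which is the behaviour the override appears to avoid.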

dask_sql/physical/rex/core/call.py (outdated, resolved)
Comment on lines 360 to 375
df = dd.from_pandas(pd.DataFrame({"a": [1], "b": [np.nan]}), npartitions=1)
c.create_table("df", df, gpu=gpu)

df = c.sql(
    """
    SELECT
        COALESCE(3, 5) as c1,
        COALESCE(NULL, NULL) as c2,
        COALESCE(NULL, 'hi') as c3,
        COALESCE(NULL, NULL, 'bye', 5/0) as c4,
        COALESCE(NULL, 3/2, NULL, 'fly') as c5,
        COALESCE(SUM(b), 'why', 2.2) as c6,
        COALESCE(NULL, MEAN(b), MEAN(a), 4/0) as c7
    FROM df
    """
)
Collaborator

Following up on the previous comment, it might be good to add tests that use COALESCE on columns rather than only on scalars, so that we have coverage for that specific case.

Collaborator

With 2000932 in, I am now trying to add something like COALESCE(a, b) to this query, but am running into parsing issues; a minimal example of the error I'm getting:

import numpy as np
import pandas as pd

from dask_sql import Context

c = Context()
c.create_table("df", pd.DataFrame({
    "a": [np.nan, 1, np.nan],
    "b": [np.nan, np.nan, 2]
}))

c.sql("""
    select
        coalesce(a, b) as c,
        coalesce(sum(b), 'why') as d
    from df
""")
# ParsingException: Plan("Projection references non-aggregate values: Expression df.a could not be resolved from available columns: SUM(df.b)")

cc @andygrove
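
As a point of comparison for the two expressions in the example above, the sketch below runs each one as its own query against SQLite through Python's standard `sqlite3` module. SQLite is used here only as a convenient reference engine; that dask-sql should produce the same values is an assumption, not something stated in this thread. The table layout mirrors the `df` from the example, with `None` standing in for `np.nan`.

```python
import sqlite3

# In-memory table mirroring the "df" from the example above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE df (a REAL, b REAL)")
conn.executemany(
    "INSERT INTO df VALUES (?, ?)",
    [(None, None), (1.0, None), (None, 2.0)],
)

# Column-wise COALESCE: one output value per input row.
column_wise = [
    row[0]
    for row in conn.execute("SELECT COALESCE(a, b) AS c FROM df ORDER BY rowid")
]

# COALESCE over an aggregate: a single output value.
aggregate = conn.execute(
    "SELECT COALESCE(SUM(b), 'why') AS d FROM df"
).fetchone()[0]

print(column_wise)  # [None, 1.0, 2.0]
print(aggregate)    # 2.0
```

The two expressions are kept in separate queries because the first yields one value per row and the second yields a single value, which is the mismatch the planner error above points at.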

@codecov-commenter

codecov-commenter commented Nov 23, 2022

Codecov Report

Merging #823 (22cf6ad) into main (d2896fa) will increase coverage by 0.26%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #823      +/-   ##
==========================================
+ Coverage   75.67%   75.94%   +0.26%     
==========================================
  Files          73       73              
  Lines        4050     4065      +15     
  Branches      731      737       +6     
==========================================
+ Hits         3065     3087      +22     
+ Misses        824      814      -10     
- Partials      161      164       +3     
Impacted Files                                Coverage Δ
dask_sql/physical/rel/logical/aggregate.py    90.10% <100.00%> (+0.10%) ⬆️
dask_sql/physical/rex/core/call.py            81.50% <100.00%> (+0.47%) ⬆️
dask_sql/_version.py                          35.31% <0.00%> (+1.41%) ⬆️
dask_sql/mappings.py                          89.21% <0.00%> (+1.96%) ⬆️

COALESCE(NULL, 'hi') as c3,
COALESCE(NULL, NULL, 'bye', 5/0) as c4,
COALESCE(NULL, 3/2, NULL, 'fly') as c5,
COALESCE(SUM(b), 'why', 2.2) as c6,
Collaborator Author

@ChrisJar ChrisJar Nov 30, 2022

@ayushdg @charlesbluca This line is now failing on GPU with the newest dask-sql environment. Specifically,

SELECT COALESCE(SUM(b), 'why', 2.2) FROM df

throws:

ValueError: could not convert string to float: 'why'

This doesn't fail on CPU, nor does it fail on the roughly equivalent query:

SELECT COALESCE(NULL, 'why', 2.2) FROM df

Any idea what might be happening? Could this be due to a change in cuDF?

Collaborator

Taking a look right now

Collaborator

Could you list the cuDF conda packages you're using (assuming this is using conda packages and not source)? I pulled in the latest 22.12 nightlies and wasn't able to reproduce:

# packages in environment at /raid/charlesb/mambaforge/envs/basic-coalesce:
#
# Name                    Version                   Build  Channel
cudf                      22.12.00a221130 cuda_11_py39_geb271044c2_307    rapidsai-nightly
dask-cudf                 22.12.00a221130 cuda_11_py39_geb271044c2_307    rapidsai-nightly
libcudf                   22.12.00a221130 cuda11_geb271044c2_307    rapidsai-nightly

Collaborator Author

Yep here they are:

cudf                      22.12.00a221130 cuda_11_py39_geb271044c2_307    rapidsai-nightly
dask-cudf                 22.12.00a221130 cuda_11_py39_geb271044c2_307    rapidsai-nightly
libcudf                   22.12.00a221130 cuda11_geb271044c2_307    rapidsai-nightly

Collaborator Author

And here's my full environment:

# packages in environment at /raid/cjarrett/miniconda3/envs/dask-sql-11-30:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
_py-xgboost-mutex         2.0                       cpu_0    conda-forge
adagio                    0.2.4              pyhd8ed1ab_0    conda-forge
alabaster                 0.7.12                     py_0    conda-forge
alembic                   1.8.1              pyhd8ed1ab_0    conda-forge
antlr-python-runtime      4.11.1             pyhd8ed1ab_0    conda-forge
antlr4-python3-runtime    4.11.1             pyh1a96a4e_0    conda-forge
anyio                     3.6.2              pyhd8ed1ab_0    conda-forge
appdirs                   1.4.4              pyh9f0ad1d_0    conda-forge
arrow-cpp                 9.0.0           py39hd3ccb9b_2_cpu    conda-forge
attrs                     22.1.0             pyh71513ae_1    conda-forge
aws-c-cal                 0.5.11               h95a6274_0    conda-forge
aws-c-common              0.6.2                h7f98852_0    conda-forge
aws-c-event-stream        0.2.7               h3541f99_13    conda-forge
aws-c-io                  0.10.5               hfb6a706_0    conda-forge
aws-checksums             0.1.11               ha31a3da_7    conda-forge
aws-sdk-cpp               1.8.186              hecaee15_4    conda-forge
babel                     2.11.0             pyhd8ed1ab_0    conda-forge
backports                 1.0                pyhd8ed1ab_3    conda-forge
backports.functools_lru_cache 1.6.4              pyhd8ed1ab_0    conda-forge
bcrypt                    3.2.2            py39hb9d737c_1    conda-forge
blinker                   1.5                pyhd8ed1ab_0    conda-forge
bokeh                     2.4.3              pyhd8ed1ab_3    conda-forge
brotli                    1.0.9                h166bdaf_8    conda-forge
brotli-bin                1.0.9                h166bdaf_8    conda-forge
brotlipy                  0.7.0           py39hb9d737c_1005    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.18.1               h7f98852_0    conda-forge
ca-certificates           2022.9.24            ha878542_0    conda-forge
cachetools                5.2.0              pyhd8ed1ab_0    conda-forge
certifi                   2022.9.24          pyhd8ed1ab_0    conda-forge
cffi                      1.15.1           py39he91dace_2    conda-forge
cfgv                      3.3.1              pyhd8ed1ab_0    conda-forge
charset-normalizer        2.1.1              pyhd8ed1ab_0    conda-forge
ciso8601                  2.2.0            py39hb9d737c_4    conda-forge
click                     8.1.3           unix_pyhd8ed1ab_2    conda-forge
cloudpickle               2.2.0              pyhd8ed1ab_0    conda-forge
colorama                  0.4.6              pyhd8ed1ab_0    conda-forge
configparser              5.3.0              pyhd8ed1ab_0    conda-forge
contourpy                 1.0.6            py39hf939315_0    conda-forge
coverage                  6.5.0            py39hb9d737c_1    conda-forge
cryptography              38.0.4           py39hd97740a_0    conda-forge
cubinlinker               0.2.0            py39h11215e4_1    rapidsai
cuda-cccl                 11.8.89                       0    nvidia
cuda-command-line-tools   11.8.0                        0    nvidia
cuda-compiler             11.8.0                        0    nvidia
cuda-cudart               11.8.89                       0    nvidia
cuda-cudart-dev           11.8.89                       0    nvidia
cuda-cuobjdump            11.8.86                       0    nvidia
cuda-cupti                11.8.87                       0    nvidia
cuda-cuxxfilt             11.8.86                       0    nvidia
cuda-documentation        11.8.86                       0    nvidia
cuda-driver-dev           11.8.89                       0    nvidia
cuda-gdb                  11.8.86                       0    nvidia
cuda-libraries            11.8.0                        0    nvidia
cuda-libraries-dev        11.8.0                        0    nvidia
cuda-memcheck             11.8.86                       0    nvidia
cuda-nsight               11.8.86                       0    nvidia
cuda-nsight-compute       11.8.0                        0    nvidia
cuda-nvcc                 11.8.89                       0    nvidia
cuda-nvdisasm             11.8.86                       0    nvidia
cuda-nvml-dev             11.8.86                       0    nvidia
cuda-nvprof               11.8.87                       0    nvidia
cuda-nvprune              11.8.86                       0    nvidia
cuda-nvrtc                11.8.89                       0    nvidia
cuda-nvrtc-dev            11.8.89                       0    nvidia
cuda-nvtx                 11.8.86                       0    nvidia
cuda-nvvp                 11.8.87                       0    nvidia
cuda-profiler-api         11.8.86                       0    nvidia
cuda-python               11.8.0           py39h3fd9d12_0    nvidia
cuda-sanitizer-api        11.8.86                       0    nvidia
cuda-toolkit              11.8.0                        0    nvidia
cuda-tools                11.8.0                        0    nvidia
cuda-visual-tools         11.8.0                        0    nvidia
cudatoolkit               11.5.1               hcf5317a_9    nvidia
cudf                      22.12.00a221130 cuda_11_py39_geb271044c2_307    rapidsai-nightly
cuml                      22.12.00a221130 cuda11_py39_gb962396dc_51    rapidsai-nightly
cupy                      11.3.0           py39hc3c280e_1    conda-forge
cycler                    0.11.0             pyhd8ed1ab_0    conda-forge
cytoolz                   0.12.0           py39hb9d737c_1    conda-forge
dask                      2022.11.1          pyhd8ed1ab_0    conda-forge
dask-core                 2022.11.1          pyhd8ed1ab_0    conda-forge
dask-cuda                 22.12.00a221130 py39_g55375b8_33    rapidsai-nightly
dask-cudf                 22.12.00a221130 cuda_11_py39_geb271044c2_307    rapidsai-nightly
dask-sql                  2022.8.0+99.g73366d6.dirty          pypi_0    pypi
databricks-cli            0.17.3             pyhd8ed1ab_0    conda-forge
deap                      1.3.3            py39h4661b88_1    conda-forge
distlib                   0.3.6              pyhd8ed1ab_0    conda-forge
distributed               2022.11.1          pyhd8ed1ab_0    conda-forge
dlpack                    0.5                  h9c3ff4c_0    conda-forge
docker-py                 6.0.0              pyhd8ed1ab_0    conda-forge
docutils                  0.19             py39hf3d152e_1    conda-forge
entrypoints               0.4                pyhd8ed1ab_0    conda-forge
exceptiongroup            1.0.4              pyhd8ed1ab_0    conda-forge
execnet                   1.9.0              pyhd8ed1ab_0    conda-forge
faiss-proc                1.0.0                      cuda    rapidsai
fastapi                   0.88.0             pyhd8ed1ab_0    conda-forge
fastavro                  1.7.0            py39hb9d737c_0    conda-forge
fastrlock                 0.8              py39h5a03fae_3    conda-forge
filelock                  3.8.0              pyhd8ed1ab_0    conda-forge
flask                     2.2.2              pyhd8ed1ab_0    conda-forge
fonttools                 4.38.0           py39hb9d737c_1    conda-forge
freetype                  2.12.1               hca18f0e_1    conda-forge
fs                        2.4.15             pyhd8ed1ab_0    conda-forge
fsspec                    2022.11.0          pyhd8ed1ab_0    conda-forge
fugue                     0.7.3              pyhd8ed1ab_0    conda-forge
fugue-sql-antlr           0.1.1              pyhd8ed1ab_0    conda-forge
future                    0.18.2             pyhd8ed1ab_6    conda-forge
gds-tools                 1.4.0.31                      0    nvidia
gflags                    2.2.2             he1b5a44_1004    conda-forge
gitdb                     4.0.10             pyhd8ed1ab_0    conda-forge
gitpython                 3.1.29             pyhd8ed1ab_0    conda-forge
glog                      0.6.0                h6f12383_0    conda-forge
greenlet                  2.0.1            py39h5a03fae_0    conda-forge
grpc-cpp                  1.47.1               hbad87ad_6    conda-forge
gunicorn                  20.1.0           py39hf3d152e_3    conda-forge
h11                       0.14.0             pyhd8ed1ab_0    conda-forge
heapdict                  1.0.1                      py_0    conda-forge
identify                  2.5.9              pyhd8ed1ab_0    conda-forge
idna                      3.4                pyhd8ed1ab_0    conda-forge
imagesize                 1.4.1              pyhd8ed1ab_0    conda-forge
importlib-metadata        5.1.0              pyha770c72_0    conda-forge
importlib_resources       5.10.0             pyhd8ed1ab_0    conda-forge
iniconfig                 1.1.1              pyh9f0ad1d_0    conda-forge
intake                    0.6.6              pyhd8ed1ab_0    conda-forge
itsdangerous              2.1.2              pyhd8ed1ab_0    conda-forge
jinja2                    3.1.2              pyhd8ed1ab_1    conda-forge
joblib                    1.2.0              pyhd8ed1ab_0    conda-forge
jpeg                      9e                   h166bdaf_2    conda-forge
jsonschema                4.17.3             pyhd8ed1ab_0    conda-forge
keyutils                  1.6.1                h166bdaf_0    conda-forge
kiwisolver                1.4.4            py39hf939315_1    conda-forge
krb5                      1.19.3               h3790be6_0    conda-forge
lcms2                     2.14                 h6ed2654_0    conda-forge
ld_impl_linux-64          2.39                 hcc3a1bd_1    conda-forge
lerc                      4.0.0                h27087fc_0    conda-forge
libabseil                 20220623.0      cxx17_h48a1fff_5    conda-forge
libblas                   3.9.0           16_linux64_openblas    conda-forge
libbrotlicommon           1.0.9                h166bdaf_8    conda-forge
libbrotlidec              1.0.9                h166bdaf_8    conda-forge
libbrotlienc              1.0.9                h166bdaf_8    conda-forge
libcblas                  3.9.0           16_linux64_openblas    conda-forge
libcrc32c                 1.1.2                h9c3ff4c_0    conda-forge
libcublas                 11.11.3.6                     0    nvidia
libcublas-dev             11.11.3.6                     0    nvidia
libcudf                   22.12.00a221130 cuda11_geb271044c2_307    rapidsai-nightly
libcufft                  10.9.0.58                     0    nvidia
libcufft-dev              10.9.0.58                     0    nvidia
libcufile                 1.4.0.31                      0    nvidia
libcufile-dev             1.4.0.31                      0    nvidia
libcuml                   22.12.00a221130 cuda11_gb962396dc_51    rapidsai-nightly
libcumlprims              22.12.00a221010 cuda11_geaadb5e_2    rapidsai-nightly
libcurand                 10.3.0.86                     0    nvidia
libcurand-dev             10.3.0.86                     0    nvidia
libcurl                   7.86.0               h7bff187_1    conda-forge
libcusolver               11.4.1.48                     0    nvidia
libcusolver-dev           11.4.1.48                     0    nvidia
libcusparse               11.7.5.86                     0    nvidia
libcusparse-dev           11.7.5.86                     0    nvidia
libdeflate                1.14                 h166bdaf_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libevent                  2.1.10               h9b69904_4    conda-forge
libfaiss                  1.7.0           cuda112h5bea7ad_8_cuda    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 12.2.0              h65d4601_19    conda-forge
libgfortran-ng            12.2.0              h69a702a_19    conda-forge
libgfortran5              12.2.0              h337968e_19    conda-forge
libgomp                   12.2.0              h65d4601_19    conda-forge
libgoogle-cloud           2.1.0                h9ebe8e8_2    conda-forge
liblapack                 3.9.0           16_linux64_openblas    conda-forge
libllvm11                 11.1.0               he0ac6c6_5    conda-forge
libnghttp2                1.47.0               hdcd2b5c_1    conda-forge
libnpp                    11.8.0.86                     0    nvidia
libnpp-dev                11.8.0.86                     0    nvidia
libnsl                    2.0.0                h7f98852_0    conda-forge
libnvjpeg                 11.9.0.86                     0    nvidia
libnvjpeg-dev             11.9.0.86                     0    nvidia
libopenblas               0.3.21          pthreads_h78a6416_3    conda-forge
libpng                    1.6.39               h753d276_0    conda-forge
libpq                     14.5                 hd77ab85_1    conda-forge
libprotobuf               3.20.2               h6239696_0    conda-forge
libraft-distance          22.12.00a221130 cuda11_g11c5105_136    rapidsai-nightly
libraft-headers           22.12.00a221130 cuda11_g11c5105_136    rapidsai-nightly
libraft-nn                22.12.00a221130 cuda11_g11c5105_136    rapidsai-nightly
librmm                    22.12.00a221130 cuda11_gda7036aa_57    rapidsai-nightly
libsodium                 1.0.18               h36c2ea0_1    conda-forge
libsqlite                 3.40.0               h753d276_0    conda-forge
libssh2                   1.10.0               haa6b8db_3    conda-forge
libstdcxx-ng              12.2.0              h46fd767_19    conda-forge
libthrift                 0.16.0               h491838f_2    conda-forge
libtiff                   4.4.0                h55922b4_4    conda-forge
libutf8proc               2.8.0                h166bdaf_0    conda-forge
libuuid                   2.32.1            h7f98852_1000    conda-forge
libwebp-base              1.2.4                h166bdaf_0    conda-forge
libxcb                    1.13              h7f98852_1004    conda-forge
libxgboost                1.6.2dev.rapidsai22.12       cuda_11_0    rapidsai-nightly
libzlib                   1.2.13               h166bdaf_4    conda-forge
lightgbm                  3.3.3            py39h5a03fae_1    conda-forge
llvmlite                  0.39.1           py39h7d9a04d_1    conda-forge
locket                    1.0.0              pyhd8ed1ab_0    conda-forge
lz4                       4.0.2            py39h029007f_0    conda-forge
lz4-c                     1.9.3                h9c3ff4c_1    conda-forge
mako                      1.2.4              pyhd8ed1ab_0    conda-forge
markdown                  3.4.1              pyhd8ed1ab_0    conda-forge
markupsafe                2.1.1            py39hb9d737c_2    conda-forge
matplotlib-base           3.6.2            py39hf9fd14e_0    conda-forge
maturin                   0.14.2           py39h4ef89ea_0    conda-forge
mlflow                    2.0.1            py39ha39b057_1    conda-forge
mock                      4.0.3              pyhd8ed1ab_4    conda-forge
msgpack-python            1.0.4            py39hf939315_1    conda-forge
munkres                   1.1.4              pyh9f0ad1d_0    conda-forge
nccl                      2.14.3.1             h0800d71_0    conda-forge
ncurses                   6.3                  h27087fc_1    conda-forge
nest-asyncio              1.5.6              pyhd8ed1ab_0    conda-forge
nodeenv                   1.7.0              pyhd8ed1ab_0    conda-forge
nsight-compute            2022.3.0.22                   0    nvidia
numba                     0.56.4           py39h61ddf18_0    conda-forge
numpy                     1.23.5           py39h3d75532_0    conda-forge
nvtx                      0.2.3            py39hb9d737c_2    conda-forge
oauthlib                  3.2.2              pyhd8ed1ab_0    conda-forge
openjpeg                  2.5.0                h7d73246_1    conda-forge
openssl                   1.1.1s               h166bdaf_0    conda-forge
orc                       1.7.6                h6c59b99_0    conda-forge
packaging                 21.3               pyhd8ed1ab_0    conda-forge
pandas                    1.5.2            py39h4661b88_0    conda-forge
paramiko                  2.12.0             pyhd8ed1ab_0    conda-forge
parquet-cpp               1.5.1                         2    conda-forge
partd                     1.3.0              pyhd8ed1ab_0    conda-forge
pillow                    9.2.0            py39hf3a2cdf_3    conda-forge
pip                       22.3.1             pyhd8ed1ab_0    conda-forge
pkgutil-resolve-name      1.3.10             pyhd8ed1ab_0    conda-forge
platformdirs              2.5.2              pyhd8ed1ab_1    conda-forge
pluggy                    1.0.0              pyhd8ed1ab_5    conda-forge
pre-commit                2.20.0           py39hf3d152e_1    conda-forge
prometheus_client         0.15.0             pyhd8ed1ab_0    conda-forge
prometheus_flask_exporter 0.21.0             pyhd8ed1ab_0    conda-forge
prompt-toolkit            3.0.33             pyha770c72_0    conda-forge
prompt_toolkit            3.0.33               hd8ed1ab_0    conda-forge
protobuf                  3.20.2           py39h5a03fae_1    conda-forge
psutil                    5.9.4            py39hb9d737c_0    conda-forge
psycopg2                  2.9.3            py39hb9d737c_1    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
ptxcompiler               0.7.0            py39h1eff087_2    conda-forge
pure-sasl                 0.6.2              pyhd8ed1ab_0    conda-forge
py                        1.11.0             pyh6c4a22f_0    conda-forge
py-xgboost                1.6.2dev.rapidsai22.12  cuda_11_py39_0    rapidsai-nightly
pyarrow                   9.0.0           py39hc0775d8_2_cpu    conda-forge
pycparser                 2.21               pyhd8ed1ab_0    conda-forge
pydantic                  1.10.2           py39hb9d737c_1    conda-forge
pygments                  2.13.0             pyhd8ed1ab_0    conda-forge
pyhive                    0.6.5              pyhd8ed1ab_0    conda-forge
pyjwt                     2.6.0              pyhd8ed1ab_0    conda-forge
pylibraft                 22.12.00a221130 cuda11_py39_g11c5105_136    rapidsai-nightly
pynacl                    1.5.0            py39hb9d737c_2    conda-forge
pynvml                    11.4.1             pyhd8ed1ab_0    conda-forge
pyopenssl                 22.1.0             pyhd8ed1ab_0    conda-forge
pyparsing                 3.0.9              pyhd8ed1ab_0    conda-forge
pyrsistent                0.19.2           py39hb9d737c_0    conda-forge
pysocks                   1.7.1              pyha2e5f31_6    conda-forge
pytest                    7.2.0              pyhd8ed1ab_2    conda-forge
pytest-cov                4.0.0              pyhd8ed1ab_0    conda-forge
pytest-xdist              3.0.2              pyhd8ed1ab_0    conda-forge
python                    3.9.15          h47a2c10_0_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python-tzdata             2022.6             pyhd8ed1ab_0    conda-forge
python_abi                3.9                      3_cp39    conda-forge
pytz                      2022.6             pyhd8ed1ab_0    conda-forge
pytz-deprecation-shim     0.1.0.post0      py39hf3d152e_3    conda-forge
pywin32-on-windows        0.1.0              pyh1179c8e_3    conda-forge
pyyaml                    6.0              py39hb9d737c_5    conda-forge
qpd                       0.3.3              pyhd8ed1ab_0    conda-forge
querystring_parser        1.2.4                      py_0    conda-forge
raft-dask                 22.12.00a221130 cuda11_py39_g11c5105_136    rapidsai-nightly
re2                       2022.06.01           h27087fc_1    conda-forge
readline                  8.1.2                h0f457ee_0    conda-forge
requests                  2.28.1             pyhd8ed1ab_1    conda-forge
rmm                       22.12.00a221130 cuda11_py39_gda7036aa_57    rapidsai-nightly
s2n                       1.0.10               h9b69904_0    conda-forge
scikit-learn              1.1.3            py39hd5c8da3_1    conda-forge
scipy                     1.9.3            py39hddc5342_2    conda-forge
semantic_version          2.10.0             pyhd8ed1ab_0    conda-forge
setuptools                65.5.1             pyhd8ed1ab_0    conda-forge
setuptools-rust           1.5.2              pyhd8ed1ab_0    conda-forge
shap                      0.41.0           py39h1832856_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
slicer                    0.0.7              pyhd8ed1ab_0    conda-forge
smmap                     3.0.5              pyh44b312d_0    conda-forge
snappy                    1.1.9                hbd366e4_2    conda-forge
sniffio                   1.3.0              pyhd8ed1ab_0    conda-forge
snowballstemmer           2.2.0              pyhd8ed1ab_0    conda-forge
sortedcontainers          2.4.0              pyhd8ed1ab_0    conda-forge
spdlog                    1.8.5                h4bd325d_1    conda-forge
sphinx                    5.3.0              pyhd8ed1ab_0    conda-forge
sphinxcontrib-applehelp   1.0.2                      py_0    conda-forge
sphinxcontrib-devhelp     1.0.2                      py_0    conda-forge
sphinxcontrib-htmlhelp    2.0.0              pyhd8ed1ab_0    conda-forge
sphinxcontrib-jsmath      1.0.1                      py_0    conda-forge
sphinxcontrib-qthelp      1.0.3                      py_0    conda-forge
sphinxcontrib-serializinghtml 1.1.5              pyhd8ed1ab_2    conda-forge
sqlalchemy                1.4.44           py39hb9d737c_0    conda-forge
sqlparse                  0.4.3              pyhd8ed1ab_0    conda-forge
starlette                 0.22.0             pyhd8ed1ab_0    conda-forge
stopit                    1.1.2                      py_0    conda-forge
tabulate                  0.9.0              pyhd8ed1ab_1    conda-forge
tblib                     1.7.0              pyhd8ed1ab_0    conda-forge
threadpoolctl             3.1.0              pyh8a188c0_0    conda-forge
thrift                    0.17.0           py39h5a03fae_0    conda-forge
thrift_sasl               0.4.3              pyhd8ed1ab_2    conda-forge
tk                        8.6.12               h27826a3_0    conda-forge
toml                      0.10.2             pyhd8ed1ab_0    conda-forge
tomli                     2.0.1              pyhd8ed1ab_0    conda-forge
toolz                     0.12.0             pyhd8ed1ab_0    conda-forge
tornado                   6.1              py39hb9d737c_3    conda-forge
tpot                      0.11.7             pyhd8ed1ab_1    conda-forge
tqdm                      4.64.1             pyhd8ed1ab_0    conda-forge
treelite                  3.0.0            py39hc7ff369_1    conda-forge
treelite-runtime          3.0.0                    pypi_0    pypi
triad                     0.7.0              pyhd8ed1ab_0    conda-forge
typing-extensions         4.4.0                hd8ed1ab_0    conda-forge
typing_extensions         4.4.0              pyha770c72_0    conda-forge
tzdata                    2022g                h191b570_0    conda-forge
tzlocal                   4.2              py39hf3d152e_2    conda-forge
ucx                       1.13.1               h538f049_0    conda-forge
ucx-proc                  1.0.0                       gpu    rapidsai
ucx-py                    0.29.00a221129  py39_g707b335_22    rapidsai-nightly
ukkonen                   1.0.1            py39hf939315_3    conda-forge
unicodedata2              15.0.0           py39hb9d737c_0    conda-forge
update_checker            0.18.0             pyh9f0ad1d_0    conda-forge
urllib3                   1.26.13            pyhd8ed1ab_0    conda-forge
uvicorn                   0.20.0           py39hf3d152e_1    conda-forge
virtualenv                20.17.0          py39hf3d152e_0    conda-forge
wcwidth                   0.2.5              pyh9f0ad1d_2    conda-forge
websocket-client          1.4.2              pyhd8ed1ab_0    conda-forge
werkzeug                  2.2.2              pyhd8ed1ab_0    conda-forge
wheel                     0.38.4             pyhd8ed1ab_0    conda-forge
xgboost                   1.6.2dev.rapidsai22.12  cuda_11_py39_0    rapidsai-nightly
xorg-libxau               1.0.9                h7f98852_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
yaml                      0.2.5                h7f98852_2    conda-forge
zict                      2.2.0              pyhd8ed1ab_0    conda-forge
zipp                      3.11.0             pyhd8ed1ab_0    conda-forge
zlib                      1.2.13               h166bdaf_4    conda-forge
zstd                      1.5.2                h6239696_4    conda-forge

Collaborator

Hmm, I still don't seem to be able to reproduce; the only difference in my environment is that it's missing _py-xgboost-mutex, but even installing it manually doesn't seem to impact anything.

Why don't we pull in the latest changes to see if these failures crop up in gpuCI?

@ayushdg
Collaborator

ayushdg commented Nov 30, 2022

@ChrisJar are the changes in dask/dask#9563 needed only to enable this functionality on GPUs, or are they needed for this functionality in general?

@ChrisJar
Collaborator Author

Those changes are actually not needed for this PR at all. I ended up taking a different path.

for operand in operands:
    if is_frame(operand):
        # Check if frame evaluates to nan or NA
        if len(operand) == 1 and not operand.isnull().all().compute():
Collaborator

In what case could we have an operand whose length is 1 but that is also interpreted as a frame?

Collaborator

This comes up when we pass through aggregations as input; if we attempt something like coalesce(sum(b), 3), sum(b) will be passed through here as a len 1 series representing a "scalar", which we must somehow distinguish from a series that would otherwise represent a standard column.

There's some additional context in #823 (comment), which I ended up resolving with 2000932
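
The distinction described above can be sketched in plain Python, with lists standing in for series. This is an illustration under stated assumptions, not the code merged in this PR: `coalesce`, `fill_nulls`, and `is_null` are hypothetical names, and a length-1 list is always treated as an aggregate "scalar", which is the same ambiguity raised in the question above.

```python
import math


def is_null(value):
    """Treat both None and float NaN as SQL NULL (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def fill_nulls(column, replacement):
    """Fill null slots of a column from a scalar or from a same-length column."""
    if isinstance(replacement, list):
        return [r if is_null(c) else c for c, r in zip(column, replacement)]
    return [replacement if is_null(c) else c for c in column]


def coalesce(*operands):
    """Return the first non-null value per row across scalars and columns."""
    result = None  # no column operand seen yet
    for operand in operands:
        if isinstance(operand, list) and len(operand) == 1:
            # A length-1 "frame" is assumed to be an aggregate such as SUM(b),
            # so it is unwrapped and handled like a scalar.
            operand = operand[0]
        if isinstance(operand, list):
            # A regular column fills the rows that are still null.
            result = operand if result is None else fill_nulls(result, operand)
        elif not is_null(operand):
            # A non-null scalar fills every remaining null and ends the search.
            return operand if result is None else fill_nulls(result, operand)
    return result


print(coalesce([math.nan], 3))                        # 3
print(coalesce([2.0], 3))                             # 2.0
print(coalesce([None, 1.0, None], [None, None, 2.0])) # [None, 1.0, 2.0]
print(coalesce([None, 1.0, None], 0.0))               # [0.0, 1.0, 0.0]
```

A real one-row column would also be unwrapped by the length check in this sketch, which is why the approach is described in the thread as workable but not ideal.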

Collaborator

Makes sense. It isn't ideal, but I don't think there's a better way to accomplish this today.


5 participants