[BUG] CudaIllegalAccessException calling slice from map_partitions #4850
Comments
Click here to see environment details
Click here to see conda list json for last successful execution[ { "base_url": "https://conda.anaconda.org/conda-forge", "build_number": 0, "build_string": "conda_forge", "channel": "conda-forge", "dist_name": "_libgcc_mutex-0.1-conda_forge", "name": "_libgcc_mutex", "platform": "linux-64", "version": "0.1" }, { "base_url": "https://conda.anaconda.org/conda-forge", "build_number": 1, "build_string": "1_llvm", "channel": "conda-forge", "dist_name": "_openmp_mutex-4.5-1_llvm", "name": "_openmp_mutex", "platform": "linux-64", "version": "4.5" }, { "base_url": "https://conda.anaconda.org/conda-forge", "build_number": 2, "build_string": "py37h090bef1_2", "channel": "conda-forge", "dist_name": "arrow-cpp-0.15.0-py37h090bef1_2", "name": "arrow-cpp", "platform": "linux-64", "version": "0.15.0" }, { "base_url": "https://conda.anaconda.org/pypi", "build_number": 0, "build_string": "pypi_0", "channel": "pypi", "dist_name": "backcall-0.1.0-pypi_0", "name": "backcall", "platform": "pypi", "version": "0.1.0" }, { "base_url": "https://conda.anaconda.org/conda-forge", "build_number": 0, "build_string": "py37hc8dfbb8_0", "channel": "conda-forge", "dist_name": "bokeh-2.0.0-py37hc8dfbb8_0", "name": "bokeh", "platform": "linux-64", "version": "2.0.0" }, { "base_url": "https://conda.anaconda.org/conda-forge", "build_number": 2, "build_string": "h8e57a91_2", "channel": "conda-forge", "dist_name": "boost-cpp-1.70.0-h8e57a91_2", "name": "boost-cpp", "platform": "linux-64", "version": "1.70.0" }, { "base_url": "https://conda.anaconda.org/pypi", "build_number": 0, "build_string": "pypi_0", "channel": "pypi", "dist_name": "botocore-1.15.32-pypi_0", "name": "botocore", "platform": "pypi", "version": "1.15.32" }, { "base_url": "https://conda.anaconda.org/conda-forge", "build_number": 1001, "build_string": "he1b5a44_1001", "channel": "conda-forge", "dist_name": "brotli-1.0.7-he1b5a44_1001", "name": "brotli", "platform": "linux-64", "version": "1.0.7" }, { "base_url": 
"https://conda.anaconda.org/conda-forge", "build_number": 2, "build_string": "h516909a_2", "channel": "conda-forge", "dist_name": "bzip2-1.0.8-h516909a_2", "name": "bzip2", "platform": "linux-64", "version": "1.0.8" }, { "base_url": "https://conda.anaconda.org/conda-forge", "build_number": 1001, "build_string": "h516909a_1001", "channel": "conda-forge", "dist_name": "c-ares-1.15.0-h516909a_1001", "name": "c-ares", "platform": "linux-64", "version": "1.15.0" }, { "base_url": "https://conda.anaconda.org/conda-forge", "build_number": 0, "build_string": "hecc5488_0", "channel": "conda-forge", "dist_name": "ca-certificates-2019.11.28-hecc5488_0", "name": "ca-certificates", "platform": "linux-64", "version": "2019.11.28" }, { "base_url": "https://conda.anaconda.org/conda-forge", "build_number": 1003, "build_string": "hcf35c78_1003", "channel": "conda-forge", "dist_name": "cairo-1.16.0-hcf35c78_1003", "name": "cairo", "platform": "linux-64", "version": "1.16.0" }, { "base_url": "https://conda.anaconda.org/conda-forge", "build_number": 1, "build_string": "py37hc8dfbb8_1", "channel": "conda-forge", "dist_name": "certifi-2019.11.28-py37hc8dfbb8_1", "name": "certifi", "platform": "linux-64", "version": "2019.11.28" }, { "base_url": "https://conda.anaconda.org/conda-forge", "build_number": 2, "build_string": "hb60a0a2_2", "channel": "conda-forge", "dist_name": "cfitsio-3.470-hb60a0a2_2", "name": "cfitsio", "platform": "linux-64", "version": "3.470" }, { "base_url": "https://conda.anaconda.org/pypi", "build_number": 0, "build_string": "pypi_0", "channel": "pypi", "dist_name": "chardet-3.0.4-pypi_0", "name": "chardet", "platform": "pypi", "version": "3.0.4" |
Simple repro
@kevingerman All of this boils down to the same map_partitions issue: you need to provide meta. When you don't provide meta, map_partitions tries to infer the meta information by passing dummy data such as "foo" to your function. That data is not a number, so the call ultimately fails. You would find the same issue with pandas as well. For reference, see dask/dask#6078 and #4836.
That behavior changed between 0.13 and 0.14. Is it an intentional change of behavior?
@kevingerman Earlier, cudf would return 0 for any non-numeric input string, but due to some recent changes, an empty string (which in our case is formed by the slice) would produce
That does raise a good point for my original workflow and the importance of using meta fields to guard against bad data. However, in the original repro script every value was an 8-character string, and the slice(4,6) call should have always produced a series of all '01'.
As mentioned in the dask issue, when map_partitions is called without meta, it tries to get the meta information by sending a sample string to your function. This sample string is something like cat, dog, or foo, not the data you provided. Slicing it results in an empty string, and the conversion fails.
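The mechanism can be sketched in plain Python, without dask or cudf. The helper `month_from_date` is hypothetical and only stands in for whatever function is passed to map_partitions. Note that plain Python raises a `ValueError` for the empty string, whereas the report here is about cudf raising a CUDA exception in the same situation.

```python
def month_from_date(value: str) -> int:
    """Hypothetical helper: take characters 4-5 and convert them to an integer."""
    return int(value[4:6])


real_value = "20200115"   # 8-character string like those described in the report
placeholder = "foo"       # short dummy string of the kind used to infer meta

real_slice = real_value[4:6]          # characters 4-5 exist, so this is "01"
placeholder_slice = placeholder[4:6]  # "foo" has only 3 characters, so this is ""

real_result = month_from_date(real_value)

try:
    month_from_date(placeholder)
    placeholder_failed = False
except ValueError:
    # int("") is not a valid conversion, so the dummy input fails
    placeholder_failed = True
```

The real data converts cleanly, while the placeholder fails only because it is shorter than the slice range, which is why the failure appears even when every real value is well formed.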
I was able to verify that the error is caused by an all-empty strings column:
If any of those strings were not empty, the error would not occur. This is definitely a bug in libcudf. The code should be returning 0s for these and not throwing a CUDA exception. I will create a PR to fix the
@davidwendt but we were discussing adding
Better still, would it make sense to check for numeric values in general, rather than just integers or other specific types?
That is still necessary for invalid characters. But I think an empty string (or bad characters) should not be causing a CUDA exception. You should get 0 like the other converters do.
Describe the bug
Code is the best description. This worked in 0.13, but stopped working as of 2020/03/24 build.
Steps/Code to reproduce bug
Expected behavior
Returns a dask dataframe rather than raising an exception.
Environment overview (please complete the following information)
Environment details
In comments
Additional context
The last known working conda environment listing (JSON) follows as a comment.