scatter-based nested-type scalar copy_if_else #1

Merged
merged 2 commits into copy_if_else_8361 from copy_if_else_scalar_scatter
Jul 9, 2021

Conversation

gerashegalov
Owner

@gerashegalov gerashegalov commented Jul 2, 2021

Using scatter with the scalar vector. This works for lists, but structs require more work:

C++ exception with description "cuDF failure at: ../src/copying/scatter.cu:162: scatter scalar to struct_view not implemented" thrown in the test body.
[  FAILED  ] TypedCopyIfElseNestedTest/0.ScalarStructBothInvalid, where TypeParam = signed char (2 ms)
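
To illustrate the idea behind the approach, here is a minimal host-side sketch of a scatter-based scalar `copy_if_else` for list rows. It is not the cuDF implementation: `ListRow`, `ListColumn`, `indices_where_true`, `scatter_scalar`, and `copy_if_else_scalar` are hypothetical names, and `std::optional` over `std::vector` merely stands in for a nullable list column. The sketch assumes the scalar is the left-hand input, so the scalar is written wherever the mask is true and the right-hand row is kept elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for a nullable list column: each row is either null
// (std::nullopt) or a list of ints. This is not the cuDF column type.
using ListRow    = std::optional<std::vector<int>>;
using ListColumn = std::vector<ListRow>;

// Collect the row indices where the boolean mask is true (the scatter map).
std::vector<std::size_t> indices_where_true(const std::vector<bool>& mask) {
  std::vector<std::size_t> indices;
  for (std::size_t i = 0; i < mask.size(); ++i) {
    if (mask[i]) indices.push_back(i);
  }
  return indices;
}

// Scatter one scalar row into a copy of `target` at every index in `scatter_map`.
ListColumn scatter_scalar(const ListRow& scalar,
                          const std::vector<std::size_t>& scatter_map,
                          ListColumn target) {
  for (std::size_t idx : scatter_map) {
    assert(idx < target.size());
    target[idx] = scalar;
  }
  return target;
}

// copy_if_else(scalar, column, mask): take the scalar where the mask is true,
// otherwise keep the row from `rhs`.
ListColumn copy_if_else_scalar(const ListRow& lhs_scalar,
                               const ListColumn& rhs,
                               const std::vector<bool>& mask) {
  assert(mask.size() == rhs.size());
  return scatter_scalar(lhs_scalar, indices_where_true(mask), rhs);
}
```

In this sketch a null scalar or a null row needs no special handling because the row type carries its own validity. The struct failure quoted above comes from the scatter step itself, which reports that scattering a scalar into a struct column is not implemented.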

@gerashegalov gerashegalov changed the title scalar scatter [WIP] scatter-based nested-type scalar copy_if_else Jul 2, 2021
@gerashegalov gerashegalov marked this pull request as ready for review July 9, 2021 17:28
@gerashegalov gerashegalov changed the title [WIP] scatter-based nested-type scalar copy_if_else scatter-based nested-type scalar copy_if_else Jul 9, 2021
@gerashegalov gerashegalov merged commit 6d1cff9 into copy_if_else_8361 Jul 9, 2021
@gerashegalov gerashegalov deleted the copy_if_else_scalar_scatter branch July 9, 2021 17:30
gerashegalov pushed a commit that referenced this pull request Nov 28, 2023
Pin conda packages to `aws-sdk-cpp<1.11`. The recent upgrade to version `1.11.*` has caused several cleanup issues (details on the changes are in [this link](https://github.com/aws/aws-sdk-cpp#version-111-is-now-available)), causing Distributed and Dask-CUDA processes to segfault. The stack for one of those crashes looks like the following:

```
(gdb) bt
#0  0x00007f5125359a0c in Aws::Utils::Logging::s_aws_logger_redirect_get_log_level(aws_logger*, unsigned int) () from /opt/conda/envs/dask/lib/python3.9/site-packages/pyarrow/../../.././libaws-cpp-sdk-core.so
#1  0x00007f5124968f83 in aws_event_loop_thread () from /opt/conda/envs/dask/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-io.so.1.0.0
#2  0x00007f5124ad9359 in thread_fn () from /opt/conda/envs/dask/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1
#3  0x00007f519958f6db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#4  0x00007f5198b1361f in clone () from /lib/x86_64-linux-gnu/libc.so.6
```

Such segfaults now manifest frequently in CI, and in some cases are reproducible with a hit rate of ~30%. Given the approaching release, the safest option is probably to pin to an older version of the package until we pinpoint the exact cause of the issue and a patched build is released upstream.

`aws-sdk-cpp` is statically linked in the `pyarrow` pip package, which prevents us from using the same pinning technique there. cuDF is currently pinned to `pyarrow=12.0.1`, which seems to be built against `aws-sdk-cpp=1.10.*`, as per [recent build logs](https://github.com/apache/arrow/actions/runs/6276453828/job/17046177335?pr=37792#step:6:1372).

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Ray Douglass (https://github.com/raydouglass)

URL: rapidsai#14173