[BUG] groupby result cache error in dask-cudf test #9507
Fixes #9507

Prevents inserting into the groupby result cache if the result for a <column, aggregation> pair is already present in the cache. Added a unit test to cover this.

**Details:**
When `add_result(col1, agg1, column1); add_result(col1, agg1, column2);` is called (that is, `add_result` runs twice for the same pair), `_cache` no longer contains any value for {col1, agg1}.

The issue is in `_cache`, a `std::unordered_map` with a `std::reference_wrapper<aggregation const>` in its key. When `_cache[{input, key}] = std::move(value);` executes a second time, the old key object is destroyed, but the reference held in the map's key is never updated and still points to the destroyed object. When keys are compared again, `pair_column_aggregation_equal_to` fails because it compares against a destroyed object (whose memory may have been overwritten).

See #9507 (comment).

Authors:
- Karthikeyan (https://github.com/karthikeyann)

Approvers:
- David Wendt (https://github.com/davidwendt)
- Nghia Truong (https://github.com/ttnghia)
- Bradley Dice (https://github.com/bdice)

URL: #9508
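The guard described in the fix can be sketched with a simplified stand-in for the cache. This is not the libcudf implementation: `Aggregation`, `Column`, `ResultCache`, `KeyHash` and `KeyEqual` are hypothetical names, and the layout (the mapped value owns the aggregation that the key refers to) is an assumption made for illustration. The point it shows is that a second `add_result` for the same pair returns early instead of assigning through `operator[]`, so the object the stored key refers to is never destroyed.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <stdexcept>
#include <unordered_map>
#include <utility>

// Hypothetical stand-ins for cudf::aggregation and cudf::column.
struct Aggregation { int kind; };
struct Column { int payload; };

// The key holds a non-owning reference to an aggregation.
using Key = std::pair<int, std::reference_wrapper<Aggregation const>>;

struct KeyHash {
  std::size_t operator()(Key const& k) const {
    return std::hash<int>{}(k.first) * 31u + std::hash<int>{}(k.second.get().kind);
  }
};

struct KeyEqual {
  bool operator()(Key const& a, Key const& b) const {
    return a.first == b.first && a.second.get().kind == b.second.get().kind;
  }
};

class ResultCache {
 public:
  // Insert only when no result exists for (col, agg). Assigning through
  // operator[] would keep the existing key but destroy the old value,
  // leaving the key's reference pointing at a destroyed aggregation.
  void add_result(int col, Aggregation const& agg, std::unique_ptr<Column> result) {
    if (cache_.find(Key{col, std::cref(agg)}) != cache_.end()) { return; }
    auto owned = std::make_unique<Aggregation>(agg);  // owned by the mapped value
    Key stored{col, std::cref(*owned)};               // key refers to the owned copy
    cache_.emplace(stored, Value{std::move(owned), std::move(result)});
  }

  bool has_result(int col, Aggregation const& agg) const {
    return cache_.count(Key{col, std::cref(agg)}) != 0;
  }

  Column const& get_result(int col, Aggregation const& agg) const {
    auto it = cache_.find(Key{col, std::cref(agg)});
    if (it == cache_.end()) { throw std::logic_error("Result does not exist in cache"); }
    return *it->second.second;
  }

 private:
  using Value = std::pair<std::unique_ptr<Aggregation>, std::unique_ptr<Column>>;
  std::unordered_map<Key, Value, KeyHash, KeyEqual> cache_;
};
```

With this guard, calling `add_result` twice for the same pair keeps the first result, and a later `get_result` finds it rather than throwing.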
Describe the bug
@bdice pointed out a recurring issue in gpuCI from dask-cudf in test_groupby_reset_index_names. I see the message:
RuntimeError: cuDF failure at: ../src/aggregation/result_cache.cpp:42: Result does not exist in cache
@karthikeyann recently edited result_cache.cpp, and we discussed this briefly earlier in the week. I don't think we were able to identify exactly what was going wrong. Are others seeing this? Any thoughts on what might be the cause? It failed on CentOS 7 both times, but with different CUDA versions, Python versions, and GPUs.
Examples:
dask_cudf.tests.test_groupby.test_groupby_reset_index_names (from pytest)
Error Message
RuntimeError: cuDF failure at: ../src/aggregation/result_cache.cpp:42: Result does not exist in cache
Stacktrace
Steps/Code to reproduce bug
Reproduced the error using minimal cuDF Python code.
Run this code with:
compute-sanitizer --tool memcheck python code.py
thanks 🙏 @bdice @jakirkham