You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This PR resolves#7687. It also does a bit of cleanup of the internals of the code base. There is more that I would like to do, but I'll probably punt everything to a future PR that I don't directly have to touch for this change in the interest of quickly resolving the issue.
I still need help determining why a few aggregations aren't working. The problems fall into two groups:
1. The `var` and `std` aggregations currently don't fail, but they always return columns filled with NULLs. I found the implementation of the dispatch for these methods in `variance.cu`/`compound.cuh`, and at least nominally it seems like these methods _are not_ currently supported because the corresponding `enable_if_t` is based on whether the type satisfies `std::is_arithmetic`, which decimal types will not. However, I'm not sure whether the problem is that this classification is incorrect and these types are actually supported by `libcudf`, or if there really isn't an implementation; I tried to find one, but there are a lot of different files related to aggregation and I'm sure I didn't find all of them. If we simply don't have an implementation, I can remove these from the list of valid aggregations.
2. The `mean`, `quantile`, and `median` aggregations all raise a `RuntimeError` from `binaryop.hpp`: "Input must have fixed_point data_type." I've traced the error to the Cython `GroupBy.aggregate` method, specifically the line where it calls through to the underlying `c_obj`'s `aggregate` method. The call stack in C++ is pretty deep after that, though, and I haven't yet been able to pinpoint whether the failure is a missing cast somewhere (i.e. `libcudf` thinks that the column is a floating point type when it's really not) or if the problem lies elsewhere.
**Update**
Thanks to @codereport, I've now marked all the above as unsupported operations. After some discussion with other devs I've also handled the other extended types. I still need to write tests, but I think this PR is ready for review in its current form to identify if I've missed anything in the implementation.
Authors:
- Vyas Ramasubramani (https://github.com/vyasr)
Approvers:
- Ashwin Srinath (https://github.com/shwina)
- Keith Kraus (https://github.com/kkraus14)
URL: #7731
Example:
Result: Note that the
dec_val
column is dropped entirely.The text was updated successfully, but these errors were encountered: