[FEA] Allow to run groupby/reduction with externally derived aggregations #16633

ttnghia · 2024-08-21T21:06:44Z

This idea arose after many attempts to add new aggregations to the libcudf framework to accommodate specific use cases outside of cudf. However, most of the time, the application (the Spark plugin) wants very special behaviors that libcudf cannot accommodate. For example, for M2/MERGE_M2 aggregations, we want to output more than one column (the main M2 values as well as their intermediate values) for reuse elsewhere.
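To make the M2 case concrete, here is a minimal host-only sketch of why a single M2 column is not enough for a later merge. It is not libcudf code: `m2_partial`, `compute_m2`, and `merge_m2` are hypothetical names used only for illustration, and the sketch assumes the usual Welford update and pairwise merge formulas rather than whatever libcudf does internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-group partial result. Merging two partials needs the
// count and mean in addition to M2, hence the request for several outputs.
struct m2_partial {
  std::size_t count;
  double mean;
  double m2;  // sum of squared deviations from the mean
};

// Single-pass (Welford) computation of count, mean, and M2 for one group.
m2_partial compute_m2(std::vector<double> const& values)
{
  m2_partial r{0, 0.0, 0.0};
  for (double v : values) {
    ++r.count;
    double const delta = v - r.mean;
    r.mean += delta / static_cast<double>(r.count);
    r.m2 += delta * (v - r.mean);
  }
  return r;
}

// Merge two partials of the same group using the pairwise update formula.
m2_partial merge_m2(m2_partial const& a, m2_partial const& b)
{
  if (a.count == 0) { return b; }
  if (b.count == 0) { return a; }
  auto const na      = static_cast<double>(a.count);
  auto const nb      = static_cast<double>(b.count);
  double const n     = na + nb;
  double const delta = b.mean - a.mean;
  assert(n > 0.0);  // both partials are non-empty at this point
  return {a.count + b.count,
          a.mean + delta * nb / n,
          a.m2 + b.m2 + delta * delta * na * nb / n};
}
```

Merging `compute_m2({1, 2})` with `compute_m2({3, 4})` should give the same count, mean, and M2 as `compute_m2({1, 2, 3, 4})`, which only works because the intermediate count and mean travel with the M2 value.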

I would like to refactor the groupby/reduction framework so that it can run aggregations defined outside of libcudf. Downstream applications could then implement any new, customized aggregations they want and call libcudf code on them. The external aggregations just need to be implemented as classes derived from cudf base classes (cudf::groupby_aggregation, for example).

An extension point like this would be very beneficial in the long term: any downstream application could accommodate its own needs and maximize performance gains. It would also help reduce maintenance effort in the libcudf repository.


davidwendt · 2024-08-21T22:24:31Z

We actually have a precedent for custom UDF aggregations:

cudf/cpp/include/cudf/aggregation.hpp

Lines 588 to 600 in bf2ee32

    
    /**
     * @brief Factory to create an aggregation base on UDF for PTX or CUDA
     *
     * @param[in] type: either udf_type::PTX or udf_type::CUDA
     * @param[in] user_defined_aggregator A string containing the aggregator code
     * @param[in] output_type expected output type
     *
     * @return An aggregation containing a user-defined aggregator string
     */
    template <typename Base = aggregation>
    std::unique_ptr<Base> make_udf_aggregation(udf_type type,
                                               std::string const& user_defined_aggregator,
                                               data_type output_type);

Example usage in rolling here:

cudf/cpp/tests/rolling/grouped_rolling_test.cpp

Lines 207 to 208 in bf2ee32

    
    auto cuda_udf_agg = cudf::make_udf_aggregation<cudf::rolling_aggregation>(
      cudf::udf_type::CUDA, cuda_func, cudf::data_type{cudf::type_id::INT64});

This implements the `HOST_UDF` aggregation, which allows executing a host-side user-defined function (UDF) through the libcudf aggregation framework.

* A host-side function can be an arbitrary, independent function running on the host machine. It may or may not call device kernels, depending on its implementation.
* Such a user-defined function must follow the interface provided by libcudf (`cudf::host_udf_base`). The interface provides the ability to fully interact with the libcudf aggregation framework.
* Since it is implemented on the user application side, it has a very high degree of freedom to perform arbitrary operations to satisfy the user's needs.

Partially contributes to #16633.

---

Usage

1. Define a functor deriving from `cudf::host_udf_base` and implement the required virtual functions declared in that base struct. For example:
   ```
   struct my_aggregation : cudf::host_udf_base { ... };
   ```
2. Create an instance of the libcudf `HOST_UDF` aggregation, constructed from an instance of the functor defined above. For example:
   ```
   auto agg = cudf::make_host_udf_aggregation<cudf::groupby_aggregation>(
     std::make_unique<my_aggregation>());
   ```
3. Perform the aggregation operation on the created instance.

Authors:
- Nghia Truong (https://github.com/ttnghia)

Approvers:
- Yunsong Wang (https://github.com/PointKernel)
- Chong Gao (https://github.com/res-life)
- Vyas Ramasubramani (https://github.com/vyasr)
- David Wendt (https://github.com/davidwendt)

URL: #17592
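The three usage steps in the commit message above follow a common extension pattern: a base interface owned by the framework, a user-side functor deriving from it, and a factory that wraps the functor in an aggregation object. The sketch below mimics that pattern on plain host vectors so it can be compiled without libcudf. Everything in the `sketch` namespace is a hypothetical stand-in; the real virtual functions of `cudf::host_udf_base` are not shown in this thread and are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

namespace sketch {  // hypothetical stand-ins, NOT the libcudf API

// Plays the role of the framework-provided base interface.
struct host_udf_like {
  virtual ~host_udf_like() = default;
  // Produce one output value per group from values and their group labels.
  virtual std::vector<double> operator()(std::vector<double> const& values,
                                         std::vector<std::size_t> const& group_labels,
                                         std::size_t num_groups) const = 0;
};

// Plays the role of the aggregation object that owns the user functor.
class host_udf_aggregation {
 public:
  explicit host_udf_aggregation(std::unique_ptr<host_udf_like> udf) : udf_{std::move(udf)}
  {
    assert(udf_ != nullptr);
  }
  host_udf_like const& udf() const { return *udf_; }

 private:
  std::unique_ptr<host_udf_like> udf_;
};

// Step 2 analogue: factory that wraps a functor instance.
std::unique_ptr<host_udf_aggregation> make_host_udf_aggregation_like(
  std::unique_ptr<host_udf_like> udf)
{
  return std::make_unique<host_udf_aggregation>(std::move(udf));
}

// Step 1 analogue: a user-side functor computing a sum of squares per group.
struct sum_of_squares : host_udf_like {
  std::vector<double> operator()(std::vector<double> const& values,
                                 std::vector<std::size_t> const& group_labels,
                                 std::size_t num_groups) const override
  {
    assert(values.size() == group_labels.size());
    std::vector<double> out(num_groups, 0.0);
    for (std::size_t i = 0; i < values.size(); ++i) {
      assert(group_labels[i] < num_groups);
      out[group_labels[i]] += values[i] * values[i];
    }
    return out;
  }
};

// Step 3 analogue: the framework side simply dispatches to the user code.
std::vector<double> run_groupby(host_udf_aggregation const& agg,
                                std::vector<double> const& values,
                                std::vector<std::size_t> const& group_labels,
                                std::size_t num_groups)
{
  return agg.udf()(values, group_labels, num_groups);
}

}  // namespace sketch
```

In this toy version, `run_groupby` never needs to know which concrete functor it holds, which is the property the issue asks for: the application can add new aggregations without changes to the framework.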

ttnghia added the feature request label Aug 21, 2024

ttnghia mentioned this issue Aug 21, 2024

[FEA] Add min_by aggregate support #16139

Closed

thirtiseven mentioned this issue Aug 30, 2024

[FEA][Follow on] Improve performance of min_by and max_by NVIDIA/spark-rapids#11412

Open

ttnghia linked a pull request Nov 5, 2024 that will close this issue

Implement HOST_UDF aggregation for reduction and groupby #17249

Draft

ttnghia mentioned this issue Dec 13, 2024

Implement HOST_UDF aggregation for groupby #17592

Merged

ttnghia self-assigned this Dec 13, 2024

ttnghia linked a pull request Dec 20, 2024 that will close this issue

Implement HOST_UDF aggregation for reduction and segmented reduction #17645

Open
