Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] Simplify type-dispatch for code that only cares about the storage type #7390

Closed
jrhemstad opened this issue Feb 16, 2021 · 0 comments · Fixed by #7419
Closed

[FEA] Simplify type-dispatch for code that only cares about the storage type #7390

jrhemstad opened this issue Feb 16, 2021 · 0 comments · Fixed by #7419
Assignees
Labels
feature request New feature or request improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code.

Comments

@jrhemstad
Copy link
Contributor

jrhemstad commented Feb 16, 2021

Is your feature request related to a problem? Please describe.

There are many functions in libcudf that don't care about the actual Element type in a column (e.g., just moving data). This simplifies implementations of code paths for things like DECIMAL32/64 where we can just use the same path as INT32/64. The device_storage_type_t trait provides a convenient way to access the representation of an elements value. In type dispatched code, that looks something like:

struct dispatch{
template <typename Element>
void operator()(...){
   // Resolves to int32/64 for fixedpoint 32/64
   using storage_type = device_storage_type_t<Element>;
   
   // Do work with `storage_type`
}
};

This is pretty nice, but it would be even nicer if the type_dispatcher could automatically just dispatch the "storage" type. Then we wouldn't have to worry about adding the using storage_type... code everywhere.

Describe the solution you'd like

The type_dispatcher was designed to support exactly this kind of use case. A heretofore unused feature of the type_dispatcher is that it allows you to customize the type dispatched for a given type_id enum value. This is done via a IdTypeMap template parameter that "maps" a cudf::type_id to the type to dispatch. It is defaulted to do the expected dispatch like INT32 -> int32_t, but it can be overriden via an explicit template argument.

This makes it very easy to create a "device storage type" dispatcher:

template <cudf::type_id Id>
struct dispatch_storage_type{
   using type = device_storage_type_t< typename id_to_type_impl<Id>::type >;
};

type_dispatcher<dispatch_storage_type>(...);

This will now automatically unwrap any wrapper types.

As a side benefit, I believe using this form of the type_dispatcher could reduce compile time and binary size as it would reduce the number of unique instantiations of the function objects operator() template. This would be especially true if we made device_storage_type_t unwrap timestamp types to their underlying representation.

@jrhemstad jrhemstad added feature request New feature or request improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. labels Feb 16, 2021
@codereport codereport self-assigned this Feb 22, 2021
rapids-bot bot pushed a commit that referenced this issue Feb 23, 2021
Resolves #7390

Compile times:
```
// Before
real	33m29.842s
user	300m0.478s
sys	10m46.871s

// After
real	33m20.127s
user	299m24.825s
sys	10m35.779s
```

Binary sizes:
```
Before: -rwxr-xr-x  1 rapids rapids 328M Feb 22 15:10 libcudf_base.so
After:  -rwxr-xr-x  1 rapids rapids 327M Feb 23 07:49 libcudf_base.so
```

Authors:
  - Conor Hoekstra (@codereport)

Approvers:
  - David (@davidwendt)
  - Jake Hemstad (@jrhemstad)

URL: #7419
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants