[FEA] Fixed-point Decimal type support #3556
Comments
Summarising some of the design choices that were made offline with @jrhemstad and @harrism:
Updated class might look something like:
using scale_type = int32_t;
// Rep = representation type
template <typename Rep, typename Radix>
class scaled_integer {
 public:
  template <typename T,
            typename std::enable_if_t<std::is_same<T, float>::value>* = nullptr>
  scaled_integer(T value, scale_type scale) {
    // implementation for float
  }
  template <typename T,
            typename std::enable_if_t<std::is_same<T, int>::value>* = nullptr>
  scaled_integer(T value, scale_type scale) {
    // implementation for int
  }
};
and then:
using decimal32 = scaled_integer<int32_t, 10>;
Do you need the SFINAE shown here, or can you just specialize the template constructors (since the specializations are for specific single types and they are not partial specializations)?
Yeah, if you're only interested in
scaled_integer(int value, scale_type scale) {
  // implementation for int
}
scaled_integer(float value, scale_type scale) {
  // implementation for float
}
That would require 6 overloads (int8, int16, int32, int64, float32, float64). I'd do SFINAE on traits like
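The comment above is cut off before it names the traits, so the following is only a minimal sketch of the idea, assuming `std::is_integral` and `std::is_floating_point` are the traits in question. It replaces six per-type overloads with two constrained constructor templates. The class body, the non-type `Radix` parameter, the `rep()` accessor, and the `decimal32_sketch` alias are hypothetical and are not taken from the thread.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <type_traits>

using scale_type = int32_t;

// Hypothetical sketch: the stored value represents rep_ * Radix^scale_.
template <typename Rep, int Radix>
class scaled_integer {
 public:
  // Enabled for every integral T (int8_t, int16_t, int32_t, int64_t, ...).
  template <typename T, std::enable_if_t<std::is_integral<T>::value>* = nullptr>
  scaled_integer(T value, scale_type scale)
    : rep_{scale < 0 ? static_cast<Rep>(value * ipow(-scale))
                     : static_cast<Rep>(value / ipow(scale))},
      scale_{scale} {}

  // Enabled for every floating-point T (float, double); truncates toward zero.
  template <typename T, std::enable_if_t<std::is_floating_point<T>::value>* = nullptr>
  scaled_integer(T value, scale_type scale)
    : rep_{static_cast<Rep>(value * std::pow(static_cast<double>(Radix), -scale))},
      scale_{scale} {}

  Rep rep() const { return rep_; }
  scale_type scale() const { return scale_; }

 private:
  // Integer power Radix^exponent for a non-negative exponent (no overflow check).
  static Rep ipow(scale_type exponent) {
    assert(exponent >= 0);
    Rep result = 1;
    for (scale_type i = 0; i < exponent; ++i) { result *= Radix; }
    return result;
  }

  Rep rep_;
  scale_type scale_;
};

using decimal32_sketch = scaled_integer<int32_t, 10>;
```

With this shape, `decimal32_sketch(int64_t{12}, -2)` and `decimal32_sketch(1.5f, -2)` each pick the matching constructor without a dedicated overload per type. Note that `bool` and the character types also satisfy `std::is_integral`, so a real implementation might want a narrower trait.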
This PR resolves a part of #3556.
Supporting `cudf::reduce`:
1. Part 1 (`MIN`, `MAX`, `SUM`, `PRODUCT` & `NUNIQUE`) #6814
2. Part 2 (the rest) ◀️
**Reduction Ops:**
**Done in Previous PR**
✔️ `SUM, ///< sum reduction`
✔️ `PRODUCT, ///< product reduction`
✔️ `MIN, ///< min reduction`
✔️ `MAX, ///< max reduction`
✔️ `NUNIQUE, ///< count number of unique elements`
**Not supported by `cudf::reduce`:**
* [x] `COUNT_VALID, ///< count number of valid elements`
* [x] `COUNT_ALL, ///< count number of elements`
* [x] `COLLECT, ///< collect values into a list`
* [x] `LEAD, ///< window function, accesses row at specified offset following current row`
* [x] `LAG, ///< window function, accesses row at specified offset preceding current row`
* [x] `PTX, ///< PTX UDF based reduction`
* [x] `CUDA ///< CUDA UDF based reduction`
* [x] `ARGMAX, ///< Index of max element`
* [x] `ARGMIN, ///< Index of min element`
* [x] `ROW_NUMBER, ///< get row-number of element`
**Won't be supported:**
* [x] `ANY, ///< any reduction`
* [x] `ALL, ///< all reduction`
**To Do / Investigate:**
* [x] `SUM_OF_SQUARES, ///< sum of squares reduction`
* [x] `MEDIAN, ///< median reduction`
* [x] `QUANTILE, ///< compute specified quantile(s)`
* [x] `NTH_ELEMENT, ///< get the nth element`
**Deferred until requested**
* [x] `MEAN, ///< arithmetic mean reduction`
* [x] `VARIANCE, ///< groupwise variance`
* [x] `STD, ///< groupwise standard deviation`
Authors:
- Conor Hoekstra
Approvers:
- null
- Karthikeyan
- David
URL: #6980
This PR resolves a part of #3556.
Aggregation ops supported:
* `MIN`
* `MAX`
* `COUNT` (both `null_policy` - `EX/INCLUDE`)
* `LEAD`
* `LAG`
**To Do List:**
* [x] Basic unit tests
* [x] Comprehensive unit tests
* [x] Implementation
* [x] Figure out which rolling ops to support
Authors:
- Conor Hoekstra
Approvers:
- Vukasin Milovanovic
- Ram (Ramakrishna Prabhu)
URL: #7037
Adding support for `cudf::scan` for `decimal32` and `decimal64`. `cudf::scan` only supports 4 operations (sum, product, min and max) but the decimal types will only support `SUM`, `MAX` and `MIN`. This PR resolves a part of #3556.
Authors:
- Conor Hoekstra
Approvers:
- Jake Hemstad
- Mark Harris
URL: #7063
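To illustrate what a sum or max scan over a decimal column involves, here is a host-side sketch. It is not the libcudf implementation or API: `decimal32_column_sketch`, `scan_sum`, `scan_max`, and `scan_example` are hypothetical names, and the sketch assumes a column stores plain `int32_t` representation values that share one scale. Under that assumption, sum, min, and max scans operate on the representation values directly and leave the scale unchanged. A running product would not have this property, because multiplying fixed-point values adds their scales.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical host-side stand-in for a decimal32 column:
// element i represents reps[i] * 10^scale.
struct decimal32_column_sketch {
  std::vector<int32_t> reps;
  int32_t scale;
};

// Inclusive sum scan: the output keeps the input scale.
inline decimal32_column_sketch scan_sum(decimal32_column_sketch const& in) {
  decimal32_column_sketch out{std::vector<int32_t>(in.reps.size()), in.scale};
  std::partial_sum(in.reps.begin(), in.reps.end(), out.reps.begin());
  return out;
}

// Inclusive max scan: comparing representation values is valid
// because every element shares the same scale.
inline decimal32_column_sketch scan_max(decimal32_column_sketch const& in) {
  decimal32_column_sketch out{std::vector<int32_t>(in.reps.size()), in.scale};
  std::partial_sum(in.reps.begin(), in.reps.end(), out.reps.begin(),
                   [](int32_t a, int32_t b) { return std::max(a, b); });
  return out;
}

// Usage: the column {1.00, 2.50, 0.50} stored with scale -2.
inline void scan_example() {
  decimal32_column_sketch const col{{100, 250, 50}, -2};
  auto const sums = scan_sum(col);  // running sums 1.00, 3.50, 4.00
  assert(sums.scale == -2);
  assert(sums.reps.back() == 400);
  auto const maxes = scan_max(col);  // running maxima 1.00, 2.50, 2.50
  assert(maxes.reps.back() == 250);
}
```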
This PR resolves a part of #3556. I decided to push the changes for sort `cudf::group_by` and hash `group_by` in different PRs.
Authors:
- Conor Hoekstra (@codereport)
Approvers:
- Ram (Ramakrishna Prabhu) (@rgsl888prabhu)
- Karthikeyan (@karthikeyann)
URL: #7169
Follow up PR to #7169. This PR resolves a part of #3556.
Authors:
- Conor Hoekstra (@codereport)
Approvers:
- David (@davidwendt)
- Jake Hemstad (@jrhemstad)
- Devavret Makkar (@devavret)
URL: #7190
Looks like the only thing missing here is the
Closing as I've opened #8161 for tracking
Fixes #9597
Fixes #9565
Previously, `fixed_point` along with `decimal32` and `decimal64` were added to support DecimalType (see #3556 for a list of major and minor PRs). With [support for `__int128_t` now in CUDA 11.5](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cuda-general-new-features), we can support `decimal128`. This PR enables `decimal128`.
Authors:
- Conor Hoekstra (https://github.com/codereport)
Approvers:
- Robert (Bobby) Evans (https://github.com/revans2)
- Mark Harris (https://github.com/harrism)
- AJ Schmidt (https://github.com/ajschmidt8)
- Jake Hemstad (https://github.com/jrhemstad)
Is your feature request related to a problem? Please describe.
cuDF should support fixed-point decimal types.
Describe the solution you'd like
Add a new `DECIMAL` type for `column`s. The scale information will be stored per-column in the `data_type`.
Also requires a new `scaled_integer` type for encapsulating an integral value and a scale factor that provides arithmetic operators to allow operating on the fixed point values.
While Arrow only supports 128 bit fixed-point decimals, this design will allow us to have 32bit, 64bit, etc.
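As a rough illustration of the type described above, the sketch below pairs an integral representation with a scale that is an ordinary runtime value, and defines addition, multiplication, and equality on it. It is an assumption-laden sketch rather than the design that was eventually merged: `decimal_sketch`, `ipow10`, and `rescale` are hypothetical names, the radix is fixed at 10, and overflow is not handled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using scale_type = int32_t;

// Hypothetical base-10 fixed-point value: represents rep * 10^scale.
// The scale is a runtime member, not a template parameter.
template <typename Rep>
struct decimal_sketch {
  Rep rep;
  scale_type scale;
};

// 10^exponent for a non-negative exponent (no overflow check).
template <typename Rep>
Rep ipow10(scale_type exponent) {
  assert(exponent >= 0);
  Rep result = 1;
  for (scale_type i = 0; i < exponent; ++i) { result *= 10; }
  return result;
}

// Re-express x at a finer (smaller or equal) scale; this direction is exact.
template <typename Rep>
decimal_sketch<Rep> rescale(decimal_sketch<Rep> x, scale_type new_scale) {
  assert(new_scale <= x.scale);
  return {static_cast<Rep>(x.rep * ipow10<Rep>(x.scale - new_scale)), new_scale};
}

// Addition: bring both operands to the finer scale, then add the reps.
template <typename Rep>
decimal_sketch<Rep> operator+(decimal_sketch<Rep> a, decimal_sketch<Rep> b) {
  scale_type const s = std::min(a.scale, b.scale);
  return {static_cast<Rep>(rescale(a, s).rep + rescale(b, s).rep), s};
}

// Multiplication: multiply the reps and add the scales.
template <typename Rep>
decimal_sketch<Rep> operator*(decimal_sketch<Rep> a, decimal_sketch<Rep> b) {
  return {static_cast<Rep>(a.rep * b.rep), static_cast<scale_type>(a.scale + b.scale)};
}

// Equality: compare the reps at a common scale.
template <typename Rep>
bool operator==(decimal_sketch<Rep> a, decimal_sketch<Rep> b) {
  scale_type const s = std::min(a.scale, b.scale);
  return rescale(a, s).rep == rescale(b, s).rep;
}
```

For example, `decimal_sketch<int32_t>{150, -2}` (1.50) plus `decimal_sketch<int32_t>{25, -1}` (2.5) gives a rep of 400 at scale -2 (4.00), and their product gives a rep of 3750 at scale -3 (3.750). Because `Rep` is a template parameter, the same code covers 32-bit and 64-bit representations.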
Describe alternatives you've considered
The `scaled_integer` type from CNL won't work because it requires the scale information to be compile time info.
Additional context
In order to support Decimal types with 128 bits of precision (like Arrow), we'll also need a 128 bit integer type.
List of PR Breakdowns:
Major PRs:
* `fixed_point` type ([REVIEW] Add `fixed_point` class to support DecimalType #3782)
* `fixed_point` Column Support ([REVIEW] Initial `fixed_point` Column Support #5704)
  * `sort` / `gather`
  * `scatter`
  * `replace`
  * `upper_bound` / `lower_bound`
  * `interleave`
  * `concatenate`
  * `decimal32` and `decimal64` to relevant type lists
* `fixed_point::scale` to `data_type` ([REVIEW] `fixed_point` Column Optimization (store `scale` in `data_type`) [skip ci] #5861)
  * `scale` into `data_type`
  * `fixed_point` type (i.e. sort)
Minor PRs:
* `FixedWidthTypes` (instead of `FixedWidthTypesWithoutFixedPoint`) ([REVIEW] Enable more `fixed_point` unit tests by introducing "scale-less" constructor [skip ci] #5817)
* `fixed_point` binary operations ([REVIEW] Enable `fixed_point` binary operations #6528)
* `cudf::binary_operation` `NULL_MIN`, `NULL_MAX` & `NULL_EQUALS` for `decimal32` and `decimal64` (Add `cudf::binary_operation` `NULL_MIN`, `NULL_MAX` & `NULL_EQUALS` for `decimal32` and `decimal64` #7119)
* `cudf::round` `decimal32` & `decimal64` (`HALF_UP` and `HALF_EVEN`) ([REVIEW] Implement `cudf::round` `decimal32` & `decimal64` (`HALF_UP` and `HALF_EVEN`) #6685)
* `cudf::cast` for `decimal32/64` to/from integer and floating point (Implement `cudf::cast` for `decimal32/64` to/from integer and floating point #6711)
* `cudf::cast` for `decimal32/64` to/from different scale (Implement `cudf::cast` for `decimal32/64` to/from different `type_id` #6729)
* `cudf::unary_operation` for `decimal32` & `decimal64` (#6777)
* `TRUE_DIV` (Add support for `cudf::binary_operation` `TRUE_DIV` for `decimal32` and `decimal64` #7198)
* `(P)MOD` (Support for `MOD`, `PMOD` and `PYMOD` for `decimal32/64/128` #10179)
* `cudf::reduce`: `MIN`, `MAX`, `SUM`, `PRODUCT`, `NUNIQUE` (Implement `cudf::reduce` for `decimal32` and `decimal64` (part 1) #6814)
* `cudf::reduce`: others (Implement `cudf::reduce` for `decimal32` and `decimal64` (part 2) #6980)
* `cudf::scan` (`cudf::scan` support for `decimal32` and `decimal64` #7063)
* `cudf::groupby`
  * `cudf::group_by` (sort) for `decimal32` and `decimal64` (Implement `cudf::group_by` (sort) for `decimal32` and `decimal64` #7169)
  * `cudf::group_by` (hash) for `decimal32` and `decimal64` (Implement `cudf::group_by` (hash) for `decimal32` and `decimal64` #7190)
* `cudf::rolling_window` (Implement `cudf::rolling` for `decimal32` and `decimal64` #7037)
* `cudf::rolling_window` `ROW_NUMBER` support for `decimal32` and `decimal64` (`cudf::rolling` `ROW_NUMBER` support for `decimal32` and `decimal64` #7061)
* `cudf::rolling_window` `SUM` support for `decimal32` and `decimal64` (`cudf::rolling_window` `SUM` support for `decimal32` and `decimal64` #7147)
* `cudf::grouped_rolling_window`
* `cudf::copy_range` algorithm (Implement `cudf::copy_range` for `decimal32` and `decimal64` #6843)
* `cudf::clamp` algorithm (Implement `cudf::clamp` for `decimal32` and `decimal64` #6792)
* `cudf::detail::copy_if` / `cudf::apply_boolean_mask` ([REVIEW] Enable copy_if for fixed-point decimal columns #6805)
* `cudf::copy_if_else` algorithm (Implement `cudf::copy_if_else` for `decimal32` and `decimal64` #6845)
* `scale` and `value` methods to `fixed_point` (Add `scale` and `value` methods to `fixed_point` #7109)
* `fixed_point_column_wrapper` constructors (Add null mask `fixed_point_column_wrapper` constructors #6951)
* `cudf::round` for `fixed_point` when `scale = -decimal_places` a no-op (Make `cudf::round` for `fixed_point` when `scale = -decimal_places` a no-op #6975)
* `fixed_point` with extremely large `scale`s (Adding unit tests for `fixed_point` with extremely large `scale`s #7178)