Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Column type for fixed_width_column_wrapper should be restricted. #16092

Closed
pmattione-nvidia opened this issue Jun 25, 2024 · 8 comments · Fixed by #16120
Closed

[BUG] Column type for fixed_width_column_wrapper should be restricted. #16092

pmattione-nvidia opened this issue Jun 25, 2024 · 8 comments · Fixed by #16120
Assignees
Labels
bug Something isn't working

Comments

@pmattione-nvidia
Copy link
Contributor

Describe the bug
fixed_width_column_wrapper doesn't restrict the types allowed for the column. This can lead to unexpected behavior when a user chooses to use (e.g.) decimal32 for the "ElementTo" type, and the scale information is not embedded in the type. The desired behavior is that the user uses fixed_point_column_wrapper to wrap decimal data, so we should prevent the usage of fixed_width_column_wrapper for fixed-point.

Steps/Code to reproduce bug
cudf::test::fixed_width_column_wrapper<numeric::decimal128> decimal128_col({numeric::decimal128{1.0, numeric::scale_type{-1}}});

@pmattione-nvidia pmattione-nvidia added the bug Something isn't working label Jun 25, 2024
@davidwendt
Copy link
Contributor

I believe the scale is zero if you use a fixed-point type with fixed_width_column_wrapper
What is the unexpected behavior?

@davidwendt
Copy link
Contributor

Fixed-point types are considered fixed-width

template <typename T>
constexpr inline bool is_fixed_width()
{
// TODO Add fixed width wrapper types
// Is a category fixed width?
return cudf::is_numeric<T>() || cudf::is_chrono<T>() || cudf::is_fixed_point<T>();
}

Also, there are existing tests that take advantage of this. Here are a few:

cudf::test::fixed_width_column_wrapper<numeric::decimal128> col6 = [&ones_iterator] {

cudf::test::fixed_width_column_wrapper<numeric::decimal128> col7 = [&ones] {

cudf::test::fixed_width_column_wrapper<numeric::decimal128> col6(col6_data, col6_data + num_rows);

As well as other tests that use the FixedWidthTypes type-list for building tests over all fixed-width types.

using FixedWidthTypes = Concat<NumericTypes, ChronoTypes, FixedPointTypes>;

@pmattione-nvidia
Copy link
Contributor Author

The problem is the scale being zero, even if you construct it with fixed_point data with a different scale factor. E.g. in the example I posted, you are storing a decimal with scale -1 in a column with scale zero. Then whenever you do anything with the column it behaves unexpectedly. E.g in 24.04, this comparison fails, because the scale factor is effectively lost:

cudf::test::fixed_width_column_wrapper<double> double_col({1.0});
  cudf::test::fixed_width_column_wrapper<numeric::decimal128> decimal128_col({numeric::decimal128{1.0, numeric::scale_type{-1}}});
  
  auto result = binary_operation(
    double_col,
    decimal128_col,
    cudf::binary_operator::EQUAL,
    cudf::data_type(cudf::type_id::BOOL8));

  auto expected = column_wrapper<bool>{true};
  CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, *result.get());

@pmattione-nvidia
Copy link
Contributor Author

pmattione-nvidia commented Jun 26, 2024

And it's even more dangerous, as you can have the column wrapper store a list of decimals that all have different scale factors. Then it's really not clear what the behavior even should be. The scale factor needs to solely be a property of the column, and not be allowed for the data stored within it. We should just have people use fixed_point_column_wrapper instead.

@davidwendt davidwendt self-assigned this Jun 27, 2024
@davidwendt
Copy link
Contributor

Ok, that makes sense. The convenience of the FixedWidthTypes makes supporting fixed-point types in the fixed_width_column_wrapper compelling however.

I did some investigation and found there are several places in our gtests that are actually using this incorrectly as described in this issue so it is certainly worth fixing. They currently are not failing because they are not checking the scale-type value.

I think we can continue to support this feature but only allow scale=0 since there is no way to set it in the fixed_width_column_wrapper. I'll open a PR that checks for invalid usage and fixes the incorrect ones we currently have.

@pmattione-nvidia
Copy link
Contributor Author

If someone creates a fixed_width_column_wrapper of decimal, is there a way that we can prevent a user from accidentally changing the value of the scales of the decimals stored within it?

@davidwendt
Copy link
Contributor

If someone creates a fixed_width_column_wrapper of decimal, is there a way that we can prevent a user from accidentally changing the value of the scales of the decimals stored within it?

I'm not sure I'm following. The values are copied from host to device within the wrapper ctor so changing the original host values will not change the copied device values. Maybe I need to see an example of what you are asking.

@pmattione-nvidia
Copy link
Contributor Author

Ah I see, so maybe it isn't possible then.

rapids-bot bot pushed a commit that referenced this issue Jul 11, 2024
…16120)

The `cudf::test::fixed_width_column_wrapper` supports all fixed-width type including fixed-point types. However, there is no mechanism to specify the fixed-point scale value which is common for the entire column and stored in the column's type.
This fixes the case by throwing an error if a non-zero scale is specified for the input values in a fixed-point `fixed_width_column_wrapper` instance. 

Also fixed several tests that incorrectly specified a non-zero scale. 

Closes #16092

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Paul Mattione (https://github.com/pmattione-nvidia)

URL: #16120
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants