Fix floating point data generation in benchmarks (#10372)
`numeric_limits::lowest` and `numeric_limits::max` are used as bounds for numeric type generation. However, for normal generators, bounds are shifted to `[0, upper_bound - lower_bound]`, and the random value is shifted back by `lower_bound`.
With `lowest` and `max`, `upper_bound - lower_bound` is out of range for floating point types, so the generated values are `nan` and `inf`.
This PR halves the ranges so that `upper_bound - lower_bound` is still within the type range.

Expected to affect benchmarks that use floating point columns (e.g. Parquet reader benchmarks).

Authors:
  - Vukasin Milovanovic (https://github.com/vuule)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - https://github.com/nvdbaranec

URL: #10372
vuule authored Mar 7, 2022
1 parent 4f8c60a commit 7d67093
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion cpp/benchmarks/common/generate_input.hpp
@@ -114,7 +114,8 @@ std::pair<int64_t, int64_t> default_range()
 template <typename T, std::enable_if_t<cudf::is_numeric<T>()>* = nullptr>
 std::pair<T, T> default_range()
 {
-  return {std::numeric_limits<T>::lowest(), std::numeric_limits<T>::max()};
+  // Limits need to be such that `upper - lower` does not overflow
+  return {std::numeric_limits<T>::lowest() / 2, std::numeric_limits<T>::max() / 2};
 }
 }  // namespace
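
The overflow described in the commit message can be reproduced in isolation. The sketch below is not the cudf benchmark code: `full_range`, `halved_range` and `range_width` are hypothetical helpers introduced only for illustration, and it assumes IEEE 754 floating point with default (non-fast-math) compiler settings. It compares the width of the shifted interval `[0, upper - lower]` for the old and new bounds.

```cpp
#include <cmath>
#include <limits>
#include <utility>

// Hypothetical stand-in for the old default bounds: [lowest, max].
template <typename T>
std::pair<T, T> full_range()
{
  return {std::numeric_limits<T>::lowest(), std::numeric_limits<T>::max()};
}

// Hypothetical stand-in for the new default bounds: [lowest / 2, max / 2].
template <typename T>
std::pair<T, T> halved_range()
{
  return {std::numeric_limits<T>::lowest() / 2, std::numeric_limits<T>::max() / 2};
}

// Width of the shifted interval [0, upper - lower] that a generator
// would sample from before shifting the value back by `lower`.
template <typename T>
T range_width(std::pair<T, T> bounds)
{
  return bounds.second - bounds.first;
}
```

With the full range, `range_width(full_range<float>())` is `max - lowest`, roughly twice `max`, which is not representable and becomes `inf`; further arithmetic on that width is a plausible source of the `nan` values the commit mentions. With the halved range the width equals `max`, which is finite, so `std::isfinite(range_width(halved_range<float>()))` holds.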
