[BUG] strings::from_timestamp can overflow on valid timestamps #9790
revans2 added the bug (Something isn't working) and Needs Triage (Need team to review and classify) labels on Nov 29, 2021
diff --git a/cpp/src/strings/convert/convert_datetime.cu b/cpp/src/strings/convert/convert_datetime.cu
index 51a6a796ba..8d0c5704a7 100644
--- a/cpp/src/strings/convert/convert_datetime.cu
+++ b/cpp/src/strings/convert/convert_datetime.cu
@@ -707,9 +707,9 @@ struct from_timestamp_base {
* scale( 61,60) -> 1
* @endcode
*/
- __device__ int32_t scale_time(int64_t time, int64_t base) const
+ __device__ int64_t scale_time(int64_t time, int64_t base) const
{
- return static_cast<int32_t>((time - ((time < 0) * (base - 1L))) / base);
+ return (time - ((time < 0) * (base - 1L))) / base;
};
__device__ time_components get_time_components(int64_t tstamp) const

Appears to fix the issue. I'll try to turn it into a PR with some tests.
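For readers who want to try the change outside of CUDA, here is a minimal host-side sketch of the two variants in the diff. The names `scale_time_narrow` and `scale_time_wide` are made up for illustration and are not part of libcudf; the `__device__` qualifier is dropped so the functions compile with a plain C++ compiler. The arithmetic is copied from the diff.

```cpp
#include <cstdint>

// Hypothetical host-side port of the pre-patch function: the quotient is
// narrowed to int32_t, which cannot represent values above 2147483647.
inline int32_t scale_time_narrow(int64_t time, int64_t base)
{
  return static_cast<int32_t>((time - ((time < 0) * (base - 1L))) / base);
}

// Hypothetical host-side port of the patched function: the quotient stays
// in int64_t, so large minute counts are returned unchanged.
// For negative inputs the (base - 1) adjustment rounds toward negative
// infinity instead of toward zero.
inline int64_t scale_time_wide(int64_t time, int64_t base)
{
  return (time - ((time < 0) * (base - 1L))) / base;
}
```

Calling `scale_time_wide(128849018880LL, 60)` yields 2147483648, while the narrowed variant cannot return that value because it does not fit in an `int32_t`.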
@davidwendt sorry, feel free to take this if you want.
@revans2 The change you proposed here looks good. I thought it might be a more involved change, but this looks correct. Feel free to post a PR with this change.
davidwendt added the libcudf (Affects libcudf (C++/CUDA) code.) and strings (strings issues (C++ and Python)) labels on Nov 29, 2021
Will do.
rapids-bot bot pushed a commit that referenced this issue on Dec 1, 2021
This fixes #9790

When converting a timestamp to a string, it is possible for the %M (minute) calculation to overflow an int32_t partway through casting. This moves that result to an int64_t, which avoids the overflow.

Authors:
- Robert (Bobby) Evans (https://github.com/revans2)

Approvers:
- Bradley Dice (https://github.com/bdice)
- David Wendt (https://github.com/davidwendt)

URL: #9793
Describe the bug
When I try to convert some microsecond values to strings, scale_time appears to overflow and I get the wrong answer out.

Steps/Code to reproduce bug
This is Spark code, but I will try to get a repro case in C++ shortly.
Essentially, the problem shows up when we get to 128849018880000000, which is the microsecond timestamp value. get_time_components is called to find the hour, min, sec, and sub-sec parts of the timestamp. If the timestamp is in microseconds, then the microsecond part is calculated and the timestamp is converted to seconds: 128849018880000000L / 1000000 => 128849018880.
Then scale_time(tstamp, 60) is called to convert the timestamp into minutes instead of seconds. But 128849018880 / 60 is 2147483648, and scale_time returns an int32_t, so it overflows (Int.MaxValue is 2147483647).

Expected behavior
We get the correct answer back.
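The arithmetic in the reproduction steps can be checked at compile time. This is only a sketch of the numbers quoted in the report, not libcudf code; the constant names are made up for illustration.

```cpp
#include <cstdint>
#include <limits>

// Timestamp from the report, in microseconds (hypothetical constant names).
constexpr int64_t kMicros  = 128849018880000000LL;
constexpr int64_t kSeconds = kMicros / 1000000;  // microseconds -> seconds
constexpr int64_t kMinutes = kSeconds / 60;      // seconds -> minutes

static_assert(kSeconds == 128849018880LL, "seconds value from the report");
static_assert(kMinutes == 2147483648LL, "minutes value from the report");

// The minute count is exactly one past the largest int32_t (2147483647),
// so it cannot be returned through an int32_t without changing value.
static_assert(kMinutes == static_cast<int64_t>(std::numeric_limits<int32_t>::max()) + 1,
              "minute count does not fit in int32_t");
```

If the file compiles, all three checks hold, which confirms that the minute count needs a 64-bit return type.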