Assorted UBSAN cleanups #55112
@@ -624,9 +664,12 @@ cdef int64_t convert_reso(
    else:
        # e.g. ns -> us, risk of overflow, but no risk of lossy rounding
        mult = get_conversion_factor(from_reso, to_reso)
        with cython.overflowcheck(True):
is there a cython bug?
I don't think it's a bug per se. I think Cython lets the overflow happen but then adds checks after the fact to see if it overflowed. This by contrast prevents the overflow from happening in the first place. It generally gets you to the same place in the end.
Looks like Cython generates something like this:
static CYTHON_INLINE int __Pyx_mul_const_int_checking_overflow(int a, int b, int *overflow) {
    if (b > 1) {
        *overflow |= a > __PYX_MAX(int) / b;
        *overflow |= a < __PYX_MIN(int) / b;
    } else if (b == -1) {
        *overflow |= a == __PYX_MIN(int);
    } else if (b < -1) {
        *overflow |= a > __PYX_MIN(int) / b;
        *overflow |= a < __PYX_MAX(int) / b;
    }
    return a * b;
}
We aren't handling a negative denominator, but otherwise yea the difference is Cython still does the multiplication and just sets an overflow
variable if something overflows; we are not doing the multiplication at all in this
for my edification, this pattern is considered Better Practice than the one cython uses?
I think the only difference here is that this will make the sanitizer happy whereas the cython approach will not
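To make the distinction in this thread concrete, here is a minimal Python sketch of the check-before-multiply pattern. Python integers never overflow, so this only illustrates the logic; the function name `checked_mul_int64` is hypothetical and not part of pandas or Cython, and the sketch assumes a positive factor.

```python
INT64_MAX = 2**63 - 1


def checked_mul_int64(value: int, factor: int) -> int:
    """Return factor * value, refusing up front if it could leave int64 range.

    Hypothetical helper for illustration only; assumes factor > 0.
    """
    if factor <= 0:
        raise ValueError("this sketch only handles a positive factor")
    # Largest magnitude that can be multiplied by factor without overflow.
    overflow_limit = INT64_MAX // factor
    if value > overflow_limit or value < -overflow_limit:
        # In C or Cython the multiplication is never executed on this path,
        # so the sanitizer never observes a signed overflow.
        raise OverflowError("result would overflow")
    return factor * value
```

The Cython-generated helper quoted above computes the same kind of bound but still returns `a * b`, which is the operation UBSAN reports.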
pandas/_libs/tslibs/np_datetime.pyx
Outdated
    if value > overflow_limit or value < -overflow_limit:
        raise OverflowError("result would overflow")

    # Note: caller is responsible for re-raising as OutOfBoundsTimedelta
i think this comment should go up a line with the OverflowError?
pandas/_libs/tslibs/np_datetime.pyx
Outdated
        overflow_limit = INT64_MAX // 7
        if value > overflow_limit or value < -overflow_limit:
            raise OverflowError("result would overflow")
        return 7 * value
can we de-dup some of this with e.g.
if ...
    value = get_conversion_factor(...
    factor = 7
elif ...
    value = get_conversion_factor(...
    factor = 24
...
overflow_limit = INT64_MAX // factor
if value > ...
    raise OverflowError(...)
return factor * value
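A rough Python sketch of how that de-duplicated shape might look, with the overflow check written once after the per-unit branch. The unit names, the `_STEP` table, and the recursive structure are assumptions made for illustration; only the factors 7 and 24 and the check itself come from the snippets in this review.

```python
INT64_MAX = 2**63 - 1

# Hypothetical unit ladder: unit -> (next finer unit, multiplier to reach it).
_STEP = {"W": ("D", 7), "D": ("h", 24), "h": ("m", 60), "m": ("s", 60)}


def conversion_factor(from_unit: str, to_unit: str) -> int:
    """Number of to_unit ticks in one from_unit tick (illustrative only)."""
    if from_unit == to_unit:
        return 1
    if from_unit not in _STEP:
        raise ValueError("from_unit must not be finer than to_unit")
    # Each branch only chooses the next unit and its factor ...
    next_unit, factor = _STEP[from_unit]
    value = conversion_factor(next_unit, to_unit)
    # ... and the overflow check lives in exactly one place.
    overflow_limit = INT64_MAX // factor
    if value > overflow_limit or value < -overflow_limit:
        raise OverflowError("result would overflow")
    return factor * value
```

With this layout, adding a unit means adding one table entry rather than another copy of the limit check.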
Yea no problem - great idea
tests?
UBSAN is a runtime check, so all of these were hit from values in the existing test suite. Nothing needs to be added
Thanks @WillAyd
* first round of fixes
* fix up includes
* updates
* dedup logic
* move comment
Found the first few from running the io test suite. Will be stuck at #55111 which requires closer attention than I wanted to tackle here