-
-
Notifications
You must be signed in to change notification settings - Fork 31.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-39689: _struct: Avoid undefined behavior when loading native _Bool #18925
Conversation
To avoid undefined behavior in the C code, we assume that in every mode, the size is 1, '\x00' is false, and `\x01` is true.
case '?': UNPACK_SINGLE(ld, ptr, _Bool); goto convert_bool; | ||
// memcpy-ing values other than 0 or 1 to a _Bool variable triggers | ||
// undefined behavior, so cast from char instead. See bpo-39689. | ||
case '?': ld = (_Bool)*ptr; goto convert_bool; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory this can trigger an unaligned access if sizeof(_Bool) > 1
. Unaligned accesses that result in bus errors are the main reason for the memcpy()
.
Also I wonder if UB checkers can't get smart enough to figure out that e.g. (_Bool)*ptr == 42
and then complain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory this can trigger an unaligned access if sizeof(_Bool) > 1. Unaligned accesses that result in bus errors are the main reason for the memcpy().
Ah! You are right, thank you. This PR won't work as is. I won't have time to work on this soon, so I'll close it in the mean time.
Also I wonder if UB checkers can't get smart enough to figure out that e.g. (_Bool)*ptr == 42 and then complain.
I don't have access to the C standard, but from what I've read, casting to _Bool will convert non-zero values to 1. So this should be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this should be fine.
Yes, I misread the construct as *(_Bool *)ptr)
, I now see that you are deliberately dereferencing the char pointer.
Hmm, the rank of _Bool
is the smallest, but sizeof(char)
can still be larger than sizeof(_Bool)
. So I wouldn't want this construct either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sizeof(char)
is guaranteed to be 1. It can't be larger than sizeof(_Bool)
.
Alignment may be a different matter (or may not; I don't know).
To avoid undefined behavior in the C code, we assume that in
every mode, the size is 1,
\x00
is false, and\x01
is true.With that assumption we can cast from
char
to_Bool
, and avoid UB.https://bugs.python.org/issue39689