You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Description of problem:
glusterfs uses various functions from the standard C ctype.h API, such as isalnum and tolower, in a way that likely triggers undefined behaviour.
When the argument to these functions is neither (a) a value representable by the type unsigned char, nor (b) the value of the macro EOF, the behaviour is undefined:
In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined.
C11, Sec. 7.4 Character handling <ctype.h>, p. 200, clause 1
This is because the ctype.h functions designed to work with the int values returned by getc or fgetc, not for processing elements of an arbitrary string stored in a char array. Safely processing arbitrary strings that way requires an explicit cast to unsigned char first so the values lie in the defined domain of the ctype.h functions.
If EOF is -1 (as it almost always is), with 8-bit bytes, the allowed inputs to the ctype.h functions are:
{-1, 0, 1, 2, 3, ..., 255}.
However, on platforms where char is signed, such as x86 with the usual ABI, code like
char *p = ...;
... isspace(*p) ...
may pass in values in the range:
{-128, -127, -126, ..., -2, -1, 0, 1, ..., 127}.
This has two problems:
Inputs in the set {-128, -127, -126, ..., -2} are forbidden -- they lead to undefined behaviour.
The non-EOF byte 0xff is conflated with the value EOF = -1, so even though the input is not forbidden, it may give the wrong answer.
In order to avoid the undefined behaviour for all inputs, the above code needs to be rewritten to cast the char to unsigned char so that its value lies in the defined domain of the ctype.h functions:
char *p = ...;
... isspace((unsigned char)*p) ...
The effect of the undefined behaviour may be wrong answers from dereferencing negative array offsets into other random heap data, crashes from dereferencing negative array offsets into unmapped oblivion, or demons flying out of your nose.
I found various cases of passing char into the ctype.h functions throughout glusterfs, such as:
While I don't know whether any of these will definitely ever process inputs outside the US-ASCII range {0, 1, 2, ..., 127} which is safe on most ABIs, having to review the context makes a reviewer or auditor's job difficult -- and code has a tendency to propagate and find new users in new contexts, each of which requires further review or auditing to ensure it is safe. Much simpler is to just always, as a rule, cast char to unsigned char for ctype.h function inputs.
The exact command to reproduce the issue:
code inspection, or ./configure && make on platforms designed to warn about this issue
The full output of the command that failed:
Example of a warning about this issue, from NetBSD:
...
glfs.c: In function 'pub_glfs_new':
glfs.c:864:29: warning: array subscript has type 'char' [-Wchar-subscripts]
864 | if (!isalnum(volname[i]) && (volname[i] != '_') &&
| ^
...
(The warning arises from the macros used to implement the ctype.h functions in NetBSD, which expand to references in order to deliberately trigger a compiler warning. This isn't because the NetBSD definitions are broken -- it's just a trick to make the compiler flag likely abuse of the ctype.h functions which may lead at run-time to undefined behaviour.)
Expected results:
No calls to the ctype.h functions passing values of type char.
Mandatory info: - The output of the gluster volume info command:
N/A
- The output of the gluster volume status command:
N/A
- The output of the gluster volume heal command:
N/A
**- Provide logs present on following locations of client and server nodes -
/var/log/glusterfs/
N/A
**- Is there any crash ? Provide the backtrace and coredump
N/A
Additional info:
- The operating system / glusterfs version:
glusterfs 10.4
The text was updated successfully, but these errors were encountered:
Description of problem:
glusterfs uses various functions from the standard C ctype.h API, such as isalnum and tolower, in a way that likely triggers undefined behaviour.
When the argument to these functions is neither (a) a value representable by the type unsigned char, nor (b) the value of the macro
EOF
, the behaviour is undefined:This is because the ctype.h functions designed to work with the int values returned by getc or fgetc, not for processing elements of an arbitrary string stored in a char array. Safely processing arbitrary strings that way requires an explicit cast to unsigned char first so the values lie in the defined domain of the ctype.h functions.
If
EOF
is -1 (as it almost always is), with 8-bit bytes, the allowed inputs to the ctype.h functions are:{-1, 0, 1, 2, 3, ..., 255}.
However, on platforms where char is signed, such as x86 with the usual ABI, code like
may pass in values in the range:
{-128, -127, -126, ..., -2, -1, 0, 1, ..., 127}.
This has two problems:
Inputs in the set {-128, -127, -126, ..., -2} are forbidden -- they lead to undefined behaviour.
The non-
EOF
byte 0xff is conflated with the valueEOF
= -1, so even though the input is not forbidden, it may give the wrong answer.In order to avoid the undefined behaviour for all inputs, the above code needs to be rewritten to cast the char to unsigned char so that its value lies in the defined domain of the ctype.h functions:
The effect of the undefined behaviour may be wrong answers from dereferencing negative array offsets into other random heap data, crashes from dereferencing negative array offsets into unmapped oblivion, or demons flying out of your nose.
I found various cases of passing char into the ctype.h functions throughout glusterfs, such as:
glusterfs/api/src/glfs.c
Line 853 in ea377b9
glusterfs/xlators/debug/io-stats/src/io-stats.c
Line 825 in ea377b9
glusterfs/geo-replication/src/procdiggy.c
Line 58 in ea377b9
While I don't know whether any of these will definitely ever process inputs outside the US-ASCII range {0, 1, 2, ..., 127} which is safe on most ABIs, having to review the context makes a reviewer or auditor's job difficult -- and code has a tendency to propagate and find new users in new contexts, each of which requires further review or auditing to ensure it is safe. Much simpler is to just always, as a rule, cast char to unsigned char for ctype.h function inputs.
The exact command to reproduce the issue:
code inspection, or ./configure && make on platforms designed to warn about this issue
The full output of the command that failed:
Example of a warning about this issue, from NetBSD:
(The warning arises from the macros used to implement the ctype.h functions in NetBSD, which expand to references in order to deliberately trigger a compiler warning. This isn't because the NetBSD definitions are broken -- it's just a trick to make the compiler flag likely abuse of the ctype.h functions which may lead at run-time to undefined behaviour.)
Expected results:
No calls to the ctype.h functions passing values of type char.
Mandatory info:
- The output of the
gluster volume info
command:N/A
- The output of the
gluster volume status
command:N/A
- The output of the
gluster volume heal
command:N/A
**- Provide logs present on following locations of client and server nodes -
/var/log/glusterfs/
N/A
**- Is there any crash ? Provide the backtrace and coredump
N/A
Additional info:
- The operating system / glusterfs version:
The text was updated successfully, but these errors were encountered: