Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Likely misuse of ctype.h API #4397

Open
riastradh opened this issue Aug 6, 2024 · 0 comments
Open

Likely misuse of ctype.h API #4397

riastradh opened this issue Aug 6, 2024 · 0 comments

Comments

@riastradh
Copy link

Description of problem:
glusterfs uses various functions from the standard C ctype.h API, such as isalnum and tolower, in a way that likely triggers undefined behaviour.

When the argument to these functions is neither (a) a value representable by the type unsigned char, nor (b) the value of the macro EOF, the behaviour is undefined:

In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined.

C11, Sec. 7.4 Character handling <ctype.h>, p. 200, clause 1

This is because the ctype.h functions designed to work with the int values returned by getc or fgetc, not for processing elements of an arbitrary string stored in a char array. Safely processing arbitrary strings that way requires an explicit cast to unsigned char first so the values lie in the defined domain of the ctype.h functions.

If EOF is -1 (as it almost always is), with 8-bit bytes, the allowed inputs to the ctype.h functions are:

{-1, 0, 1, 2, 3, ..., 255}.

However, on platforms where char is signed, such as x86 with the usual ABI, code like

char *p = ...;
... isspace(*p) ...

may pass in values in the range:

{-128, -127, -126, ..., -2, -1, 0, 1, ..., 127}.

This has two problems:

  1. Inputs in the set {-128, -127, -126, ..., -2} are forbidden -- they lead to undefined behaviour.

  2. The non-EOF byte 0xff is conflated with the value EOF = -1, so even though the input is not forbidden, it may give the wrong answer.

In order to avoid the undefined behaviour for all inputs, the above code needs to be rewritten to cast the char to unsigned char so that its value lies in the defined domain of the ctype.h functions:

char *p = ...;
... isspace((unsigned char)*p) ...

The effect of the undefined behaviour may be wrong answers from dereferencing negative array offsets into other random heap data, crashes from dereferencing negative array offsets into unmapped oblivion, or demons flying out of your nose.

I found various cases of passing char into the ctype.h functions throughout glusterfs, such as:

if (!isalnum(volname[i]) && (volname[i] != '_') &&

lc_fop_name[j] = tolower(lc_fop_name[j]);

while (isspace(*++p))

While I don't know whether any of these will definitely ever process inputs outside the US-ASCII range {0, 1, 2, ..., 127} which is safe on most ABIs, having to review the context makes a reviewer or auditor's job difficult -- and code has a tendency to propagate and find new users in new contexts, each of which requires further review or auditing to ensure it is safe. Much simpler is to just always, as a rule, cast char to unsigned char for ctype.h function inputs.

The exact command to reproduce the issue:
code inspection, or ./configure && make on platforms designed to warn about this issue

The full output of the command that failed:
Example of a warning about this issue, from NetBSD:

...
glfs.c: In function 'pub_glfs_new':
glfs.c:864:29: warning: array subscript has type 'char' [-Wchar-subscripts]
  864 |         if (!isalnum(volname[i]) && (volname[i] != '_') &&
      |                             ^
...

(The warning arises from the macros used to implement the ctype.h functions in NetBSD, which expand to references in order to deliberately trigger a compiler warning. This isn't because the NetBSD definitions are broken -- it's just a trick to make the compiler flag likely abuse of the ctype.h functions which may lead at run-time to undefined behaviour.)

Expected results:
No calls to the ctype.h functions passing values of type char.

Mandatory info:
- The output of the gluster volume info command:
N/A

- The output of the gluster volume status command:
N/A

- The output of the gluster volume heal command:
N/A

**- Provide logs present on following locations of client and server nodes -
/var/log/glusterfs/
N/A

**- Is there any crash ? Provide the backtrace and coredump
N/A

Additional info:

- The operating system / glusterfs version:

  • glusterfs 10.4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant