ctypes: We do not correctly handle NULL dlsym() return values #126554

grgalex · 2024-11-07T20:17:01Z

Bug report

Bug description:

The man(3) page for dlsym() states:

In unusual cases (see NOTES) the value of the symbol  could  actually  be
NULL. Therefore, a NULL return from dlsym() need not indicate an error.
The correct way to distinguish an error from a symbol whose value is NULL
is  to  call  dlerror(3) to clear any old error conditions, then call dl‐
sym(), and then call dlerror(3) again, saving its  return  value  into  a
variable, and check whether this saved value is not NULL.

As such, there can be cases where a call to dlsym returns NULL, while no error has been encountered and the string that dlerror() returns has not been (re)set.

Currently, (https://github.com/python/cpython/blob/main/Modules/_ctypes/_ctypes.c#L970) we do:

<...>
address = (void *)dlsym(handle, name);
if (!address) {
#ifdef __CYGWIN__
<snip Windows stuff>
#else
        PyErr_SetString(PyExc_ValueError, dlerror());
<...>

If dlsym() returns NULL, then by calling dlerror() we pass either NULL or a previous, unconsumed error string to PyErr_SetString.

In the context of ctypes, a NULL return by dlsym() might indeed always be considered an error.

To correctly express this, we should:

Call dlerror() before dlsym(), to clear any previous error
Call dlerror() after dlsym(), to see if dlsym() encountered any error
- If dlerror() returns a non-NULL value, then pass its result to PyErr_SetString
- If dlerror() returns NULL, then check if address is NULL, and if so, pass a custom error string to PyErr_SetString.

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

Linked PRs

The text was updated successfully, but these errors were encountered:

efimov-mikhail · 2024-11-07T20:47:00Z

Is it actually a bug?
Can we make a repro?

grgalex · 2024-11-07T21:09:44Z

EDIT: See Reproducer below.

grgalex · 2024-11-07T21:34:43Z

I managed to create a reproducer using Indirect Functions (IFUNCS) by adapting this SO answer: https://stackoverflow.com/a/53590014

Create a file named foo.c with the following contents:

#include <stdio.h>

/* This is a 'GNU indirect function' (IFUNC) that will be called by
   dlsym() to resolve the symbol "foo" to an address. Typically, such
   a function would return the address of an actual function, but it
   can also just return NULL.  For some background on IFUNCs, see
   https://willnewton.name/uncategorized/using-gnu-indirect-functions/ */

asm (".type foo, @gnu_indirect_function");

void *
foo(void)
{
    fprintf(stderr, "foo called\n");
    return NULL;
}

Create a file named main.c with the following contents:

#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
    void *handle;
    void (*funcp)(void);

    handle  = dlopen("./foo.so", RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        exit(EXIT_FAILURE);
    }

    dlerror();      /* Clear any outstanding error */

    funcp = dlsym(handle, "foo");

    printf("Results after dlsym(): funcp = %p; dlerror = %s\n",
            (void *) funcp, dlerror());

    exit(EXIT_SUCCESS);
}

Compile foo.c into a shared library named libfoo.so:

gcc -Wall -fPIC -shared -o libfoo.so foo.c

Compile the main executable:

gcc -Wall -o main main.c -L. -lfoo -ldl

Run main:

./main
foo called ### This is from the IFUNC ###
Results after dlsym(): funcp = (nil); dlerror = (null)

grgalex · 2024-11-07T21:37:57Z

When using ctypes to search for symbol foo in libfoo.so we get a Segmentation Fault.
(Tested on Python 3.10 from my system, as well as one built from main.)

python3 -c 'import ctypes; C = ctypes.CDLL("./libfoo.so"); C.foo'

foo called
Segmentation fault (core dumped)

So this turned out to be a bug after all!

grgalex · 2024-11-07T21:40:32Z

With the interpreter built from my PR we instead get:

./python -c 'import ctypes; C = ctypes.CDLL("./libfoo.so"); C.foo'

foo called
Traceback (most recent call last):
  File "<string>", line 1, in <module>
    import ctypes; C = ctypes.CDLL("./libfoo.so"); C.foo
                                                   ^^^^^
  File "/home/grg/a-lab/cpython/Lib/ctypes/__init__.py", line 415, in __getattr__
    func = self.__getitem__(name)
  File "/home/grg/a-lab/cpython/Lib/ctypes/__init__.py", line 420, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: function 'foo' not found

grgalex · 2024-11-07T21:44:05Z

cc: @efimov-mikhail

grgalex · 2024-11-08T14:00:18Z

Fun fact:

The StackOverflow answer, which I adapted to form the reproducer is actually written by Michael Kerrisk [1].

So, he was the one who came up to clarify his statements in the man page, when noone else {c,w}ould :)

[1] https://en.wikipedia.org/wiki/Michael_Kerrisk

For dlsym(), a return value of NULL does not necessarily indicate an error [1]. Therefore, to avoid using stale (or NULL) dlerror() values, we must: 1. clear the previous error state by calling dlerror() 2. call dlsym() 3. call dlerror() If the return value of dlerror() is not NULL, an error occured. In ctypes we choose to treat a NULL return value from dlsym() as a "not found" error. This is the same as the fallback message we use on Windows, Cygwin or when getting/formatting the error reason fails. [1]: https://man7.org/linux/man-pages/man3/dlsym.3.html Signed-off-by: Georgios Alexopoulos <[email protected]> Signed-off-by: Georgios Alexopoulos <[email protected]> Co-authored-by: Peter Bierma <[email protected]> Co-authored-by: Bénédikt Tran <[email protected]> Co-authored-by: Petr Viktorin <[email protected]>

…-126555) For dlsym(), a return value of NULL does not necessarily indicate an error [1]. Therefore, to avoid using stale (or NULL) dlerror() values, we must: 1. clear the previous error state by calling dlerror() 2. call dlsym() 3. call dlerror() If the return value of dlerror() is not NULL, an error occured. In ctypes we choose to treat a NULL return value from dlsym() as a "not found" error. This is the same as the fallback message we use on Windows, Cygwin or when getting/formatting the error reason fails. [1]: https://man7.org/linux/man-pages/man3/dlsym.3.html (cherry picked from commit 8717f79) Co-authored-by: George Alexopoulos <[email protected]> Signed-off-by: Georgios Alexopoulos <[email protected]> Signed-off-by: Georgios Alexopoulos <[email protected]> Co-authored-by: Peter Bierma <[email protected]> Co-authored-by: Bénédikt Tran <[email protected]> Co-authored-by: Petr Viktorin <[email protected]>

encukou · 2024-11-15T10:20:28Z

Thanks @grgalex for the fix, and @ZeroIntensity & @picnixz for the thorough reviews!

When using ctypes to search for symbol foo in libfoo.so we get a Segmentation Fault.
[...]
So this turned out to be a bug after all!

Well, when a symbol you asked for points to uninitialized memory (like a NULL), a segfault seems rather reasonable to me. (Granted, it does happen for the wrong reasons.)

Treating NULL specially could possibly be considered a feature, rather than a bug.
Given how hard this is to reproduce, I wouldn't mind having the fix to 3.14 only, and omitting the backports.

…) (#126861) For dlsym(), a return value of NULL does not necessarily indicate an error [1]. Therefore, to avoid using stale (or NULL) dlerror() values, we must: 1. clear the previous error state by calling dlerror() 2. call dlsym() 3. call dlerror() If the return value of dlerror() is not NULL, an error occured. In ctypes we choose to treat a NULL return value from dlsym() as a "not found" error. This is the same as the fallback message we use on Windows, Cygwin or when getting/formatting the error reason fails. [1]: https://man7.org/linux/man-pages/man3/dlsym.3.html (cherry picked from commit 8717f79) Signed-off-by: Georgios Alexopoulos <[email protected]> Co-authored-by: George Alexopoulos <[email protected]> Co-authored-by: Peter Bierma <[email protected]> Co-authored-by: Bénédikt Tran <[email protected]> Co-authored-by: Petr Viktorin <[email protected]>

…-126555) For dlsym(), a return value of NULL does not necessarily indicate an error [1]. Therefore, to avoid using stale (or NULL) dlerror() values, we must: 1. clear the previous error state by calling dlerror() 2. call dlsym() 3. call dlerror() If the return value of dlerror() is not NULL, an error occured. In ctypes we choose to treat a NULL return value from dlsym() as a "not found" error. This is the same as the fallback message we use on Windows, Cygwin or when getting/formatting the error reason fails. [1]: https://man7.org/linux/man-pages/man3/dlsym.3.html Signed-off-by: Georgios Alexopoulos <[email protected]> Signed-off-by: Georgios Alexopoulos <[email protected]> Co-authored-by: Peter Bierma <[email protected]> Co-authored-by: Bénédikt Tran <[email protected]> Co-authored-by: Petr Viktorin <[email protected]>

grgalex added the type-bug An unexpected behavior, bug, or error label Nov 7, 2024

bedevere-app bot mentioned this issue Nov 7, 2024

gh-126554: ctypes: Correctly handle NULL dlsym values #126555

Merged

ZeroIntensity added topic-ctypes extension-modules C modules in the Modules dir 3.12 bugs and security fixes 3.13 bugs and security fixes 3.14 new features, bugs and security fixes labels Nov 7, 2024

bedevere-app bot mentioned this issue Nov 15, 2024

[3.13] gh-126554: ctypes: Correctly handle NULL dlsym values (GH-126555) #126861

Merged

github-actions bot mentioned this issue Dec 1, 2024

Monthly issue metrics report hugovk/test#88

Closed

encukou closed this as completed Dec 9, 2024

picnixz mentioned this issue Dec 9, 2024

[3.12] gh-126554: ctypes: Correctly handle NULL dlsym values (GH-126555) #127764

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ctypes: We do not correctly handle NULL dlsym() return values #126554

ctypes: We do not correctly handle NULL dlsym() return values #126554

grgalex commented Nov 7, 2024 •

edited by bedevere-app bot

Loading

efimov-mikhail commented Nov 7, 2024

grgalex commented Nov 7, 2024 •

edited

Loading

grgalex commented Nov 7, 2024 •

edited

Loading

grgalex commented Nov 7, 2024 •

edited

Loading

grgalex commented Nov 7, 2024

grgalex commented Nov 7, 2024

grgalex commented Nov 8, 2024

encukou commented Nov 15, 2024

ctypes: We do not correctly handle NULL dlsym() return values #126554

ctypes: We do not correctly handle NULL dlsym() return values #126554

Comments

grgalex commented Nov 7, 2024 • edited by bedevere-app bot Loading

Bug report

Bug description:

CPython versions tested on:

Operating systems tested on:

Linked PRs

efimov-mikhail commented Nov 7, 2024

grgalex commented Nov 7, 2024 • edited Loading

grgalex commented Nov 7, 2024 • edited Loading

grgalex commented Nov 7, 2024 • edited Loading

grgalex commented Nov 7, 2024

grgalex commented Nov 7, 2024

grgalex commented Nov 8, 2024

encukou commented Nov 15, 2024

grgalex commented Nov 7, 2024 •

edited by bedevere-app bot

Loading

grgalex commented Nov 7, 2024 •

edited

Loading

grgalex commented Nov 7, 2024 •

edited

Loading

grgalex commented Nov 7, 2024 •

edited

Loading