Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ctypes: We do not correctly handle NULL dlsym() return values #126554

Closed
grgalex opened this issue Nov 7, 2024 · 8 comments
Closed

ctypes: We do not correctly handle NULL dlsym() return values #126554

grgalex opened this issue Nov 7, 2024 · 8 comments
Labels
3.12 bugs and security fixes 3.13 bugs and security fixes 3.14 new features, bugs and security fixes extension-modules C modules in the Modules dir topic-ctypes type-bug An unexpected behavior, bug, or error

Comments

@grgalex
Copy link
Contributor

grgalex commented Nov 7, 2024

Bug report

Bug description:

The man(3) page for dlsym() states:

In unusual cases (see NOTES) the value of the symbol  could  actually  be
NULL. Therefore, a NULL return from dlsym() need not indicate an error.
The correct way to distinguish an error from a symbol whose value is NULL
is  to  call  dlerror(3) to clear any old error conditions, then call dl‐
sym(), and then call dlerror(3) again, saving its  return  value  into  a
variable, and check whether this saved value is not NULL.

As such, there can be cases where a call to dlsym returns NULL, while no error has been encountered and the string that dlerror() returns has not been (re)set.

Currently, (https://github.com/python/cpython/blob/main/Modules/_ctypes/_ctypes.c#L970) we do:

<...>
address = (void *)dlsym(handle, name);
if (!address) {
#ifdef __CYGWIN__
<snip Windows stuff>
#else
        PyErr_SetString(PyExc_ValueError, dlerror());
<...>

If dlsym() returns NULL, then by calling dlerror() we pass either NULL or a previous, unconsumed error string to PyErr_SetString.

In the context of ctypes, a NULL return by dlsym() might indeed always be considered an error.

To correctly express this, we should:

  • Call dlerror() before dlsym(), to clear any previous error
  • Call dlerror() after dlsym(), to see if dlsym() encountered any error
    • If dlerror() returns a non-NULL value, then pass its result to PyErr_SetString
    • If dlerror() returns NULL, then check if address is NULL, and if so, pass a custom error string to PyErr_SetString.

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

Linked PRs

@grgalex grgalex added the type-bug An unexpected behavior, bug, or error label Nov 7, 2024
@efimov-mikhail
Copy link
Contributor

Is it actually a bug?
Can we make a repro?

@grgalex
Copy link
Contributor Author

grgalex commented Nov 7, 2024

EDIT: See Reproducer below.

@grgalex
Copy link
Contributor Author

grgalex commented Nov 7, 2024

I managed to create a reproducer using Indirect Functions (IFUNCS) by adapting this SO answer: https://stackoverflow.com/a/53590014

  1. Create a file named foo.c with the following contents:
#include <stdio.h>

/* This is a 'GNU indirect function' (IFUNC) that will be called by
   dlsym() to resolve the symbol "foo" to an address. Typically, such
   a function would return the address of an actual function, but it
   can also just return NULL.  For some background on IFUNCs, see
   https://willnewton.name/uncategorized/using-gnu-indirect-functions/ */

asm (".type foo, @gnu_indirect_function");

void *
foo(void)
{
    fprintf(stderr, "foo called\n");
    return NULL;
}
  1. Create a file named main.c with the following contents:
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
    void *handle;
    void (*funcp)(void);

    handle  = dlopen("./foo.so", RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        exit(EXIT_FAILURE);
    }

    dlerror();      /* Clear any outstanding error */

    funcp = dlsym(handle, "foo");

    printf("Results after dlsym(): funcp = %p; dlerror = %s\n",
            (void *) funcp, dlerror());

    exit(EXIT_SUCCESS);
}
  1. Compile foo.c into a shared library named libfoo.so:
gcc -Wall -fPIC -shared -o libfoo.so foo.c
  1. Compile the main executable:
gcc -Wall -o main main.c -L. -lfoo -ldl
  1. Run main:
./main
foo called ### This is from the IFUNC ###
Results after dlsym(): funcp = (nil); dlerror = (null)

@grgalex
Copy link
Contributor Author

grgalex commented Nov 7, 2024

When using ctypes to search for symbol foo in libfoo.so we get a Segmentation Fault.
(Tested on Python 3.10 from my system, as well as one built from main.)

python3 -c 'import ctypes; C = ctypes.CDLL("./libfoo.so"); C.foo'

foo called
Segmentation fault (core dumped)

So this turned out to be a bug after all!

@grgalex
Copy link
Contributor Author

grgalex commented Nov 7, 2024

With the interpreter built from my PR we instead get:

./python -c 'import ctypes; C = ctypes.CDLL("./libfoo.so"); C.foo'

foo called
Traceback (most recent call last):
  File "<string>", line 1, in <module>
    import ctypes; C = ctypes.CDLL("./libfoo.so"); C.foo
                                                   ^^^^^
  File "/home/grg/a-lab/cpython/Lib/ctypes/__init__.py", line 415, in __getattr__
    func = self.__getitem__(name)
  File "/home/grg/a-lab/cpython/Lib/ctypes/__init__.py", line 420, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: function 'foo' not found

@grgalex
Copy link
Contributor Author

grgalex commented Nov 7, 2024

cc: @efimov-mikhail

@ZeroIntensity ZeroIntensity added topic-ctypes extension-modules C modules in the Modules dir 3.12 bugs and security fixes 3.13 bugs and security fixes 3.14 new features, bugs and security fixes labels Nov 7, 2024
@grgalex
Copy link
Contributor Author

grgalex commented Nov 8, 2024

Fun fact:

The StackOverflow answer, which I adapted to form the reproducer is actually written by Michael Kerrisk [1].

So, he was the one who came up to clarify his statements in the man page, when noone else {c,w}ould :)

[1] https://en.wikipedia.org/wiki/Michael_Kerrisk

encukou added a commit that referenced this issue Nov 15, 2024
For dlsym(), a return value of NULL does not necessarily indicate
an error [1].

Therefore, to avoid using stale (or NULL) dlerror() values, we must:

 1. clear the previous error state by calling dlerror()
 2. call dlsym()
 3. call dlerror()

If the return value of dlerror() is not NULL, an error occured.

In ctypes we choose to treat a NULL return value from dlsym()
as a "not found" error. This is the same as the fallback
message we use on Windows, Cygwin or when getting/formatting
the error reason fails.

[1]: https://man7.org/linux/man-pages/man3/dlsym.3.html

Signed-off-by: Georgios Alexopoulos <[email protected]>
Signed-off-by: Georgios Alexopoulos <[email protected]>
Co-authored-by: Peter Bierma <[email protected]>
Co-authored-by: Bénédikt Tran <[email protected]>
Co-authored-by: Petr Viktorin <[email protected]>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Nov 15, 2024
…-126555)

For dlsym(), a return value of NULL does not necessarily indicate
an error [1].

Therefore, to avoid using stale (or NULL) dlerror() values, we must:

 1. clear the previous error state by calling dlerror()
 2. call dlsym()
 3. call dlerror()

If the return value of dlerror() is not NULL, an error occured.

In ctypes we choose to treat a NULL return value from dlsym()
as a "not found" error. This is the same as the fallback
message we use on Windows, Cygwin or when getting/formatting
the error reason fails.

[1]: https://man7.org/linux/man-pages/man3/dlsym.3.html

(cherry picked from commit 8717f79)

Co-authored-by: George Alexopoulos <[email protected]>
Signed-off-by: Georgios Alexopoulos <[email protected]>
Signed-off-by: Georgios Alexopoulos <[email protected]>
Co-authored-by: Peter Bierma <[email protected]>
Co-authored-by: Bénédikt Tran <[email protected]>
Co-authored-by: Petr Viktorin <[email protected]>
@encukou
Copy link
Member

encukou commented Nov 15, 2024

Thanks @grgalex for the fix, and @ZeroIntensity & @picnixz for the thorough reviews!

When using ctypes to search for symbol foo in libfoo.so we get a Segmentation Fault.
[...]
So this turned out to be a bug after all!

Well, when a symbol you asked for points to uninitialized memory (like a NULL), a segfault seems rather reasonable to me. (Granted, it does happen for the wrong reasons.)

Treating NULL specially could possibly be considered a feature, rather than a bug.
Given how hard this is to reproduce, I wouldn't mind having the fix to 3.14 only, and omitting the backports.

ambv pushed a commit that referenced this issue Nov 17, 2024
…) (#126861)

For dlsym(), a return value of NULL does not necessarily indicate
an error [1].

Therefore, to avoid using stale (or NULL) dlerror() values, we must:

 1. clear the previous error state by calling dlerror()
 2. call dlsym()
 3. call dlerror()

If the return value of dlerror() is not NULL, an error occured.

In ctypes we choose to treat a NULL return value from dlsym()
as a "not found" error. This is the same as the fallback
message we use on Windows, Cygwin or when getting/formatting
the error reason fails.

[1]: https://man7.org/linux/man-pages/man3/dlsym.3.html

(cherry picked from commit 8717f79)

Signed-off-by: Georgios Alexopoulos <[email protected]>
Co-authored-by: George Alexopoulos <[email protected]>
Co-authored-by: Peter Bierma <[email protected]>
Co-authored-by: Bénédikt Tran <[email protected]>
Co-authored-by: Petr Viktorin <[email protected]>
picnixz added a commit to picnixz/cpython that referenced this issue Dec 8, 2024
…-126555)

For dlsym(), a return value of NULL does not necessarily indicate
an error [1].

Therefore, to avoid using stale (or NULL) dlerror() values, we must:

 1. clear the previous error state by calling dlerror()
 2. call dlsym()
 3. call dlerror()

If the return value of dlerror() is not NULL, an error occured.

In ctypes we choose to treat a NULL return value from dlsym()
as a "not found" error. This is the same as the fallback
message we use on Windows, Cygwin or when getting/formatting
the error reason fails.

[1]: https://man7.org/linux/man-pages/man3/dlsym.3.html

Signed-off-by: Georgios Alexopoulos <[email protected]>
Signed-off-by: Georgios Alexopoulos <[email protected]>
Co-authored-by: Peter Bierma <[email protected]>
Co-authored-by: Bénédikt Tran <[email protected]>
Co-authored-by: Petr Viktorin <[email protected]>
@encukou encukou closed this as completed Dec 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes 3.13 bugs and security fixes 3.14 new features, bugs and security fixes extension-modules C modules in the Modules dir topic-ctypes type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

4 participants