Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DAOS-13194 gurt: environment APIs hook #12220

Merged
merged 16 commits into from
Jul 13, 2023
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion src/gurt/SConscript
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ def scons():

denv = env.Clone()

denv.AppendUnique(LIBS=['pthread', 'yaml', 'm'])
denv.AppendUnique(LIBS=['pthread', 'yaml', 'm', 'dl'])
denv.require('uuid')

gurt_targets = denv.SharedObject(SRC)
Expand Down
147 changes: 147 additions & 0 deletions src/gurt/misc.c
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,11 @@
#include <stdarg.h>
#include <math.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
#include <dlfcn.h>
#include <pthread.h>

#include <gurt/common.h>

/* state buffer for DAOS rand and srand calls, NOT thread safe */
Expand Down Expand Up @@ -1192,3 +1197,145 @@ d_vec_pointers_append(struct d_vec_pointers *pointers, void *pointer)
pointers->p_len++;
return 0;
}

/**
* Overloads to hook the unsafe getenv()/[un]setenv()/putenv()/clearenv()
* functions from glibc.
* Libgurt is the preferred place for this as it is the lowest layer in DAOS,
* so it will be the earliest to be loaded and will ensure the hook to be
* installed as early as possible and could prevent usage of LD_PRELOAD.
* The idea is to strengthen all the environment APIs by using a common lock.
*
* XXX this will address the main lack of multi-thread protection in the Glibc
* APIs but do not handle all unsafe use-cases (like the change/removal of an
* env var when its value address has already been grabbed by a previous
* getenv(), ...).
*/

static pthread_mutex_t hook_env_lock = PTHREAD_MUTEX_INITIALIZER;
static char *(*real_getenv)(const char *);
static int (*real_putenv)(char *);
static int (*real_setenv)(const char *, const char *, int);
static int (*real_unsetenv)(const char *);
static int (*real_clearenv)(void);

char *getenv(const char *name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the question is I have here is whether this actually works in practice. Do calls to getenv from dependent libraries such as libfabric get intercepted by hooks in libgurt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I have already indicated in DAOS-13194 JiRA/ticket :

I have manually verified that this approach works for the daos/dfuse/dmg/daos_agent/daos_server/daos_engine/daos_server_helper binaries.

{
char *p;

D_MUTEX_LOCK(&hook_env_lock);
if (real_getenv == NULL) {
real_getenv = (char * (*)(const char *))dlsym(RTLD_NEXT, "getenv");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dlsym returns a void *. You don't need to cast it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah... and do you know where does this rule come from ? Early Clang definitions, Kernel developers, ...??

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

void * will alias to anything, it never needs a cast.

if (real_getenv == NULL) {
/* Glibc symbols could not be resolved !!... */
void *handle;

handle = dlopen("libc.so.6", RTLD_LAZY);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libc.so.6 ? I think you should look for the real name no ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, for some (obscure ?) reason "libc.so" is no longer a symbolic link, but a text file (apparently a "dynamic loader script"...) causing dlopen() to fail with "invalid ELF header".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be I should add some more deterministic way to find the correct libc versioned name here, like using scandir() as you suggested... But let's defer this to a future enhancement for now.

D_ASSERT(handle != NULL);
real_getenv = (char * (*)(const char *))dlsym(handle, "getenv");
}
D_ASSERT(real_getenv != NULL);
}

p = real_getenv(name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you separate init from this, the lock here should really only protect the real call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you separate init from this, the lock here should really only protect the real call.

Well, initialisation of the "real_..." variables is racy too, but may be I can switch to using "atomic" variables then ?!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally, I have switched to using Atomics for the initialisation part and also switched to rw-lock usage (instead of simple mutex, mainly to allow concurrent getenv()s) for the exception part.

D_MUTEX_UNLOCK(&hook_env_lock);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this block is repeated across the code 5 times, once for each call, with the only difference being the symbol being looked up and used. Can this be simplified to use a shared function across those calls instead?
Something like

char *getenv(const char *name)
{
   return shared_call(&real_getenv, "getenv");
}

char *setenv(const char *name)
{
   return shared_call(&real_setenv, "setenv");
}

etc...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, you are right!


return p;
}

int putenv(char *name)
{
int rc;

D_MUTEX_LOCK(&hook_env_lock);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might suggest using pthread_once and initializing all of the hooks at once?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, you are right !!

if (real_putenv == NULL) {
real_putenv = (int (*)(char *))dlsym(RTLD_NEXT, "putenv");
if (real_putenv == NULL) {
/* Glibc symbols could not be resolved !!... */
void *handle;

handle = dlopen("libc.so.6", RTLD_LAZY);
D_ASSERT(handle != NULL);
real_putenv = (int (*)(char *))dlsym(handle, "putenv");
}
D_ASSERT(real_putenv != NULL);
}

rc = real_putenv(name);
D_MUTEX_UNLOCK(&hook_env_lock);

return rc;
}

int setenv(const char *name, const char *value, int overwrite)
{
int rc;

D_MUTEX_LOCK(&hook_env_lock);
if (real_setenv == NULL) {
real_setenv = (int (*)(const char *, const char *, int))dlsym(RTLD_NEXT, "setenv");
if (real_setenv == NULL) {
/* Glibc symbols could not be resolved !!... */
void *handle;

handle = dlopen("libc.so.6", RTLD_LAZY);
D_ASSERT(handle != NULL);
real_setenv = (int (*)(const char *, const char *, int))dlsym(handle,
"setenv");
}
D_ASSERT(real_setenv != NULL);
}

rc = real_setenv(name, value, overwrite);
D_MUTEX_UNLOCK(&hook_env_lock);

return rc;
}

int unsetenv(const char *name)
{
int rc;

D_MUTEX_LOCK(&hook_env_lock);
if (real_unsetenv == NULL) {
real_unsetenv = (int (*)(const char *))dlsym(RTLD_NEXT, "unsetenv");
if (real_unsetenv == NULL) {
/* Glibc symbols could not be resolved !!... */
void *handle;

handle = dlopen("libc.so.6", RTLD_LAZY);
D_ASSERT(handle != NULL);
real_unsetenv = (int (*)(const char *))dlsym(handle, "unsetenv");
}
D_ASSERT(real_unsetenv != NULL);
}

rc = real_unsetenv(name);
D_MUTEX_UNLOCK(&hook_env_lock);

return rc;
}

int clearenv(void)
{
int rc;

D_MUTEX_LOCK(&hook_env_lock);
if (real_clearenv == NULL) {
real_clearenv = (int (*)(void))dlsym(RTLD_NEXT, "clearenv");
if (real_clearenv == NULL) {
/* Glibc symbols could not be resolved !!... */
void *handle;

handle = dlopen("libc.so.6", RTLD_LAZY);
D_ASSERT(handle != NULL);
real_clearenv = (int (*)(void))dlsym(handle, "clearenv");
}
D_ASSERT(real_clearenv != NULL);
}

rc = real_clearenv();
D_MUTEX_UNLOCK(&hook_env_lock);

return rc;
}
2 changes: 1 addition & 1 deletion src/gurt/tests/SConscript
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ def scons():

test_env = env.Clone()
test_env.require('mercury', 'uuid')
test_env.AppendUnique(LIBS=['pthread', 'cmocka', 'm'])
test_env.AppendUnique(LIBS=['pthread', 'cmocka', 'm', 'dl'])
test_env.AppendUnique(CXXFLAGS=['-std=c++0x'])
tests = []

Expand Down
2 changes: 1 addition & 1 deletion src/tests/ftest/cart/utest/SConscript
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ def scons():

test_env = env.Clone()
test_env.require('mercury', 'uuid', 'cmocka')
test_env.AppendUnique(LIBS=['pthread', 'm'])
test_env.AppendUnique(LIBS=['pthread', 'm', 'dl'])
test_env.AppendUnique(CXXFLAGS=['-std=c++0x'])
test_env.AppendUnique(LIBPATH=LIBPATH)
test_env.AppendUnique(RPATH_FULL=LIBPATH)
Expand Down