Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DAOS-13194 gurt: environment APIs hook #12220

Merged
merged 16 commits into from
Jul 13, 2023
Merged
Show file tree
Hide file tree
Changes from 14 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion src/gurt/SConscript
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ def scons():

denv = env.Clone()

denv.AppendUnique(LIBS=['pthread', 'yaml', 'm'])
denv.AppendUnique(LIBS=['pthread', 'yaml', 'm', 'dl'])
denv.require('uuid')

gurt_targets = denv.SharedObject(SRC)
Expand Down
122 changes: 122 additions & 0 deletions src/gurt/misc.c
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,13 @@
#include <stdarg.h>
#include <math.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
#include <dlfcn.h>
#include <pthread.h>

#include <gurt/common.h>
#include <gurt/atomic.h>

/* state buffer for DAOS rand and srand calls, NOT thread safe */
static struct drand48_data randBuffer = {0};
Expand Down Expand Up @@ -1197,3 +1203,119 @@ d_vec_pointers_append(struct d_vec_pointers *pointers, void *pointer)
pointers->p_len++;
return 0;
}

/**
* Overloads to hook the unsafe getenv()/[un]setenv()/putenv()/clearenv()
* functions from glibc.
* Libgurt is the preferred place for this as it is the lowest layer in DAOS,
* so it will be the earliest to be loaded and will ensure the hook to be
* installed as early as possible and could prevent usage of LD_PRELOAD.
* The idea is to strengthen all the environment APIs by using a common lock.
*
* XXX this will address the main lack of multi-thread protection in the Glibc
* APIs but do not handle all unsafe use-cases (like the change/removal of an
* env var when its value address has already been grabbed by a previous
* getenv(), ...).
*/

static pthread_rwlock_t hook_env_lock = PTHREAD_RWLOCK_INITIALIZER;
static char *(* ATOMIC real_getenv)(const char *);
static int (* ATOMIC real_putenv)(char *);
static int (* ATOMIC real_setenv)(const char *, const char *, int);
static int (* ATOMIC real_unsetenv)(const char *);
static int (* ATOMIC real_clearenv)(void);

static void bind_libc_symbol(void **real_ptr_addr, const char *name)
{
void *real_temp;

/* XXX __atomic_*() built-ins are used to avoid the need to cast
* each of the ATOMIC pointers of functions, that seems to be
* required to make Intel compiler happy ...
*/
if (__atomic_load_n(real_ptr_addr, __ATOMIC_RELAXED) == NULL) {
/* libc should be already loaded ... */
real_temp = dlsym(RTLD_NEXT, name);
Copy link
Contributor

@jolivier23 jolivier23 Jun 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this doesn't find libc version? Seems like it should always find it since libc should be linked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this doesn't find libc version? Seems like it should always find it since libc should be linked.

I was also convinced that libc should always be binded, but if you look to the CI failures of the 2nd commit in this PR (https://build.hpdd.intel.com/blue/organizations/jenkins/daos-stack%2Fdaos/detail/PR-12220/2/pipeline/777/) there are multiple ftests that could fail because dlsym() was not able to resolve "getenv" symbol. May be this is a python only issue...

if (real_temp == NULL) {
/* try after loading libc now */
void *handle;

handle = dlopen("libc.so.6", RTLD_LAZY);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libc.so.6 ? I think you should look for the real name no ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, for some (obscure ?) reason "libc.so" is no longer a symbolic link, but a text file (apparently a "dynamic loader script"...) causing dlopen() to fail with "invalid ELF header".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be I should add some more deterministic way to find the correct libc versioned name here, like using scandir() as you suggested... But let's defer this to a future enhancement for now.

D_ASSERT(handle != NULL);
real_temp = dlsym(handle, name);
D_ASSERT(real_temp != NULL);
}
__atomic_store_n(real_ptr_addr, real_temp, __ATOMIC_RELAXED);
}
}

static pthread_once_t init_real_symbols_flag = PTHREAD_ONCE_INIT;

static void init_real_symbols(void)
{
bind_libc_symbol((void **)&real_getenv, "getenv");
bind_libc_symbol((void **)&real_putenv, "putenv");
bind_libc_symbol((void **)&real_setenv, "setenv");
bind_libc_symbol((void **)&real_unsetenv, "unsetenv");
bind_libc_symbol((void **)&real_clearenv, "clearenv");
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like you want to use contructors instead using __attribute__((constructor)) ? I'm not sure why you'd use pthread_once etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why you'd use pthread_once etc.

The idea is to initialise all the function pointers to the real locations all in a raw but once per thread.

I feel like you want to use contructors instead using attribute((constructor)) ?

Not sure I get this, but if you mean why I did not use "attribute((constructor))" finally, this because libgurt is always loaded as one of the first libs with DAOS, so there should be no particular need of a library constructor to perform the initialisation stuff.


char *getenv(const char *name)
{
char *p;

pthread_once(&init_real_symbols_flag, init_real_symbols);
D_RWLOCK_RDLOCK(&hook_env_lock);
p = real_getenv(name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you separate init from this, the lock here should really only protect the real call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you separate init from this, the lock here should really only protect the real call.

Well, initialisation of the "real_..." variables is racy too, but may be I can switch to using "atomic" variables then ?!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally, I have switched to using Atomics for the initialisation part and also switched to rw-lock usage (instead of simple mutex, mainly to allow concurrent getenv()s) for the exception part.

D_RWLOCK_UNLOCK(&hook_env_lock);

return p;
}

int putenv(char *name)
{
int rc;

pthread_once(&init_real_symbols_flag, init_real_symbols);
D_RWLOCK_WRLOCK(&hook_env_lock);
rc = real_putenv(name);
D_RWLOCK_UNLOCK(&hook_env_lock);

return rc;
}

int setenv(const char *name, const char *value, int overwrite)
{
int rc;

pthread_once(&init_real_symbols_flag, init_real_symbols);
D_RWLOCK_WRLOCK(&hook_env_lock);
rc = real_setenv(name, value, overwrite);
D_RWLOCK_UNLOCK(&hook_env_lock);

return rc;
}

int unsetenv(const char *name)
{
int rc;

pthread_once(&init_real_symbols_flag, init_real_symbols);
D_RWLOCK_WRLOCK(&hook_env_lock);
rc = real_unsetenv(name);
D_RWLOCK_UNLOCK(&hook_env_lock);

return rc;
}

int clearenv(void)
{
int rc;

pthread_once(&init_real_symbols_flag, init_real_symbols);
D_RWLOCK_WRLOCK(&hook_env_lock);
rc = real_clearenv();
D_RWLOCK_UNLOCK(&hook_env_lock);

return rc;
}
2 changes: 1 addition & 1 deletion src/gurt/tests/SConscript
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ def scons():

test_env = env.Clone()
test_env.require('mercury', 'uuid')
test_env.AppendUnique(LIBS=['pthread', 'cmocka', 'm'])
test_env.AppendUnique(LIBS=['pthread', 'cmocka', 'm', 'dl'])
test_env.AppendUnique(CXXFLAGS=['-std=c++0x'])
tests = []

Expand Down
2 changes: 1 addition & 1 deletion src/tests/ftest/cart/utest/SConscript
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ def scons():
test_env.require('mercury', 'uuid', 'cmocka')
# The test is checking that this feature works so disable the compile warnings for it.
test_env.AppendIfSupported(CCFLAGS=['-Wno-gnu-designator', '-Wno-missing-field-initializers'])
test_env.AppendUnique(LIBS=['pthread', 'm', 'yaml'])
test_env.AppendUnique(LIBS=['pthread', 'm', 'yaml', 'dl'])
test_env.AppendUnique(CXXFLAGS=['-std=c++0x'])
test_env.AppendUnique(LIBPATH=LIBPATH)
test_env.AppendUnique(RPATH_FULL=LIBPATH)
Expand Down