fix: starter: overwrite glibc's internal tid cache on clone() #2837

Closed
wants to merge 2 commits into from

Commits on Apr 19, 2024

  1. fix: starter: overwrite glibc's internal tid cache on clone()

    Adapted from: opencontainers/runc#4247
    
    Execution of a container using a PID namespace can fail on certain
    versions of glibc when Singularity is built with Go 1.22.
    
    This is because Go 1.22 makes pthread calls using pthread_self(),
    and since glibc 2.25 the TID cached in the thread structure it
    returns is no longer updated in the child on clone().
    
    Fixes sylabs#2677
    
    -----
    
    Original runc explanation:
    
    Since glibc 2.25, the thread-local cache of the current TID is no
    longer updated in the child when calling clone(2). This results in
    very unfortunate behaviour when Go does pthread calls using
    pthread_self(), which has the wrong TID stored.
    
    The "simple" solution is to forcefully overwrite this cached value.
    Unfortunately (and unsurprisingly), the layout of "struct pthread"
    is strictly private and can change without warning.
    
    Luckily, glibc (currently) uses CLONE_CHILD_CLEARTID for all forks
    (with the child_tid set to the cached &PTHREAD_SELF->tid), meaning
    that as long as runc is using glibc, when "runc init" is spawned
    the child process will have a pointer directly to the cached value
    we want to change. With CONFIG_CHECKPOINT_RESTORE=y kernels on
    Linux 3.5 and later, we can simply use prctl(PR_GET_TID_ADDRESS).
    For older kernels we need to memory scan the TLS structure
    (pthread_self() returns a pointer to the start of the structure
    so we can "just" scan it for a field containing the current TID
    and assume that it is the correct field).
    
    Obviously this is all very horrific, and if you are reading this
    in the future, it almost certainly has caused some horrific bug
    that I did not foresee. Sorry about that. As far as I can tell,
    there is no other workable solution that doesn't also depend on the
    CLONE_CHILD_CLEARTID behaviour of glibc in some way. We cannot
    "just" do a re-exec after clone(2) for security reasons.
    
    Fixes opencontainers/runc#4233
    Signed-off-by: Aleksa Sarai [email protected]
    dtrudg committed Apr 19, 2024
    780436e
  2. chore: bump Go to 1.22.2 in CI

    dtrudg committed Apr 19, 2024
    2e0393f
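
The prctl(PR_GET_TID_ADDRESS) approach described in the first commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this PR or from runc: the helper names `find_tid_cache` and `fix_tid_cache` are hypothetical, the memory-scan fallback for older kernels is omitted, and the overwrite is only attempted under glibc, on the assumption stated in the commit message that the clear_child_tid pointer refers to glibc's cached TID.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef PR_GET_TID_ADDRESS
#define PR_GET_TID_ADDRESS 40 /* value from <linux/prctl.h> */
#endif

/*
 * Hypothetical helper: ask the kernel for this thread's clear_child_tid
 * pointer. Returns NULL if the prctl is unavailable (for example on
 * kernels built without CONFIG_CHECKPOINT_RESTORE) or no address is set.
 */
static pid_t *find_tid_cache(void)
{
	pid_t *tid_addr = NULL;

	if (prctl(PR_GET_TID_ADDRESS, (unsigned long)&tid_addr, 0, 0, 0) < 0)
		return NULL;
	return tid_addr;
}

/*
 * Hypothetical helper: overwrite the cached TID with the real one.
 * Returns 0 if the cache was overwritten, -1 if it could not be located.
 * Only attempted under glibc, because other libcs may register a
 * different address with the kernel (assumption of this sketch).
 */
static int fix_tid_cache(void)
{
#ifdef __GLIBC__
	pid_t actual = (pid_t)syscall(SYS_gettid); /* bypasses any libc cache */
	pid_t *tid_addr = find_tid_cache();

	if (tid_addr == NULL)
		return -1;
	*tid_addr = actual;
	return 0;
#else
	(void)find_tid_cache;
	return -1;
#endif
}
```

In a setting like the one the commit message describes, `fix_tid_cache()` would be called in the child immediately after clone(2), before any pthread call is made. A return value of -1 would be the point at which the memory-scan fallback mentioned in the commit message takes over.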