/usr/bin/toolbox linked against glibc-2.32 doesn't run on older glibc #529
Comments
This looks like the container failed to start. Once you have attempted to
I did and this is the error message:
Here is with the
Yep, this is not good. Basically some library pulled in by the Toolbox code uses C code requiring the use of cgo. When
Yes, as @HarryMichal mentioned, this is the problematic part:
It means that a We end up in this situation because we bind mount
I grabbed the Fedora 33 binary, unpacked it and poked at it a bit.
Looks like there's a new implementation of
The /usr/bin/toolbox binary is not only used to interact with toolbox containers and images from the host. It's also used as the entry point of the containers by bind mounting the binary from the host into the container. This means that the /usr/bin/toolbox binary on the host must also work inside the container, even if they have different operating systems.

In the past, this worked perfectly well with the POSIX shell implementation because it got interpreted by whichever /bin/sh was available. The Go implementation also mostly worked so far because it's largely statically linked, with the notable exception of the standard C library.

However, recently glibc-2.32, which is used by Fedora 33 onwards, added a new version of the pthread_sigmask symbol [1] as part of the libpthread removal project:

  $ objdump -T /usr/bin/toolbox | grep GLIBC_2.32
  0000000000000000      DO *UND*  0000000000000000  GLIBC_2.32  pthread_sigmask

This means that /usr/bin/toolbox binaries built against glibc-2.32 on newer Fedoras pick up the latest version of the symbol and fail to run against older glibcs in older Fedoras.

One way to fix this is to disable the use of any C code from Go by using the CGO_ENABLED environment variable [2]. However, this can negatively impact packages like "os/user" [3] and "net" [4], where the more featureful glibc APIs will be replaced by more limited equivalents written only in Go.

Instead, since glibc uses symbol versioning, it's better to tell the Go toolchain to avoid linking against any symbols from glibc-2.32. This was accomplished by a few linker tricks:

* The GNU ld linker's --wrap flag was used when building the Go code to divert pthread_sigmask invocations from Go to another function called __wrap_pthread_sigmask.

* A static library was added to provide this __wrap_pthread_sigmask function, which forwards calls to the actual pthread_sigmask API in glibc. This library itself was not linked with --wrap, and specifies the latest permissible version of the pthread_sigmask symbol from glibc for each architecture. Currently, the list of architectures covers the ones that Fedora builds for.

* The Go cmd/link linker was switched to external mode [5]. This ensures that the final object file containing all the Go code gets linked to the standard C library and the wrapper static library by the GNU ld linker for the --wrap flag to kick in.

Based on ideas from Ondřej Míchal.

[1] glibc commit c6663fee4340291c https://sourceware.org/git/?p=glibc.git;a=commit;h=c6663fee4340291c
[2] https://golang.org/cmd/cgo/
[3] https://golang.org/pkg/os/user/
[4] https://golang.org/pkg/net/
[5] https://golang.org/src/cmd/cgo/doc.go

containers#529
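The objdump check quoted in the commit message can be generalised into a small script. The following is only a sketch, not part of Toolbox: the function names (version_key, symbols_newer_than) are hypothetical, and it assumes that each undefined dynamic symbol in the `objdump -T` output ends with a GLIBC_x.y version tag followed by the symbol name, as in the line shown in the commit message.

```python
import re

# Matches the version tag and symbol name at the end of an `objdump -T` line,
# e.g. "... GLIBC_2.32 pthread_sigmask". objdump sometimes wraps the version
# in parentheses, so those are treated as optional.
_SYMBOL_RE = re.compile(r"\(?GLIBC_(\d+(?:\.\d+)*)\)?\s+(\S+)\s*$")


def version_key(version):
    """Turn "2.32" into (2, 32) so that 2.4 sorts before 2.17."""
    return tuple(int(part) for part in version.split("."))


def symbols_newer_than(objdump_output, baseline):
    """Return (symbol, version) pairs for undefined symbols whose glibc
    version tag is strictly newer than `baseline` (for example "2.31")."""
    limit = version_key(baseline)
    found = []
    for line in objdump_output.splitlines():
        # Only undefined symbols have to be provided by the runtime libc.
        if "*UND*" not in line:
            continue
        match = _SYMBOL_RE.search(line)
        if match and version_key(match.group(1)) > limit:
            found.append((match.group(2), "GLIBC_" + match.group(1)))
    return found
```

Feeding it the output of `objdump -T /usr/bin/toolbox` with the glibc version of the oldest container you care about as the baseline should list the symbols that would stop the binary from starting there. With the linker tricks described above applied, pthread_sigmask should no longer show up for a 2.31 baseline.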
I found a way to tell the Go toolchain to avoid the new version of the Testing welcome.
Closing. Please feel free to leave a comment on the PRs or here if you think that it's still broken. Thanks for the testing, by the way. Much appreciated!
Fedora 32 reached End of Life on 25th May 2021: https://docs.fedoraproject.org/en-US/releases/eol/

That's quite old because right now Fedora 35 is nearing its End of Life. Since the tests are intended for Toolbx, not the Fedora infrastructure, it will be better to use a newer image, because images that are too old can get lost from registry.fedoraproject.org.

The fedora-toolbox:34 image can be a drop-in replacement for the fedora-toolbox:32 image for the purposes of this test suite, and has the advantage of being newer.

Note that fedora-toolbox:34 is also old enough to test that the toolbox binary runs against its build-time ABI from the host, and not the Toolbx container's ABI, when it's invoked as the entry point of the container [1,2]. This is important because the subsequent commit will add a test to ensure that.

[1] Commit 6063eb2 containers#821
[2] Commit 6ad9c63 containers#529

containers#1187
Commit ae43560 had added a test with a similar intention. When the test suite is run on a Fedora Rawhide host, it tests whether the containers for the two previous stable Fedora releases start or not. Fedora N-2 reaches End of Life 4 weeks after Fedora N is released [1]. So, testing the containers for Fedora Rawhide and the two previous stable releases on a Fedora Rawhide host is a decent test of general backwards compatibility.

However, as seen recently [2], this isn't enough to catch some known ABI compatibility issues [3,4]. These involve toolbox binaries built on hosts with newer toolchains that aren't meant to be run against containers with older runtimes. A targeted test is needed to defend against these scenarios.

The fedora-toolbox:34 image has glibc-2.33, which is old enough to be unable to run binaries compiled on Fedora 35 with glibc-2.34 and newer.

[1] https://docs.fedoraproject.org/en-US/releases/
[2] containers#1180
[3] Commit 6063eb2 containers#821
[4] Commit 6ad9c63 containers#529

containers#1187
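The compatibility rule behind that choice of image can be illustrated with a short sketch. It is a deliberate simplification and not code from the test suite: the helper names are hypothetical, and it assumes that a runtime glibc provides every symbol version up to and including its own release, so a binary fails to start only when it requires a version tag newer than the container's glibc.

```python
def version_key(text):
    """Accept "glibc-2.33", "GLIBC_2.34" or plain "2.33" and return a tuple
    such as (2, 33) that compares numerically."""
    number = text.replace("glibc-", "").replace("GLIBC_", "")
    return tuple(int(part) for part in number.split("."))


def unsatisfied_versions(required, runtime):
    """Return the symbol versions in `required` that are newer than the
    `runtime` glibc, sorted from oldest to newest."""
    have = version_key(runtime)
    missing = {tag for tag in required if version_key(tag) > have}
    return sorted(missing, key=version_key)
```

Under that assumption, a binary that picked up a GLIBC_2.34 symbol on a Fedora 35 host is reported as unable to run in the fedora-toolbox:34 container, while one restricted to older symbol versions passes.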
Describe the bug
Usually, you can run another Fedora release with toolbox by doing:
But on Fedora Rawhide you get the following error:
The full debug output:
Steps how to reproduce the behaviour
Expected behaviour
That toolbox would enter the container normally.
Actual behaviour
It fails to perform enter on the container.
Output of toolbox --version (v0.0.90+)
toolbox version 0.0.93
Toolbox package info (rpm -q toolbox)
It was installed from the sources, no toolbox package installed.
Output of podman version
Podman package info (rpm -q podman)
podman-2.1.0-0.169.dev.git162625f.fc33.x86_64
Info about your OS
I tested in a virtual machine with Vagrant. The OS is Fedora 33.
It was a fresh installation from today (August 14th) and with all the packages updated.
Additional context
I tried with the releases 31 and 32. The release 33 (the same as the host) was working fine. Well, apart from the bug: #523
Also, I tried the same in another VM with Fedora 32 and it worked fine. I tried in that F32 VM with the releases 29, 31, 32 and 33, with no problems.
I noticed, though, that the entry point PID, the one from the error, was different. On the Rawhide system the PID was always 0, but on my system (Silverblue 32) and in the VM (Fedora 32) it was always a non-zero value, something like PID=32612.
For example (inside the Fedora 32 VM):