-
Notifications
You must be signed in to change notification settings - Fork 147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do name mangling on individual symbols? #79
Comments
Right, basically the idea is to solve the ELF namespace collision problem once and for all by finding all the symbols exported by our vendored libraries and giving them unique names, and then finding all the imports of those symbols and patching them up to use the new unique names. Basically this is one of those solutions that requires some ridiculous hoop-jumping, but in some sense it's the only really reliable solution, and it offends me that we can't do it. (It's just rewriting some lists of strings stored in an admittedly-rather-complicated data structure! who's in charge around here anyway?!) I spent like... half an hour looking at this, anyway, and it isn't obviously untenable. The main obstacle that previous attempts ran into is that if you modify the size of the There's also this thing, which might already be doing most of what we want, not sure: http://mit.edu/~jfc/src/sym-rename.c CC'ing @geofft because somehow I have the impression that this is the kind of mad science heroism that might appeal to him. Could be wrong. |
It turns out that this is also creates somewhat nasty problems for pypy: https://bitbucket.org/pypy/pypy/issues/2617/pypy-binary-is-linked-to-too-much-stuff |
The other thing that comes to mind for this: if we mangle the symbols, then won't we have to go back into the Python code that uses them and change the references to point to the mangled symbols? That seems like it would be very challenging if not impossible to do right, especially when we're dealing with CFFI and the like. |
@ehashman I think Python code that accesses symbols by name at runtime – ctypes, or cffi-in-abi-mode – is also opening the library by name at runtime, and thus already has this problem due to auditwheel's renaming of things? It would be nice if we had a better way for these kinds of wrappers to work with auditwheel – I'm kind of surprised we don't get more complaints about that actually! But I don't think this changes idea itself would change it. |
I think name mangling is not the final solution. The reason is, some library actually does not allow multiple copy in same process, e.g, they use some sort of file lock with same name, or anything considered global state. In my mind, final solution is to make the c shared library dependencies first class citizen in pypi. Then we can do proper dependency resolution, which opens the door that many libraries can use same c lib. Then what auditwheel should just modify the rpath. |
@shuoli84 do you have specific examples of the kinds of problems you describe? Packaging C libraries as first class citizens is straying into becoming a general-purpose packaging system, which there already plenty of other tools doing. Realistically, there are always going to be some limits on what's possible. So I think it's more valuable to discuss specifics like "library X doesn't work because..." rather than generalities. |
Global state in general is certainly a problem, and there could indeed be advantages to making C libraries more first-class citizens. (There are some old notes on that here.) But it's not relevant to this issue, because even if we did have C libraries as first class citizens, it wouldn't help with the uwsgi situation: the problem here is that uwsgi is unilaterally injecting symbols into distributed binaries that aren't compatible with those symbols. And there's no way to distribute binaries that are compatible with the injected symbols, because different installations of uwsgi are injecting different versions of those symbols, and it's impossible to be compatible with all of them at once. So we should probably keep this issue specifically about avoiding symbol clashes in auditwheel's vendored libraries, and split any discussion of packaging C libraries off into another thread (perhaps at discuss.python.org). |
Name mangling and shipping dependencies in wheel, though works most of time, is kinda risky. And I didn't find any warning message in document or readme. If it breaks, it introduced hard to track bugs. E.g, corrupt data files. Ref: https://ericsink.com/entries/multiple_sqlite_problem.html. User should know what they are doing, but auditwheel's repair is so easy to use, and provided by pypa, which may make this THE recommended approach. If a third-party library author convert the library to use this approach, app dev may not aware of this. @njsmith the link you provides is actually what I am looking for. Any follow up? There isn't much discussion in the mailing thread (I googled it, may miss something). |
@shuoli84 figuring out how to make the pynativelib idea work on macOS turned out to be way more complicated than I anticipated, and then by the time I'd finally done that I'd run out of energy for working on the project, and no-one else has stepped up. So it is doable, but there's no-one currently working on it. |
auditwheel uses patchelf to rename vendored libraries with unique names. This works wonderfully until something links against a library with the same symbol names and loads them before we get there. This scenario occurs when using
uwsgi
withcryptography
's new manylinux1 wheels. Sinceuwsgi
links against libssl/libcrypto itself we're seeing segfaults where it's calling between two different versions of the library (the oneuwsgi
loads globally and the onecryptography
has vendored). You can see more discussion (and a better explanation from @njsmith) here: pyca/cryptography#3804 (comment)One potential solution that was put forth would be to make it so patchelf is capable of name mangling individual symbols so that even in a scenario like the one above there's no issue with namespace collision. This sounds very challenging but as more and more manylinux1 wheels start being shipped it may be the right long term solution.
The text was updated successfully, but these errors were encountered: