Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

To determine GAP_SO sage.env looks for libgap.so but it should look for libgap.so* #33446

Closed
tornaria opened this issue Mar 2, 2022 · 64 comments · Fixed by #35094
Closed

To determine GAP_SO sage.env looks for libgap.so but it should look for libgap.so* #33446

tornaria opened this issue Mar 2, 2022 · 64 comments · Fixed by #35094

Comments

@tornaria
Copy link
Contributor

tornaria commented Mar 2, 2022

My system (void linux) only has

$ ls -l /usr/lib/libgap.so*
lrwxrwxrwx 1 root root      15 Jan 14 11:14 /usr/lib/libgap.so.0 -> libgap.so.0.0.0
-rwxr-xr-x 1 root root 2080920 Jan 14 11:14 /usr/lib/libgap.so.0.0.0

installed. Indeed, the symlink /usr/lib/libgap.so -> libgap.so.0.0.0 is shipped in gap-devel which is not needed to run sagemath at all.

This is because the implementation of the function sage.env._get_shared_lib_path() will look for libgap.so instead of libgap.so*. The fix is trivial:

--- a/src/sage/env.py
+++ b/src/sage/env.py
@@ -293,7 +293,7 @@ def _get_shared_lib_path(*libnames: str) -> Optional[str]:
             if sys.platform == 'darwin':
                 ext = 'dylib'
             else:
-                ext = 'so'
+                ext = 'so*'
 
             if SAGE_LOCAL:
                 search_directories.append(Path(SAGE_LOCAL) / 'lib')

Note that this function is only used once in sage.env to set GAP_SO.

Without the fix above my GAP_SO is set to None and I get failures like

sage: libgap.eval("2+2")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [1], in <module>
----> 1 libgap.eval("2+2")
...
TypeError: expected str, NoneType found

Depends on #34391

CC: @dimpase @orlitzky @williamstein

Component: packages: standard

Author: Gonzalo Tornaría

Branch/commit - see the corr. PR

Reviewer: Dima Pasechnik

Issue created by migration from https://trac.sagemath.org/ticket/33446

@tornaria tornaria added this to the sage-9.6 milestone Mar 2, 2022
@tornaria
Copy link
Contributor Author

tornaria commented Mar 2, 2022

Commit: 165854c

@tornaria
Copy link
Contributor Author

tornaria commented Mar 2, 2022

New commits:

165854csage.env.GAP_SO: look for `libgap.so*`

@tornaria
Copy link
Contributor Author

tornaria commented Mar 2, 2022

Author: Gonzalo Tornaría

@tornaria
Copy link
Contributor Author

tornaria commented Mar 2, 2022

Branch: u/tornaria/33446/libgap.so

@mkoeppe
Copy link
Contributor

mkoeppe commented Mar 2, 2022

comment:2

I'd guess there'a potential for a regression here on Linux systems that have multiple versions of the shared library installed.

@tornaria
Copy link
Contributor Author

tornaria commented Mar 2, 2022

comment:3

Replying to @mkoeppe:

I'd guess there'a potential for a regression here on Linux systems that have multiple versions of the shared library installed.

I don't understand why libgap needs to be dlopen(), since it is already linked in. The relevant tickets for that hack are #26930 and #22626 comment:424, but it's not clear to me how to reproduce the claimed failure without the hack.

In any case if there are multiple versions of the shared library installed, nothing guarantees that the correct one is loaded. It seems like a "more correct" way to find the shared library that is actually already loaded would be based on something like:

sage: sage.libs.gap.libgap.__loader__.path
'/opt/sage/sage-git/develop/pkgs/sagemath-standard/build/lib.linux-x86_64-3.10/sage/libs/gap/libgap.cpython-310-x86_64-linux-gnu.so'
sage: os.system("ldd %s | grep libgap" % sage.libs.gap.libgap.__loader__.path)
	libgap.so.0 => /usr/lib/libgap.so.0 (0x00007fd15418f000)
0

I'm not sure how portable is ldd (an alternative is objdump -p but that requires binutils which is not otherwise required to run sagemath).


One possibility is to skip the hack altogether when GAP_SO is unset. That seems to work ok, but I'm not quite convinced. The hack is either needed or not needed, and making it depend on some circumstantial detail (e.g. whether gap-devel is installed in the system or not) seems wrong.

@orlitzky
Copy link
Contributor

orlitzky commented Mar 2, 2022

comment:5

Why isn't libgap.so installed with the rest of the package? You need it for $CC ... -lgap to work, so I think it should be part of the "normal" installation.

@mkoeppe
Copy link
Contributor

mkoeppe commented Mar 2, 2022

comment:6

On some distributions, the .so symlink is only packaged in a -devel package.
@tornaria is building distribution packages for use at runtime - these should not depend on the -devel package.

@orlitzky
Copy link
Contributor

orlitzky commented Mar 2, 2022

comment:7

Replying to @mkoeppe:

On some distributions, the .so symlink is only packaged in a -devel package.
@tornaria is building distribution packages for use at runtime - these should not depend on the -devel package.

I understand the problem, but the libtool version number is an implementation detail here. The user-facing library is libgap.so. While it's technically true that you can use gap itself without libgap.so because the symlink is followed at link-time and "libgap.so.0.0.0" is recorded into whatever you're linking... if the GAP package is meant to provide libgap in a usable way, it doesn't.

The bottom line is that it's a lot of trouble to save however much space a symlink takes up. I'd change the distro package if it were up to me. Typically, a -devel package includes additional headers, but the libraries themselves are included in the non-devel package. Trying to work around that in sage by at implementation details could cause other unexpected problems that aren't worth the space of a symlink on a few machines.

@mkoeppe
Copy link
Contributor

mkoeppe commented Mar 2, 2022

comment:8

Replying to @orlitzky:

The user-facing library is libgap.so.

Only at build time, not at runtime.

@mkoeppe
Copy link
Contributor

mkoeppe commented Mar 2, 2022

comment:9

Replying to @orlitzky:

Trying to work around that in sage by at implementation details could cause other unexpected problems that aren't worth the space of a symlink on a few machines.

Beyond the scope of our project to change packaging practices of major distributions

@orlitzky
Copy link
Contributor

orlitzky commented Mar 2, 2022

comment:10

Replying to @mkoeppe:

Replying to @orlitzky:

The user-facing library is libgap.so.

Only at build time, not at runtime.

There is some subtler distinction to be made here, but my point is that the distro package in question isn't usable by anyone who wants to build software that links to libgap. I think that should be fixed, independent of this issue. And then as a side effect, this problem just happens to disappear.

@orlitzky
Copy link
Contributor

orlitzky commented Mar 2, 2022

comment:11

Replying to @mkoeppe:

Replying to @orlitzky:

Trying to work around that in sage by at implementation details could cause other unexpected problems that aren't worth the space of a symlink on a few machines.

Beyond the scope of our project to change packaging practices of major distributions

Given that most major distros have at least one developer with a presence here... if there's a solution that's better for everyone and requires only a minor tweak to a distro package rather than piling more hacks onto what's already a hack within sage, I think we should be thankful for our good luck.

@mkoeppe
Copy link
Contributor

mkoeppe commented Mar 2, 2022

comment:12

Replying to @orlitzky:

Replying to @mkoeppe:

Replying to @orlitzky:

The user-facing library is libgap.so.

Only at build time, not at runtime.

There is some subtler distinction to be made here, but my point is that the distro package in question isn't usable by anyone who wants to build software that links to libgap.

I think you are missing that the point is that the runtime package of Sage does not have to depend on the dev package of gap, only on the runtime package gap.

@orlitzky
Copy link
Contributor

orlitzky commented Mar 2, 2022

comment:13

Replying to @mkoeppe:

I think you are missing that the point is that the runtime package of Sage does not have to depend on the dev package of gap, only on the runtime package gap.

I'm not. I'm saying that the runtime package should include the symlink that makes libgap accessible via the expected name. That's all that's needed to fix this.

@mkoeppe
Copy link
Contributor

mkoeppe commented Mar 2, 2022

comment:14

Replying to @orlitzky:

While it's technically true that you can use gap itself without libgap.so because the symlink is followed at link-time and "libgap.so.0.0.0" is recorded into whatever you're linking...

What you are dismissing here as an implementation detail is really at the core of how binary distributions work: Multiple versions of a shared library can be present on a system, and binaries shipped by package link to the correct version of the shared library; there cannot be a notion of "current version" at runtime.

@orlitzky
Copy link
Contributor

orlitzky commented Mar 2, 2022

comment:15

Replying to @mkoeppe:

What you are dismissing here as an implementation detail is really at the core of how binary distributions work: Multiple versions of a shared library can be present on a system, and binaries shipped by package link to the correct version of the shared library; there cannot be a notion of "current version" at runtime.

That's... just how it works. It has nothing in particular to do with binary distributions, and is also why globbing for libgap.so* won't work.

If you want my real answer, we should stop dlopen()ing libraries and remove all hard-coded library paths from sage. Short of that you're always going to get either unreliable hacks or required rebuilds. In this case the sage package would have to be rebuilt if a new version of libgap comes with incompatible changes. The gentoo package format has a way to automate that and I'd guess that binary distro infrastructure does too. It kinda sucks but it's the simple and reliable way to work around our existing problems, unless someone wants to tackle all of the dlopen() insanity.

@orlitzky
Copy link
Contributor

orlitzky commented Mar 2, 2022

comment:16

I also see that the gap library is versioned upstream, so if the distro package is installing it as libgap.so.0.0.0 on purpose, that makes any theorizing about multiple versions extra pointless.

@tornaria
Copy link
Contributor Author

tornaria commented Mar 2, 2022

comment:17

In the end, the only "correct" way to proceed is to figure out the actual soname that the gap extension module in sagemath was linked with, namely:

$ objdump -p /usr/lib/python3.10/site-packages/sage/libs/gap/libgap.so | grep NEEDED.*libgap
  NEEDED               libgap.so.0

This is telling you that you must look for libgap.so.0, not for libgap.so since that (a) may not be installed (b) may point to a different version of the library.

The path to the extension module itself can be had with

sage: sage.libs.gap.libgap.__loader__.path
'/usr/lib/python3.10/site-packages/sage/libs/gap/libgap.so'

This should be portable, but objdump may not be available at runtime (part of binutils). Using ldd may be better:

$ ldd /usr/lib/python3.10/site-packages/sage/libs/gap/libgap.so | grep libgap
	libgap.so.0 => /usr/lib/libgap.so.0 (0x00007f964dd21000)

Of course it would be much better to just avoid using dlopen(), assuming the bug in gap leading to this ugly hack has been fixed upstream (see markuspf/gap#1 for the explanation).

@mkoeppe
Copy link
Contributor

mkoeppe commented Mar 2, 2022

comment:18

Replying to @tornaria:

the only "correct" way to proceed is to figure out the actual soname that the gap extension module in sagemath was linked with

+1

@tornaria
Copy link
Contributor Author

tornaria commented Mar 2, 2022

comment:19

Replying to @orlitzky:

I also see that the gap library is versioned upstream, so if the distro package is installing it as libgap.so.0.0.0 on purpose, that makes any theorizing about multiple versions extra pointless.

Upstream uses libgap.so.0 as the soname. In theory if they ever change the library in a backwards-incompatible way they should bump the soname. Let's say they release gap 5 using libgap.so.1 as the soname. Then I may have both libraries installed (libgap.so.0 and libgap.so.1) but at most one libgap.so symlink. And regardless of what this links points to (or whether it exists) the correct library to dlopen() is the same that is in the NEEDED section in the gap extension modules -- i.e. the version that sagemath was linked against.

I also believe that dlopen() does not need a full path, in fact you can use dlopen() with just the soname which I presume would search for the library in the same paths as the dynamic linker does.

@tornaria
Copy link
Contributor Author

tornaria commented Mar 2, 2022

comment:20

In fact one way to get the soname for a library may be using ctypes.util.find_library() but I'm not sure it would do the right thing when multiple libraries are available:

sage: import ctypes.util
sage: ctypes.util.find_library("gap")
'libgap.so.0.0.0'
sage: import sage.misc.compat
sage: sage.misc.compat.find_library("gap")
'libgap.so.0.0.0'

@kiwifb
Copy link
Member

kiwifb commented Mar 2, 2022

comment:21

That would still need to be somewhat parsed https://docs.python.org/3/library/ctypes.html#finding-shared-libraries - on linux it uses whatever is available to find an answer and will look into LD_LIBRARY_PATH, although the wording suggests it is done only if no results is returned without taking it into account, which seems dumb to me. But on OS X you get a fully qualified name as an answer, probably because installed_name values are paths themselves. I don't remember the code but surely we also do something on OS X?

@orlitzky
Copy link
Contributor

orlitzky commented Mar 2, 2022

comment:22

Replying to @tornaria:

I also believe that dlopen() does not need a full path,

On OSX it does. We ran into that problem with libSingular.

@orlitzky
Copy link
Contributor

orlitzky commented Mar 2, 2022

comment:23

Replying to @tornaria:

In fact one way to get the soname for a library may be using ctypes.util.find_library() but I'm not sure it would do the right thing when multiple libraries are available:

It doesn't, it chooses the most recent one.

@tornaria
Copy link
Contributor Author

comment:48

I rebased everything on 9.8.beta1.

I am still unsure what to do about gap_workspace_file() and whether the last commit is ok.

I think it would be better to use stuff from KERNEL_INFO() as mentioned in comment:40. The problem with that is that gap_workspace_file() is called on sage startup and initializing gap is slow. The solution could be to delay calling gap_workspace_file() until it is actually used (at which point gap or libgap will be initialized anyway).

For the short run, what is here seems better (with or without the last commit). Right now I still have to patch stuff for void linux since we don't ship /usr/lib/libgap.so with runtime packages (and it is a pain since everything will work ok if gap-devel happens to be installed, which is the case when building, so the failure tends to be missed).

@dimpase
Copy link
Member

dimpase commented Nov 1, 2022

comment:49

Could you rebase this on GAP 4.12 - the update has been positively reviewed in #34391 ?

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 12, 2022

Branch pushed to git repo; I updated commit sha1. This was a forced push. Last 10 new commits:

72d57e0don't use SAGE_LOCAL/share/gap, switch to GAP's "make install"
16c2ccbSearch for files in both lib/gap and share/gap
3abfbd0Create lib/gap/bin on install
73eaa6agap: do not directly dlopen() the gap library
8a6cfc4singular: do not directly dlopen() the singular library
7c0f667singular: remove LIBSINGULAR_PATH, no longer needed
fb52609Revert "src/sage/interfaces/gap_workspace.py: Use hash of GAP_SO to disambiguate the workspace file, not SAGE_LOCAL"
f1d189cgap: remove GAP_SO, no longer needed
46e48d6gap_workspace_file: include hostname and gap version
1e5dd37gap_workspace_file: use GAP_LIB_DIR and GAP_SHARE_DIR

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 12, 2022

Changed commit from 31e3fc4 to 1e5dd37

@tornaria
Copy link
Contributor Author

comment:51

Rebased on top of #34391. I added a commit so sysinfo.gap is searched both in GAP_LIB_DIR and in GAP_SHARE_DIR. Also: use the content of those variables instead of SAGE_LOCAL.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Dec 5, 2022

Branch pushed to git repo; I updated commit sha1. This was a forced push. Last 10 new commits:

93c1becensure make gap-clean still works
71c9fe3with SAGE_CHECK=yes, one needs io installed to prevent hangs
29383bebackport GAP PR 5235
b2a9b2egap: do not directly dlopen() the gap library
564c5casingular: do not directly dlopen() the singular library
6e1abd3singular: remove LIBSINGULAR_PATH, no longer needed
ec2476bRevert "src/sage/interfaces/gap_workspace.py: Use hash of GAP_SO to disambiguate the workspace file, not SAGE_LOCAL"
eec410bgap: remove GAP_SO, no longer needed
4768b99gap_workspace_file: include hostname and gap version
126b0degap_workspace_file: use GAP_LIB_DIR and GAP_SHARE_DIR

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Dec 5, 2022

Changed commit from 1e5dd37 to 126b0de

@dimpase
Copy link
Member

dimpase commented Dec 7, 2022

Dependencies: #34391

@dimpase
Copy link
Member

dimpase commented Dec 7, 2022

comment:55

let's get #34391 in first - it's a bit hard to review now.

@tornaria
Copy link
Contributor Author

tornaria commented Dec 8, 2022

comment:56

Replying to Dima Pasechnik:

let's get #34391 in first - it's a bit hard to review now.

Sure, but just for the record: the head of #34391 is

  • ​29383be backport GAP PR 5235
    and my branch here is linearly rebased on top of that.

I could squash some of the commits together if that makes it easier to review (ec2476b + 4768b99 + 126b0de are part of the same change, in a sense). That would leave 5 clearly defined commits.

@dimpase
Copy link
Member

dimpase commented Dec 9, 2022

Reviewer: Dima Pasechnik

@dimpase
Copy link
Member

dimpase commented Dec 9, 2022

comment:57

looks good - also tested on macOS

@vbraun
Copy link
Member

vbraun commented Feb 12, 2023

The dependency hasn't been merged yet, so please create a github PR for this ticket

dimpase pushed a commit to dimpase/sage that referenced this issue Feb 12, 2023
This needs the soname (as in sage.env.GAP_SO) which has issues for
system gap as explained in sagemath#33446.

Instead we dlopen() the extension module sage.libs.gap.util which,
having a link time dependency to libgap, will indirectly dlopen() it.

For the record: by the time we run dlopen() the libgap should be already
loaded. The purpose of doing it is to change mode to RTLD_GLOBAL so that
symbols in libgap are placed in the global symbol table. This is
required to compiled load gap packages.

An easy test that this is working ok is:

    sage: libgap.LoadPackage("io")
    true

This requires optional spkg `gap_packages` to be installed.
dimpase pushed a commit to dimpase/sage that referenced this issue Feb 12, 2023
…isambiguate the workspace file, not SAGE_LOCAL"

This reverts commit a801e6d.

See sagemath#33446.
dimpase pushed a commit to dimpase/sage that referenced this issue Feb 12, 2023
This needs the soname (as in sage.env.GAP_SO) which has issues for
system gap as explained in sagemath#33446.

Instead we dlopen() the extension module sage.libs.gap.util which,
having a link time dependency to libgap, will indirectly dlopen() it.

For the record: by the time we run dlopen() the libgap should be already
loaded. The purpose of doing it is to change mode to RTLD_GLOBAL so that
symbols in libgap are placed in the global symbol table. This is
required to compiled load gap packages.

An easy test that this is working ok is:

    sage: libgap.LoadPackage("io")
    true

This requires optional spkg `gap_packages` to be installed.
dimpase pushed a commit to dimpase/sage that referenced this issue Feb 12, 2023
…isambiguate the workspace file, not SAGE_LOCAL"

This reverts commit a801e6d.

See sagemath#33446.
tornaria added a commit to tornaria/sage that referenced this issue Feb 20, 2023
This needs the soname (as in sage.env.GAP_SO) which has issues for
system gap as explained in sagemath#33446.

Instead we dlopen() the extension module sage.libs.gap.util which,
having a link time dependency to libgap, will indirectly dlopen() it.

For the record: by the time we run dlopen() the libgap should be already
loaded. The purpose of doing it is to change mode to RTLD_GLOBAL so that
symbols in libgap are placed in the global symbol table. This is
required to compiled load gap packages.

An easy test that this is working ok is:

    sage: libgap.LoadPackage("io")
    true

This requires optional spkg `gap_packages` to be installed.
tornaria added a commit to tornaria/sage that referenced this issue Feb 20, 2023
…isambiguate the workspace file, not SAGE_LOCAL"

This reverts commit a801e6d.

See sagemath#33446.
@vbraun vbraun closed this as completed in dc933bd Mar 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants