Installing PYG with conda is very buggy #4386

andrei-rusu · 2022-03-31T13:08:23Z

😵 Describe the installation problem

This is a report concerning an environment built from scratch from an environment.yml file.
The first thing I tried was installing a PyTorch 1.11 environment, which ran into the following CUDA bug: rusty1s/pytorch_scatter#248.
My second try was with a PyTorch 1.10 environment, which resulted in the following error: #3593. I was able to work around this by uninstalling torch_spline_conv, but that was only a temporary fix. As soon as I tried installing a package with CUDA, pyg==2.0.4 reported a conflict that nothing I tried could resolve. The conda resolver tried in vain to fix this, reporting countless versioning issues related to PYG, including the GLIBC version problem again (which IPython apparently also has, although it stays silent until the full conda check is performed). It is worth mentioning that this is a remote server, so I cannot change the GLIBC version anyway.

This is now getting frustrating, and I'd hate to have to resort to reinstalling the environment from scratch every time I need a new package. Something is clearly broken with the PYG versioning on conda, since uninstalling pyg fixed everything...
Below is the environment.yml in question:

name: graph
channels:
  - pytorch
  - pyg
  - conda-forge
  - defaults
dependencies:
  - python=3.9
  - conda
  - pytorch=1.10.1
  - torchvision==0.11.2
  - cudatoolkit=11.3
  - pyg
  - networkx
  - numpy
  - matplotlib
  - plotly
  - pandas
  - tqdm
  - dill
  - scikit-learn
  - jupyterlab
  - torchvision
  - pytorch-lightning
  - neptune-client
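One detail visible in the file above is that torchvision is listed twice, once pinned to 0.11.2 and once unpinned. Whether this contributes to the reported conflicts is not established in this thread. As a minimal sketch, duplicate entries like this can be spotted with a few lines of Python; the helper names `package_name` and `find_duplicates` are hypothetical, and the list is an abridged copy of the dependencies above.

```python
import re
from collections import defaultdict

# Abridged copy of the dependency entries from the environment.yml above.
DEPENDENCIES = [
    "python=3.9",
    "pytorch=1.10.1",
    "torchvision==0.11.2",
    "cudatoolkit=11.3",
    "pyg",
    "torchvision",
]


def package_name(spec: str) -> str:
    """Return the package name from a conda spec such as 'pytorch=1.10.1'."""
    # Split at the first version operator or space; keep only the name part.
    return re.split(r"[=<>!~ ]", spec.strip(), maxsplit=1)[0].lower()


def find_duplicates(specs):
    """Map each package name that is listed more than once to all its specs."""
    seen = defaultdict(list)
    for spec in specs:
        seen[package_name(spec)].append(spec)
    return {name: entries for name, entries in seen.items() if len(entries) > 1}


print(find_duplicates(DEPENDENCIES))
```

This only inspects the text of the specs; it does not ask conda whether the entries are actually compatible.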

Environment

PyG version: 2.0.4
PyTorch version: 1.10
OS: Red Hat Enterprise Linux 7
Python version: 3.9.12
CUDA/cuDNN version: 11.3
How you installed PyTorch and PyG (conda, pip, source): conda
Any other relevant information (e.g., version of torch-scatter):


NucciTheBoss · 2022-03-31T14:52:12Z

Hmm. Maybe an update to the conda package is in order.

A GLIBC version error typically occurs when you try to run an executable compiled for a newer Linux-based OS (e.g. RHEL8) on an older one (e.g. RHEL7). I have encountered the GLIBC version error a few times with certain conda packages on my RHEL7 cluster. It is a drawback of conda using precompiled packages. You usually only have two ways around this:

Install from source.
Use the conda package inside a container whose base image comes with the necessary GLIBC version.

Since installing from source is not the preferable option, do you have Singularity installed on your remote server?
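To see which GLIBC version a machine actually provides, a minimal Python sketch like the one below may help. It uses the standard library's `platform.libc_ver()`; the `REQUIRED` value is a hypothetical placeholder for illustration, since the exact version the precompiled packages need is not stated in this thread, and the helper names are made up for this example.

```python
import platform
import re


def parse_version(text: str) -> tuple:
    """Turn a string such as '2.17' into (2, 17) for numeric comparison."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def glibc_is_at_least(required: str, installed: str) -> bool:
    """Return True if the installed version satisfies the required one."""
    return parse_version(installed) >= parse_version(required)


# libc_ver() returns a (library, version) pair; both are empty strings
# when the C library cannot be determined (e.g. on non-glibc systems).
lib, version = platform.libc_ver()
print(f"C library: {lib or 'unknown'} {version or 'unknown'}")

# Hypothetical requirement, for illustration only.
REQUIRED = "2.27"
if lib == "glibc" and version:
    ok = glibc_is_at_least(REQUIRED, version)
    print(f"glibc >= {REQUIRED}: {ok}")
```

Versions are compared as integer tuples rather than strings, because a plain string comparison would wrongly rank "2.9" above "2.17".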

andrei-rusu · 2022-03-31T16:21:58Z

Thanks for the reply! The server uses Slurm, and browsing through the modules I found one that does have Singularity, but I have never used it, so I am not sure how it would help. I tried searching for a module that comes preloaded with another GLIBC version, but no luck there (I found 2021 versions of GCC, but no newer GLIBC...).

NucciTheBoss · 2022-03-31T19:09:27Z

Yeah... you won't find a module that updates the GLIBC version on your remote server. If the Linux kernel is the brain of a computer, GLIBC is the spine. Swapping out versions of GLIBC on a Linux system can break the entire system, since many applications rely on it. Here's a graphic from Wikipedia that shows how it plays into the Linux system:

Singularity would help by containerizing PyG so that it can still run on your system even though it has an older GLIBC version. A recent pull request of mine (#4376) fixed issues with building the Singularity image; however, I am still working on updating the image to a newer version of PyG. If you have sudo privileges on a Linux system, you can build the image yourself:

git clone https://github.com/pyg-team/pytorch_geometric.git
sudo singularity build pyg.sif pytorch_geometric/docker/singularity

This would be your best workaround while the conda package and container recipes are updated.

andrei-rusu · 2022-03-31T21:02:13Z

I see, thank you! Unfortunately I do not have sudo there, so I think I'll stick to using the "dirty" pip installs for now. But I'll keep this issue open, as it would be nice to have conda behave nicely on remote servers (a lot of which unfortunately use an outdated GLIBC).

rusty1s · 2022-04-01T11:28:38Z

As far as I see, a simple fix would be to exclude the pytorch-spline-conv dependency from pyg. Would that work for you?

andrei-rusu · 2022-04-01T12:21:45Z

Yes, maybe that would work. I did not fully understand the pyg conflicts that conda was complaining about: the environment was set up successfully via the environment.yml, and the conflicts only appeared after I uninstalled pytorch-spline-conv (a step I took in order to be able to import torch_geometric.nn without the GLIBC error). Maybe excluding that dependency altogether would clear up the reported conflicts. I suspect my questionable pip uninstall was enough to run my code, but not enough for conda to stop complaining.

rusty1s · 2022-04-02T18:53:02Z

This is fixed in #4400.

rperera12 · 2022-04-29T23:59:58Z

Hi,
I am facing the same issue on a Slurm cluster which has GLIBC 2.17.
I am able to import torch_geometric with CUDA without any problem, until I import MessagePassing (from torch_geometric.nn.conv import MessagePassing), where I get the GLIBC error.
This might be a silly question, but is there a way to import MessagePassing without needing pytorch-spline-conv?

rusty1s · 2022-04-30T00:16:01Z

torch-spline-conv is an optional dependency. If you do not need this operator, you can simply choose not to install it.

rperera12 · 2022-04-30T00:18:54Z

Thank you!
I am using one of the PyG models that needs MessagePassing,
I am currently importing it using:

from torch_geometric.nn.conv import MessagePassing

This line is the one that causes the GLIBC error, since it calls torch-spline-conv at some point.
Is there a way to import MessagePassing without needing pytorch-spline-conv?

rusty1s · 2022-05-02T07:03:48Z

Can you try to run

pip uninstall torch-spline-conv
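Before and after running the uninstall, it can be useful to check which of the optional compiled packages are present without importing them, since the import itself is what triggers the GLIBC error. The following is a minimal sketch using the standard library's `importlib.util.find_spec`; the helper name `installed_extensions` is hypothetical, and the list only covers the two packages named in this thread.

```python
import importlib.util

# Import names of the compiled companion packages mentioned in this thread.
OPTIONAL_EXTENSIONS = ["torch_scatter", "torch_spline_conv"]


def installed_extensions(names):
    """Report which top-level modules can be located, without importing them."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


for name, present in installed_extensions(OPTIONAL_EXTENSIONS).items():
    print(f"{name}: {'installed' if present else 'not installed'}")
```

If torch_spline_conv is reported as not installed, the GLIBC error on import should no longer come from that package.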

rperera12 · 2022-05-02T13:57:27Z

Perfect!
This did the trick, I can now use MessagePassing on the Slurm cluster without any issues.
Thank you so much!

andrei-rusu added the installation label Mar 31, 2022

rusty1s mentioned this issue Apr 2, 2022

Optional pytorch-spline-conv in conda #4400

Merged

rusty1s closed this as completed Apr 2, 2022
