faulty r-reticulate conda environment #20

Closed
igordot opened this issue Feb 23, 2022 · 10 comments

Comments

@igordot

igordot commented Feb 23, 2022

There is an odd behavior that I encountered. If I try to use leiden as the first reticulate package, a duplicated conda environment ends up getting created. When I call library(leiden), everything seems normal.

conda environment r-reticulate installed
...
python modules igraph and leidenalg installed

Then, when I call reticulate::conda_list():

          name
1         base
2 r-reticulate
                                                               python
1                   /usr/share/miniconda/envs/r-reticulate/bin/python
2 /usr/share/miniconda/envs/r-reticulate/envs/r-reticulate/bin/python

There are actually two r-reticulate environments detected, one nested inside the other, and one of them is labeled base. However, neither is complete, and calling reticulate::py_module_available("leidenalg") returns FALSE.

This is the output of conda env list outside R:

# conda environments:
#
base                  *  /usr/share/miniconda
r-reticulate             /usr/share/miniconda/envs/r-reticulate
                         /usr/share/miniconda/envs/r-reticulate/envs/r-reticulate
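
For anyone who wants to check for the same symptom, here is a minimal base-R sketch that flags environment prefixes sitting inside another environment's prefix. The helper name find_nested_envs is hypothetical (it is not part of reticulate or leiden), and it assumes Unix-style paths ending in /bin/python, like the ones in the conda_list() output above.

```r
# Hypothetical helper (not part of reticulate or leiden): return the
# environment prefixes that are nested inside another environment.
# Assumes Unix-style interpreter paths ending in "/bin/python".
find_nested_envs <- function(python_paths) {
  # Drop the trailing "/bin/python" to recover each environment prefix.
  prefixes <- sub("/bin/python$", "", python_paths)
  nested <- vapply(prefixes, function(p) {
    others <- setdiff(prefixes, p)
    # A prefix is nested if it starts with "<other prefix>/envs/".
    any(startsWith(p, paste0(others, "/envs/")))
  }, logical(1), USE.NAMES = FALSE)
  prefixes[nested]
}

# The two interpreter paths reported in this issue.
paths <- c(
  "/usr/share/miniconda/envs/r-reticulate/bin/python",
  "/usr/share/miniconda/envs/r-reticulate/envs/r-reticulate/bin/python"
)
nested <- find_nested_envs(paths)
print(nested)
```

In a live session the input would presumably be reticulate::conda_list()$python; an empty result means no nested environments were found.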

I was able to recreate the behavior using GitHub Actions. You can check the output here (technically, there are no errors, so it's considered a successful run): https://github.com/igordot/leiden/runs/5296171878?check_suite_focus=true

If I set up the r-reticulate environment before using leiden, everything seems fine. That might be why most people wouldn't encounter this. It seems something goes wrong when the package tries to set up the environment itself.

@TomKellyGenetics
Owner

Thank you for reporting this. This is unexpected. Of course I already had a conda environment installed on my machine before adding this feature to later versions.

Note that the .onAttach function expects a conda environment named "r-reticulate" to be installed and attempts to install missing dependencies there.

.onAttach <- function(libname, pkgname) {
    if(!reticulate::py_available()){
        tryCatch({
            if(!("r-reticulate" %in% reticulate::conda_list()$name)){

As you've noted this is for convenience only and the package can be used if a conda environment is already set up.
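
As a rough illustration of the kind of check described above (not the package's actual code), the sketch below creates the environment only when no environment of that name is already listed. The names ensure_conda_env, list_envs, and create_env are hypothetical. The defaults call reticulate::conda_list() and reticulate::conda_create(), but they are passed in as functions so the logic can be tried without touching a real conda installation.

```r
# Hypothetical sketch, not the leiden package's implementation.
# Creates `env_name` only if it is not already among the listed environments.
# Returns TRUE if an environment was created, FALSE otherwise.
ensure_conda_env <- function(
    env_name = "r-reticulate",
    # Defaults use reticulate; they are only evaluated if actually called.
    list_envs = function() reticulate::conda_list()$name,
    create_env = function(name) reticulate::conda_create(name)) {
  if (env_name %in% list_envs()) {
    return(FALSE)
  }
  create_env(env_name)
  TRUE
}

# Dry run with stand-in functions instead of a real conda installation.
dry_run_create <- function(name) message("would create: ", name)
ensure_conda_env("r-reticulate", function() c("base"), dry_run_create)
```

A name-based check like this would not by itself catch the nested-prefix case reported here, since both entries can share a name or be listed under base.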

This subroutine appears not to work because of different naming schemes between reticulate in R and conda on the command line. I will investigate this and patch it in a future release. Since some install steps are OS-specific, are you able to share details on which system you encountered the error?

Thanks for testing it on GH Actions, that is really helpful.

@TomKellyGenetics
Owner

Perhaps further complicating the issue, my system (macOS) has multiple environments with the same name and no "base" environment.

> reticulate::conda_list()
           name                                                      python
2         conda        /Users/tom/Library/r-miniconda/envs/conda/bin/python
7  r-reticulate /Users/tom/Library/r-miniconda/envs/r-reticulate/bin/python
11 r-reticulate          /Users/tom/miniconda3/envs/r-reticulate/bin/python
conda env list
# conda environments:
base                  *  /Users/tom/Library/r-miniconda
conda                    /Users/tom/Library/r-miniconda/envs/conda
r-reticulate             /Users/tom/Library/r-miniconda/envs/r-reticulate
                         /Users/tom/miniconda3/envs/r-reticulate

You're right, reticulate seems to be doing something strange here so the .onLoad or .onAttach functions may not be working correctly.

@TomKellyGenetics
Owner

I've looked into this further and found that the default conda environment for reticulate seems to have been updated from "r-reticulate" to "r-miniconda". Which version of reticulate are you using?

It seems to work without duplicate environments on the new version using "r-miniconda" as a base image. So potential solutions are:

  • update the .onLoad function to use "r-miniconda" if available
  • update the DESCRIPTION to specify a minimum version of reticulate as a dependency to ensure the new environment is used
  • update the documentation describing the installation process recommending pre-configured conda environments

@igordot
Author

igordot commented Apr 25, 2022

I was using reticulate 1.24 (the latest). I assume that's the case for GitHub Actions as well.

It looks like it's still using "r-reticulate" by default, though:
https://github.com/rstudio/reticulate/blob/256e3bc440a78a8f289b214ca715e65ce6b1e43d/R/miniconda.R#L66

@TomKellyGenetics
Owner

Thanks for the quick response. Sorry it's taken a long time to get back to you. Other updates to the package are necessary so I am considering whether it is possible to fix this as well.

@igordot
Author

igordot commented Apr 25, 2022

No problem. This is a relatively obscure error, so shouldn't be high priority. It seemed like it may have a simple fix.

TomKellyGenetics added a commit that referenced this issue Apr 25, 2022
… by reticulate in R

resolves conflicting environment names in conda and reticulate #20
TomKellyGenetics added a commit that referenced this issue Apr 25, 2022
if it is used by reticulate in R
resolves conflicting environment names in conda and reticulate #20
@TomKellyGenetics
Owner

Thanks for the feedback. I agree it's better to have it fixed, but it is more complicated than I originally expected. I tried to reproduce the error on a new machine where reticulate had not been installed before and encountered this:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
conda environment r-reticulate installed

CondaValueError: The target prefix is the base prefix. Aborting.

Unable to install python modules igraph and leidenalg
run in terminal:
conda install -n r-reticulate -c conda-forge vtraag python-igraph pandas umap learn
python modules igraph and leidenalg installed

I think I have a solution, so I am testing an install from this branch (the dev branch is a major update, #1):
https://github.com/TomKellyGenetics/leiden/tree/test-conda-setup

devtools::install_github("TomKellyGenetics/leiden", ref = "test-conda-setup")

I am able to install this version without errors on a new system.

@TomKellyGenetics
Owner

Leiden v0.3.10, which should resolve this, has been submitted to CRAN.

@igordot
Author

igordot commented Apr 26, 2022

That was fast!

@TomKellyGenetics
Owner

The CRAN testing system has Python installed, so I had some issues with checks invoking the install, which writes to disk to create a conda environment.

I've modified it so that it prompts the user for consent before setting up a conda environment if one is not available, and it only runs in interactive sessions (not in Rscript calls or R CMD check). This version was resubmitted to CRAN and accepted today. It is now available to install from source, and Windows/Mac binaries are building now.
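
The consent-and-interactive gating described above could look roughly like the following. This is a minimal sketch with hypothetical function names (should_offer_setup, confirm_setup), not the code shipped in the package; it uses only base R's interactive() and readline().

```r
# Hypothetical sketch, not the leiden package's implementation.
# Offer setup only when the environment is missing AND the session is
# interactive (so Rscript calls and R CMD check never trigger an install).
should_offer_setup <- function(env_exists, is_interactive = interactive()) {
  !env_exists && is_interactive
}

# Ask for consent; anything other than "y"/"yes" counts as a refusal.
# `ask` is injectable so the prompt can be exercised without a terminal.
confirm_setup <- function(ask = function(prompt) readline(prompt)) {
  answer <- ask("Set up a conda environment 'r-reticulate'? [y/N] ")
  tolower(trimws(answer)) %in% c("y", "yes")
}

# Example: environment missing, session treated as non-interactive.
should_offer_setup(env_exists = FALSE, is_interactive = FALSE)  # FALSE
```

Defaulting to "no" on an empty answer keeps the behavior conservative: nothing is written to disk unless the user explicitly agrees.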

TomKellyGenetics added a commit that referenced this issue May 9, 2022
… by reticulate in R

resolves conflicting environment names in conda and reticulate #20
TomKellyGenetics added a commit that referenced this issue May 9, 2022
if it is used by reticulate in R
resolves conflicting environment names in conda and reticulate #20