docs: update interactive environments documentation

fix #1647
SwissDataScienceCenter · Nov 25, 2020 · 6a0c028
1 parent 9971e6c
commit 6a0c028

Showing 4 changed files with 143 additions and 69 deletions.
diff --git a/docs/spelling_wordlist.txt b/docs/spelling_wordlist.txt
@@ -149,6 +149,7 @@ Ubuntu
 ui
 untracked
 untracked
+url
 username
 versioned
 versioning

diff --git a/docs/user/interactive_basics.rst b/docs/user/interactive_basics.rst
@@ -15,40 +15,38 @@ of code before combining everything into a (reproducible) workflow.
 You can run JupyterLab or RStudio within a project independently from RenkuLab,
 but RenkuLab offers the following advantages:
 
-* environments hosted in the cloud with a configurable amount of resources
-  (memory, CPU, and sometimes GPU)
+* Environments hosted in the cloud with a configurable amount of resources
+  (memory, CPU, and sometimes GPU).
 
-* environments are defined using Docker, so they can be shared and reproducibly
-  re-created
+* Environments are defined using Docker, so they can be shared and reproducibly re-created.
 
-* auto-saving of work back to RenkuLab, so you can recover in the event of a
-  crash
+* Auto-saving of work back to RenkuLab, so you can recover in the event of a crash.
 
-* a git client pre-configured with your credentials to easily push your changes
-  back to the server
+* A git client pre-configured with your credentials to easily push your changes
+  back to the server.
 
-* the functionality provided by the renku-python_ command-line interface (CLI)
-  is automatically available
+* The functionality provided by the renku-python_ command-line interface (CLI)
+  is automatically available.
 
 
 What's in my Interactive Environment?
 -------------------------------------
 
-* your project, which is cloned into the environment on startup
+* Your project, which is cloned into the environment on startup.
 
-* your data (if the option ``Automatically fetch LFS data`` is selected)
-  files that are stored in git LFS*)
+* Your data files that are stored in git LFS (if the option
+  ``Automatically fetch LFS data`` is selected).
 
-* all the software required to launch the environment and common tools for
-  working with code (``git``, ``git LFS``, ``vim``, etc.)
+* All the software required to launch the environment and common tools for
+  working with code (``git``, ``git LFS``, ``vim``, etc.).
 
-* any dependencies you specified via conda (requirements.txt), using
+* Any dependencies you specified via conda (``environment.yml``), using
   language-specific dependency-management facilities (``requirements.txt``,
-  ``install.R``, etc.) or installed in the ``Dockerfile``
+  ``install.R``, etc.) or installed in the ``Dockerfile``.
 
-* the renku command-line interface renku-python_.
+* The renku command-line interface renku-python_.
 
-* the amount of CPUs, memory, and (possibly) GPUs that you configured before launch
+* The CPUs, memory, and (possibly) GPUs that you configured before launch.
 
 For adding or changing software installed into your project's interactive environment,
 check out :ref:`customizing`
@@ -78,54 +76,58 @@ configuration options.
 +------------------------------+-------------------------------------------------------------------------------------------+
 | Option                       | Description                                                                               |
 +==============================+===========================================================================================+
-| branch                       | default master, but if you're doing work on another branch, switch!                       |
+| Branch                       | Default is ``master``. You can switch if you are working on another branch.               |
 +------------------------------+-------------------------------------------------------------------------------------------+
-| commit                       | default latest, but you can launch the environment from an earlier commit;                |
-|                              |                                                                                           |
-|                              | also useful if your latest commit's build failed (see below).                             |
+| Commit                       | Default is the latest, but you can launch the environment from an earlier commit. This is |
+|                              | especially useful if your latest commit's build failed (see below) or you have unsaved    |
+|                              | work that was automatically recovered.                                                    |
 +------------------------------+-------------------------------------------------------------------------------------------+
-| environment                  | ``lab``: JupyterLab; ``rstudio``: RStudio; if you're using a python template,             |
-|                              |                                                                                           |
-|                              | the ``rstudio`` endpoint will not work.                                                   |
+| Default Image                | This provides information about the Docker image used by the Interactive Environment.     |
+|                              | When it fails, you can try to rebuild it, or you can check the GitLab job logs.           |
+|                              | An image can also be pinned so that new commits will not require a new image              |
+|                              | each time.                                                                                |
 +------------------------------+-------------------------------------------------------------------------------------------+
-| # CPUs                       | the number of CPUs available; resources are shared, so please select the lowest amount    |
-|                              | that will work for your use case.                                                         |
+| Default environment          | Default is ``/lab``, which loads the JupyterLab interface. If you are working with ``R``, |
+|                              | you may want to use ``/rstudio`` for RStudio. Note that the corresponding packages need   |
+|                              | to be installed in the image. If you're using a Python template, the ``rstudio`` endpoint |
+|                              | will not work.                                                                            |
 +------------------------------+-------------------------------------------------------------------------------------------+
-| memory                       | the amount of RAM available; resources are shared, so please select the lowest amount     |
-|                              | that will work for your use case.                                                         |
+| Number of CPUs               | The number of CPUs available, or the quota. Resources are shared, so please select the    |
+|                              | lowest amount that will work for your use case. Usually, the default value works well.    |
 +------------------------------+-------------------------------------------------------------------------------------------+
-| # GPUs                       | the number of GPUs available; You might have to wait for GPUs to free up in               |
-|                              |                                                                                           |
-|                              | order to be able to launch an environment.                                                |
+| Amount of Memory             | The amount of RAM available. Resources are shared, so please select the lowest amount     |
+|                              | that will work for your use case. Usually, the default value works well.                  |
 +------------------------------+-------------------------------------------------------------------------------------------+
-| Automatically fetch LFS data | Leave off by default. If you find that workflows                                          |
-|                              | you used to be able to run have stopped working,                                          |
-|                              | check the contents of the file(s) -- if plain text and contains                           |
-|                              | strings that are not your data, run ``renku storage pull <filepath>``                     |
-|                              | to get the relevant files, or ``git lfs pull`` to get all of the                          |
-|                              | files at once.                                                                            |
+| Number of GPUs               | The number of GPUs available. If you can't select any number, no GPUs are available in    |
+|                              | the RenkuLab deployment you are using. If you request any, you might need to wait for     |
+|                              | GPUs to free up in order to be able to launch an environment.                             |
++------------------------------+-------------------------------------------------------------------------------------------+
+| Automatically fetch LFS data | Default is off. If turned on, all the LFS data will be fetched automatically. This is     |
+|                              | convenient, but it may considerably slow down the start time if the project contains a    |
+|                              | lot of data. Refer to :ref:`Data in Renku <data>` for further information.                |
 +------------------------------+-------------------------------------------------------------------------------------------+
 
 
 What if the Docker image is not available?
 ------------------------------------------
 
 Interactive environments are backed by Docker images. When launching a new
-interactive environment a container is created from the image that matches the
+interactive environment, a container is created from the image that matches the
 selected ``branch`` and ``commit``.
 
 A GitLab's CI/CD pipeline automatically builds a new image using the project's
 ``Dockerfile`` when any of the following happens:
 
-  * creating of a project
-  * forking a project (in which the new build happens for the fork)
-  * pushing changes to the project
+  * Creating a project.
+  * Forking a project (in which case the new build happens for the fork).
+  * Pushing changes to the project.
 
-(This is defined in the project's ``.gitlab-ci.yml`` file.)
+This is defined in the project's :ref:`.gitlab-ci.yml file <gitlab_ci_yml>`. If the project
+references a pinned image, the UI will not check whether the image is available, because a
+pinned image is usually provided by the project's maintainer and does not change with every
+new commit.
 
-It can sometimes take some time to build an image for various reasons, but if
-you've just created the project on RenkuLab from one of the templates it should
-take less than  a minute.
+It may take a long time to build an image for various reasons, but if you've just created the
+project on RenkuLab from one of the templates, it generally takes less than a minute or two.
 
 
 The Docker image is still building
@@ -144,31 +146,34 @@ The Docker image build failed
 If this happens, it's best to click the link to view the logs on GitLab so you
 can see what happened. Here are some common reasons for build failure:
 
-* Software installation failure
+Software installation failure
+*****************************
 
-**problem** You added a new software library to ``requirements.txt``, ``environment.yml``,
+**Problem:** You added a new software library to ``requirements.txt``, ``environment.yml``,
 or ``install.R``, but something was wrong with the installation (e.g. typo in
 the name, extra dependencies required for the library but unavailable).
 
-**how to fix this**
+**How to fix this:**
 You can use the GitLab editor or clone your project locally to fix the installation,
 possibly by adding the extra dependencies it asks for into the ``Dockerfile``
 (the commented out section in the file explains how to do this). As an alternative,
 you can start an interactive environment from an earlier commit.
 
-**how to avoid this** First try installing into your running interactive environment,
+**How to avoid this:** First try installing into your running interactive environment,
 e.g. by running ``pip install -r requirements.txt`` in the terminal on JupyterLab.
 You might not have needed to install extra dependencies when installing on your
 local machine, but the operating system (OS) defined in the ``Dockerfile`` has
 minimal dependencies to keep it lightweight.
 
-* The build timed out
+The build timed out
+*******************
 
 By default, image builds are configured to time out after an hour. If your build
 takes longer than that, you might want to check out the section on :ref:`customizing`
 interactive environments before increasing the timeout.
 
-* Your project could not be cloned
+Your project could not be cloned
+********************************
 
 If you accidentally added 100s of MBs or GBs of data to your repo and didn't
 specify that it should be stored in git LFS, it might take too long to clone. In

diff --git a/docs/user/interactive_customizing.rst b/docs/user/interactive_customizing.rst
@@ -3,9 +3,9 @@
 Customizing interactive environments
 ====================================
 
-Very quickly you will want to make changes to the default configuration of your
-interactive sessions. The default environments we provide are pretty bare-bones
-so if you want to have easy access to your preferred packages, some simple steps
+Very soon, you will want to make changes to the default configuration of your
+interactive sessions. The default environments we provide are pretty bare-bones.
+If you want to have easy access to your preferred packages, some simple steps
 at the start of your project will get you on the way quickly.
 
 
@@ -14,14 +14,18 @@ Important files
 
 The launch is enabled by the content in the following files in your project:
 
-* language-specific files like ``requirements.txt`` or ``install.R``
-
 * ``Dockerfile``: defines the type of interactive environment and other software
   installed in the environment, including the ``renku`` command-line installation.
 
 * ``.gitlab-ci.yml``: controls the docker build of the image based on the project's
   ``Dockerfile``.
 
+* ``requirements.txt`` or ``install.R``: language-specific files listing the
+  libraries to install.
+
+* ``.renku/renku.ini``: renku project configurations containing a
+  ``[renku "interactive"]`` section.
+
 The most basic modifications are installations of additional packages. This can be
 done automatically for Python and R projects if you add the packages you want
 to ``requirements.txt`` and ``install.R`` respectively.
@@ -31,26 +35,32 @@ Dockerfile structure
 --------------------
 
 The project's ``Dockerfile`` lives in the top level of the project directory. In
-the default ``Dockerfile`` provided in the template, the first line is a ``FROM``
-statement that specifies a `versioned base docker image <https://github.com/SwissDataScienceCenter/renku-jupyter>`_.
+the default ``Dockerfile`` provided in the template, the first line is a
+``RENKU_BASE_IMAGE`` argument used to feed the following ``FROM`` instruction.
+It specifies a
+`versioned base docker image <https://github.com/SwissDataScienceCenter/renku-jupyter>`_.
 We add new versions periodically, but the heart of it is the set of installations
 of jupyterlab/rstudio, git, and renku::
 
-  FROM renku/singleuser:0.3.5-renku0.5.2
+  ARG RENKU_BASE_IMAGE=renku/renkulab-py:3.7-0.7.3
 
-  # or, for RStudio in the build
+  # or, for RStudio
 
-  FROM renku/singleuser-r:0.3.5-renku0.5.2
+  ARG RENKU_BASE_IMAGE=renku/renkulab-r:4.0.0-0.7.3
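+
+As a sketch of how this argument is consumed, the ``ARG`` line is typically
+followed by a ``FROM`` instruction that references it. The exact lines in your
+template may differ, so treat the snippet below as an assumption and check it
+against your own ``Dockerfile``::
+
+  # default base image, can be overridden at build time
+  ARG RENKU_BASE_IMAGE=renku/renkulab-py:3.7-0.7.3
+  # the build starts from whatever image the argument resolves to
+  FROM ${RENKU_BASE_IMAGE}
+
+With this layout, a local build could select a different base image without
+editing the file, e.g.
+``docker build --build-arg RENKU_BASE_IMAGE=renku/renkulab-r:4.0.0-0.7.3 .``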
 
 The next two statements install user-specified libraries from ``environment.yml``
 and ``requirements.txt``::
 
   # install the python dependencies
   COPY requirements.txt environment.yml /tmp/
   RUN conda env update -q -f /tmp/environment.yml && \
-  /opt/conda/bin/pip install -r /tmp/requirements.txt && \
-  conda clean -y --all && \
-  conda env export -n "root"
+    /opt/conda/bin/pip install -r /tmp/requirements.txt && \
+    conda clean -y --all && \
+    conda env export -n "root"
+
+Then we specify the renku version to be installed through ``pipx``::
+
+  ARG RENKU_VERSION=0.12.1
 
 You can add to this ``Dockerfile`` in any way you'd like.
 
@@ -62,7 +72,7 @@ Dockerfile development
 Before we get into modifying Dockerfiles, if you want to know how to update
 the base version of your renkulab image, see `Upgrading Renku <upgrading_renku>`_.
 
-If you're going to be making simple modifications to the ``Dockerfile`` (i.e. changing
+If you're going to make simple modifications to the ``Dockerfile`` (e.g. changing
 the base Docker image version number), you can use the following steps to update
 and re-build the image:
 
@@ -73,8 +83,8 @@ and re-build the image:
 #. When you're satisfied with the edits, scroll down and write a meaningful **commit message** (you'll thank yourself later).
 #. Click the green **Commit changes** button.
 
-You may find the [official docker documentation](https://docs.docker.com/engine/reference/builder/) useful
-during this process.
+You may find the `official docker documentation <https://docs.docker.com/engine/reference/builder/>`_
+useful during this process.
 
 Now you have committed the changes to your ``Dockerfile``. Since you have made a commit,
 the CI/CD pipeline will kick off (pre-configured for you as a ``renkulab-runner``
@@ -164,6 +174,58 @@ these base ``Dockerfile`` s and add the ``renku``, ``git``, and ``jupyter``
 parts to another base image that you might have.
 
 
+Renku project configurations
+----------------------------
+
+When starting a new Interactive Environment, most of the options can be manually
+changed by the user. Depending on the specific RenkuLab deployment, you can select
+more RAM, a higher CPU quota, etc.
+
+Your project may even include a package with an advanced UI (like
+`Streamlit <https://renku.discourse.group/t/how-to-deploy-streamlit-in-renku/169>`_)
+and you will probably want to make it the default.
+
+It's possible to set a default value for all these options using the project
+configurations stored in the ``.renku/renku.ini`` file.
+Once you do that, each time a user tries to start a new environment, those options will
+be pre-selected.
+
+.. note::
+
+  Manually modifying the ``renku.ini`` file is not recommended.
+  You can use the
+  `renku config command <https://renku-python.readthedocs.io/en/latest/commands.html#module-renku.cli.config>`_
+  from an interactive environment::
+
+    renku config set interactive.default_url "/tree"
+
+  We are working on adding a user-friendly solution to set default options on
+  the project's settings page.
+
+**What are the specific options?**
+
+You can find a comprehensive list of options :ref:`on this page <renku_ini>`. Most commonly,
+you may want to change the ``default_url`` or set a specific ``image``.
+
+The first case is useful when you prefer to show a different default UI, like the standard
+Jupyter interface ``/tree``, or when you need support for a different interface,
+like RStudio ``/rstudio`` or ``/streamlit`` (not included in the standard Python template).
+
+The ``image`` option is useful once you have settled on a Docker image and no longer need to
+change it. The benefit is particularly evident when building a new image takes a lot of time
+(e.g. you added big packages) or when you expect the project to be used by a lot of people
+over a short period of time (e.g. you use it in a presentation or a lecture).
+
+Even if it's common to start the environment with the default values, setting a default value
+doesn't prevent a user from changing it.
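+
+As an illustration, and assuming the option keys follow the names used above
+(``default_url`` and ``image`` in the ``interactive`` section), the defaults for
+an R project might be set from a terminal in the environment like this. The
+image name is hypothetical and only stands in for your own pinned image::
+
+  renku config set interactive.default_url "/rstudio"
+  renku config set interactive.image "registry.example.com/my-group/my-project:1a2b3c4"
+
+The ``.renku/renku.ini`` file would then be expected to contain a section
+similar to the following sketch (the exact layout written by ``renku config``
+may differ)::
+
+  [renku "interactive"]
+  default_url = /rstudio
+  image = registry.example.com/my-group/my-project:1a2b3c4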
+
+.. note::
+
+  Note that not all RenkuLab deployments offer the same set of options or allow the same
+  values. If no GPUs are available, setting the default number to ``1`` will not work.
+  In that case, a warning is shown before a new environment starts.
+
+
 Getting Help
 ------------