Separating python and r kernels into their own conda envs (EBI-Metagenomics#23)

* [wip] separating python and r kernels into their own conda envs

* keep R conda env activated for all R dependency steps

* updates jl extensions to target ES2018

* separates r and python kernels into different conda envs

* adds missing matplotlib dependency

* adds static rendering of R notebooks

* updates tests to use separated conda env names

* allows test retries even if setup handler fails
SandyRogers authored Mar 9, 2023
1 parent 88d0972 commit 5809641
Showing 24 changed files with 290 additions and 60 deletions.
8 changes: 7 additions & 1 deletion .gitignore
@@ -9,4 +9,10 @@ tests/*.png
_site
_freeze
/.quarto/
*.pyc
*.pyc
.task
site_libs
src/docs/*.html
src/*.html
src/notebooks/**/*.html
src/*-listing.json
24 changes: 14 additions & 10 deletions README.md
@@ -28,12 +28,17 @@ Between the [base image](https://jupyter-docker-stacks.readthedocs.io/en/latest/
and the extra requirements (`dependencies/*`), the Docker contains all the libraries we need.

### To add a new notebook
#### Python
```bash
TITLE='My New Notebook' AUTHOR='My Name' task add-py-notebook
```
This copies and fills in a template notebook stub file with a standard header, into `src/notebooks/Python Examples`.

(TODO: `R` version).
#### R
```bash
TITLE='My New Notebook' AUTHOR='My Name' task add-r-notebook
```
This copies and fills in a template notebook stub file with a standard header, into `src/notebooks/R Examples`.

### To open the notebooks server in edit mode
```bash
@@ -50,7 +55,7 @@ It should be localhost port 8888, with a random token.

When you're finished editing, use normal `git add` and `git commit` to contribute your changes.

For info, ("jovyan" is always the user for these Jupyter Docker images.)
For info, "jovyan" is always the user for these Jupyter Docker images. Jovyan as in jovian (a being from the planet Jupiter), but from Jupyter!

#### Guidance for authoring notebooks
- Notebooks should be complete examples, that can be run with zero code changes needed
@@ -66,12 +71,7 @@ Add commands to this to include other datasets in the cache.
The cache is zipped and checked into the repo for faster population during builds (`dependencies/mgnify-cache.tgz`), since it rarely changes.
To check in an updated version of the cache...
```bash
docker run -it -v $PWD/dependencies:/opt/dependencies mgnify-nb-dev /bin/bash
cd /opt/dependencies
rm mgnify-cache.tgz
Rscript populate-mgnifyr-cache.R
tar -czf mgnify-cache.tgz /home/jovyan/.mgnify_cache
exit
task update-mgnifyr-cache
git add dependencies/mgnify-cache.tgz
```

@@ -99,7 +99,11 @@ task render-static
This builds a docker image tagged as `notebooks-static`, runs Quarto inside it, executes all cells of the notebooks,
and renders the completed notebooks to the `_site` folder (which is mounted from your host machine into Docker).

You can then open the generated HTML, or use
You can [browse](http://localhost:4444) the generated HTML with
```bash
task serve-static
```
Or use
```bash
task preview-static
```
@@ -177,7 +181,7 @@ This is in the `mgnify_jupyter_lab_ui` folder.

## Testing
A small integration test suite is written using Jest-Puppeteer.
You need to have built or pulled the docker/Dockerfile (tagegd as `quay.io/microbiome-informatics/emg-notebooks.dev`), and have Shiny Proxy downloaded first.
You need to have built or pulled the docker/Dockerfile (tagged as `quay.io/microbiome-informatics/emg-notebooks.dev`), and have Shiny Proxy downloaded first.
The test suite runs Shiny Proxy, and makes sure Jupyter Lab opens, the deep-linking works, and variable insertion works in R and Python.

```bash
51 changes: 47 additions & 4 deletions Taskfile.yml
@@ -19,6 +19,24 @@ tasks:
- sh: test -n "{{.AUTHOR}}"
msg: "AUTHOR parameter missing. Set: TITLE='Notebook Name' AUTHOR='Your Name' task add-py-notebook"

add-r-notebook:
summary: |
Creates a new R-language notebook stub from a template.
Set the TITLE and AUTHOR variables, which are used to name
the notebook file and set them in the notebook front matter.
Usage: TITLE='Notebook Name' AUTHOR='Your Name' task add-r-notebook
cmds:
- cp src/templates/r.ipynb "src/notebooks/R Examples/{{.TITLE}}.ipynb"
- sed -i '' 's/::NBTITLE::/{{.TITLE}}/g' "src/notebooks/R Examples/{{.TITLE}}.ipynb"
- sed -i '' 's/::NBAUTHOR::/{{.AUTHOR}}/g' "src/notebooks/R Examples/{{.TITLE}}.ipynb"
preconditions:
- sh: test -n "{{.TITLE}}"
msg: "TITLE parameter missing. Set: TITLE='Notebook Name' AUTHOR='Your Name' task add-r-notebook"
- sh: test -n "{{.AUTHOR}}"
msg: "AUTHOR parameter missing. Set: TITLE='Notebook Name' AUTHOR='Your Name' task add-r-notebook"

edit-notebooks:
summary: |
Opens Jupyter Lab (via Docker) in edit mode – with the notebooks source bound to this repository
@@ -34,8 +52,9 @@ tasks:
The built image is tagged as `notebooks-static`.
cmds:
- docker build -f docker/docs.Dockerfile -t notebooks-static .
status:
- docker image inspect notebooks-static
sources:
- docker/docs.Dockerfile
- docker/Dockerfile

render-static:
summary: |
@@ -45,11 +64,35 @@ tasks:
cmds:
- docker run -it -v $PWD:/opt/repo -w /opt/repo notebooks-static render --execute
deps: [build-static-docker]
sources:
- src/**/*

serve-static:
summary: |
Serves the rendered notebooks and documentation as a static website
This serves the contents of ./_site
cmds:
- echo "Browse to http://127.0.0.1:4444"
- docker run -it -v $PWD:/opt/repo -w /opt/repo/_site -p 4444:4444 --entrypoint python notebooks-static -m http.server 4444
deps: [render-static]

preview-static:
summary: |
Runs, renders, and serveces the notebooks as a static website, watching for changes
Runs, renders, and serves the notebooks as a static website, watching for changes
cmds:
- echo 'When the rendering is finished, the static preview of notebooks will be at http://127.0.0.1:4444 ...'
- docker run -it -v $PWD:/opt/repo -w /opt/repo -p 4444:4444 notebooks-static preview --no-browser --port 4444 --host 0.0.0.0
deps: [build-static-docker]
deps: [build-static-docker]

update-mgnifyr-cache:
summary: |
Copies the MgnifyR Cache directory from the docker image into the host repository.
Run this if you change the populate-mgnifyr-cache.R file.
This cache zip is just used to speed up subsequent docker builds by not calling the MGnify API so much.
It writes a zip of the cache to dependencies/mgnify-cache.tgz
cmds:
- docker run -it -v $PWD/dependencies:/opt/dependencies -w /opt/dependencies quay.io/microbiome-informatics/emg-notebooks.dev:latest /bin/bash zip-mgnifyr-cache.sh
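
The `add-py-notebook` and `add-r-notebook` tasks above copy a template and substitute the `::NBTITLE::` and `::NBAUTHOR::` placeholders with `sed -i ''`, which is the BSD/macOS form of in-place editing (GNU `sed` expects `-i` without the separate empty argument). As a rough, portable illustration of the same copy-and-fill step, here is a minimal Python sketch. The function name `add_notebook` and its parameters are hypothetical and not part of this commit; only the placeholder strings and the `<TITLE>.ipynb` naming come from the Taskfile.

```python
from pathlib import Path


def add_notebook(template: Path, dest_dir: Path, title: str, author: str) -> Path:
    """Copy a template notebook into dest_dir, filling in the header placeholders.

    Hypothetical helper: mirrors the cp + sed steps of the Taskfile tasks.
    """
    # Same checks as the Taskfile preconditions: both values must be non-empty.
    if not title or not author:
        raise ValueError("TITLE and AUTHOR must both be set")

    dest = dest_dir / f"{title}.ipynb"
    text = template.read_text(encoding="utf-8")
    # Plain string replacement, so characters such as '/' or '&' in the title
    # are not interpreted the way they would be in a sed expression.
    text = text.replace("::NBTITLE::", title).replace("::NBAUTHOR::", author)
    dest.write_text(text, encoding="utf-8")
    return dest
```

A call such as `add_notebook(Path("src/templates/r.ipynb"), Path("src/notebooks/R Examples"), "My New Notebook", "My Name")` would correspond to the `task add-r-notebook` usage shown in the README. The sketch does not JSON-escape the values, so a title containing double quotes would still need extra handling.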
2 changes: 1 addition & 1 deletion _quarto.yml
@@ -5,7 +5,7 @@ project:
- src/docs/*.md
- src/notebooks_list.qmd
- src/notebooks/Python Examples/*.ipynb
# - src/notebooks/R Examples/*.ipynb
- src/notebooks/R Examples/*.ipynb

execute:
freeze: auto
Binary file modified dependencies/mgnify-cache.tgz
Binary file not shown.
7 changes: 6 additions & 1 deletion dependencies/populate-mgnifyr-cache.R
@@ -7,6 +7,11 @@ mg <- mgnify_client(usecache = T, cache_dir = '/home/jovyan/.mgnify_cache')
tara_all = mgnify_analyses_from_studies(mg, 'MGYS00002008')
metadata = mgnify_get_analyses_metadata(mg, tara_all)

# To generate phyloseq object
## To generate phyloseq object
clean_acc=c('MGYA00590456','MGYA00590543','MGYA00593110','MGYA00590477','MGYA00593125','MGYA00590448','MGYA00590508','MGYA00589025','MGYA00593139','MGYA00593220','MGYA00590525','MGYA00590534','MGYA00593112','MGYA00590498','MGYA00590535','MGYA00593223','MGYA00590480','MGYA00590496','MGYA00590523','MGYA00590444','MGYA00590517','MGYA00590575','MGYA00589039','MGYA00590574','MGYA00590474','MGYA00590554','MGYA00590469','MGYA00590471','MGYA00590522','MGYA00593141','MGYA00589049','MGYA00593123','MGYA00590564','MGYA00589024','MGYA00590572','MGYA00590545','MGYA00590518','MGYA00593126','MGYA00590526','MGYA00590500','MGYA00590570','MGYA00590520','MGYA00590443','MGYA00589013','MGYA00590449','MGYA00589021','MGYA00593130','MGYA00589047','MGYA00589042','MGYA00590577','MGYA00590470','MGYA00590473','MGYA00593216','MGYA00590562','MGYA00590464','MGYA00590484','MGYA00590462','MGYA00590565','MGYA00590439','MGYA00590472','MGYA00590566','MGYA00590552','MGYA00590485','MGYA00593133','MGYA00590544','MGYA00590455','MGYA00590437','MGYA00589044')
ps = mgnify_get_analyses_phyloseq(mg, clean_acc)

# For the "Fetch Analaysis Metadata for a Study" notebook
analyses_accessions <- mgnify_analyses_from_studies(mg, 'MGYS00005292')
analyses_metadata_df <- mgnify_get_analyses_metadata(mg, head(analyses_accessions, 10))
analyses_ps <- mgnify_get_analyses_phyloseq(mg, analyses_metadata_df$analysis_accession, tax_SU = "SSU")
13 changes: 13 additions & 0 deletions dependencies/py-environment.yml
@@ -0,0 +1,13 @@
# https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html#jupyter-datascience-notebook
name: mgnify-py-env
channels:
- conda-forge
dependencies:
- python==3.11
- pip
- pip:
- jsonapi-client==0.9.9
- pandas==1.5.3
- numpy==1.24.2
- ipykernel==6.21.3
- matplotlib==3.7.1
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
# https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html#jupyter-datascience-notebook
name: mgnify-r-env
channels:
- conda-forge
dependencies:
@@ -8,6 +9,7 @@ dependencies:
- bioconda::bioconductor-siamcat=2.2.0
- conda-forge::r-reshape2=1.4.4
- conda-forge::r-vegan=2.6_4
- pip
- pip:
- jsonapi-client==0.9.9
- conda-forge::r-devtools=2.4.5
- conda-forge::r-irkernel=1.3.2
- conda-forge::r-tidyverse=2.0.0
- conda-forge::r-ggplot2=3.4.1
3 changes: 3 additions & 0 deletions dependencies/zip-mgnifyr-cache.sh
@@ -0,0 +1,3 @@
#!/bin/bash
rm mgnify-cache.tgz
tar -czf mgnify-cache.tgz --absolute-names /home/jovyan/.mgnify_cache
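
For readers who want to see what this archive step amounts to, the following is a minimal Python sketch using the standard-library `tarfile` module. It is an illustration only, not the script used in the image: the function name `zip_cache` is hypothetical, and unlike `tar --absolute-names` (which keeps the leading `/` on member names) it stores members relative to the filesystem root. Extracting with `-C /`, as the Dockerfile does, puts the files back in the same place either way.

```python
import tarfile
from pathlib import Path


def zip_cache(cache_dir: Path, archive: Path) -> str:
    """Write cache_dir to a gzipped tarball and return the member prefix used.

    Hypothetical helper: a rough equivalent of the rm + tar steps above.
    """
    cache_dir = cache_dir.resolve()
    # Store members relative to the root, e.g. "home/jovyan/.mgnify_cache",
    # so that extracting at "/" restores the original location.
    arcname = cache_dir.relative_to(cache_dir.anchor).as_posix()
    # Remove any stale archive first (tolerating a missing file).
    archive.unlink(missing_ok=True)
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(cache_dir, arcname=arcname)
    return arcname
```

The archive path should live outside `cache_dir`, otherwise the tarball would try to include itself.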
20 changes: 14 additions & 6 deletions docker/Dockerfile
@@ -1,20 +1,25 @@
FROM jupyter/datascience-notebook:r-4.2.2
FROM jupyter/datascience-notebook:r-4.2.2@sha256:c8c75096f0efe1d7cad07ec97ecdf5c6328a119bf6295269c45065c90b0dde2a
USER root
ENV CHOWN_HOME_OPTS='-R'
ENV CHOWN_HOME='yes'

# Install Python/R dependencies
COPY dependencies/environment.yml /tmp/environment.yml
COPY dependencies/r-environment.yml /tmp/r-environment.yml
COPY dependencies/py-environment.yml /tmp/py-environment.yml
COPY dependencies/dependencies.R /tmp/dependencies.R
RUN mamba env update -n base --file /tmp/environment.yml
RUN Rscript /tmp/dependencies.R
RUN mamba install -y nb_conda_kernels
RUN mamba env create -f /tmp/r-environment.yml
RUN mamba env create -f /tmp/py-environment.yml

SHELL ["conda", "run", "-n", "mgnify-r-env", "/bin/bash", "-c"]
RUN Rscript /tmp/dependencies.R
# Place / check / populate MGnifyR cache for example used in studies
# Zipped cache should be up to date in repo, but run populate to be sure nothing is missing
COPY dependencies/mgnify-cache.tgz /tmp/mgnify-cache.tgz
RUN tar -xzf /tmp/mgnify-cache.tgz -C /
COPY dependencies/populate-mgnifyr-cache.R /tmp/populate-mgnifyr-cache.R
RUN Rscript /tmp/populate-mgnifyr-cache.R
SHELL ["/bin/bash", "-c"]

# Install JupyterLab extension to handle query parameter > env vars in ShinyProxy
COPY shiny_proxy_jlab_query_parms /tmp/shiny_proxy_jlab_query_parms
@@ -30,5 +35,8 @@ RUN jlpm cache clean
# Clean tmp
RUN rm -rf /tmp/*

COPY shiny-proxy/custom.js /home/jovyan/.jupyter/custom/custom.js
COPY src/notebooks /home/jovyan/mgnify-examples
RUN jupyter kernelspec remove -y julia-1.8
COPY jupyter_config/custom.js /home/jovyan/.jupyter/custom/custom.js
COPY jupyter_config/jupyter_config.json /home/jovyan/.jupyter/jupyter_config.json
COPY src/notebooks /home/jovyan/mgnify-examples
RUN jupyter labextension disable "@jupyterlab/apputils-extension:announcements"
5 changes: 4 additions & 1 deletion docker/docs.Dockerfile
@@ -1,7 +1,10 @@
FROM quay.io/microbiome-informatics/emg-notebooks.dev

ARG QUARTO_VERSION="1.3.142"
ARG QUARTO_VERSION="1.3.250"
WORKDIR /tmp
RUN wget https://github.com/quarto-dev/quarto-cli/releases/download/v${QUARTO_VERSION}/quarto-${QUARTO_VERSION}-linux-amd64.deb
RUN dpkg -i quarto-${QUARTO_VERSION}-linux-amd64.deb
# install conda kernels to standard jupyter kernels
RUN python -m nb_conda_kernels list

ENTRYPOINT ["quarto"]
File renamed without changes.
8 changes: 8 additions & 0 deletions jupyter_config/jupyter_config.json
@@ -0,0 +1,8 @@
{
"CondaKernelSpecManager":
{
"conda_only": true,
"name_format": "{0} ({1})",
"kernelspec_path": "--user"
}
}
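
The `name_format` value in this config determines how conda-provided kernels are labelled in the Jupyter launcher. Assuming, as the `nb_conda_kernels` documentation describes, that `{0}` is filled with the kernel language and `{1}` with the conda environment name, the labels can be previewed with ordinary Python string formatting. The helper `display_name` below is hypothetical and only imitates that substitution; it does not call `nb_conda_kernels` itself.

```python
# Format string copied from jupyter_config.json in this commit.
NAME_FORMAT = "{0} ({1})"


def display_name(language: str, environment: str, name_format: str = NAME_FORMAT) -> str:
    """Return the launcher label for a kernel, given its language and conda env.

    Hypothetical helper: assumes {0} = language and {1} = environment name.
    """
    return name_format.format(language, environment)


print(display_name("Python", "mgnify-py-env"))  # Python (mgnify-py-env)
print(display_name("R", "mgnify-r-env"))        # R (mgnify-r-env)
```

The first label matches the `display_name` that the Python example notebooks carry in their `kernelspec` metadata after this commit.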
2 changes: 1 addition & 1 deletion mgnify_jupyter_lab_ui/tsconfig.json
@@ -17,7 +17,7 @@
"rootDir": "src",
"strict": true,
"strictNullChecks": true,
"target": "es2017",
"target": "ES2018",
"types": ["jest"]
},
"include": ["src/*"]
2 changes: 1 addition & 1 deletion shiny_proxy_jlab_query_parms/tsconfig.json
@@ -17,7 +17,7 @@
"rootDir": "src",
"strict": true,
"strictNullChecks": true,
"target": "es2017",
"target": "ES2018",
"types": []
},
"include": ["src/*"]
Original file line number Diff line number Diff line change
@@ -63,7 +63,9 @@
"cell_type": "code",
"execution_count": null,
"id": "a3dbe76a-7843-4b77-ae16-710af5ae7f56",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from lib.variable_utils import get_variable_from_link_or_input\n",
@@ -88,7 +90,9 @@
"cell_type": "code",
"execution_count": null,
"id": "4b574761-0257-4659-bb14-cfd93191a5b0",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from jsonapi_client import Session\n",
@@ -112,9 +116,9 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "Python (mgnify-py-env)",
"language": "python",
"name": "python3"
"name": "conda-env-mgnify-py-env-py"
},
"language_info": {
"codemirror_mode": {
@@ -126,7 +130,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.11.0"
}
},
"nbformat": 4,
Original file line number Diff line number Diff line change
@@ -128,7 +128,9 @@
"cell_type": "code",
"execution_count": null,
"id": "2c424193-2f45-408d-805a-1fe140fbb0d2",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
@@ -147,9 +149,9 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "Python (mgnify-py-env)",
"language": "python",
"name": "python3"
"name": "conda-env-mgnify-py-env-py"
},
"language_info": {
"codemirror_mode": {
@@ -161,7 +163,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.11.0"
}
},
"nbformat": 4,