Microservice that generates Jupyter reports by executing notebooks and exporting them to static HTML pages, combining papermill and nbconvert.
It can be run as a standalone application or as a JupyterHub service.
The workflow is described in the figure below. You can either:
- Set a specific URL to select a notebook and pass query arguments as parameters,
- Or interactively select one notebook and set its parameters.
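As a rough illustration of the first option, the sketch below builds such a URL in Python. The base URL and notebook path are the ones used by the test environment of this project; build_report_url is a hypothetical helper and not part of the service.

```python
from urllib.parse import quote, urlencode


def build_report_url(base_url: str, notebook_path: str, parameters: dict) -> str:
    """Return a URL that selects a notebook and passes its parameters."""
    # Make sure exactly one slash separates the base URL and the notebook path.
    url = base_url.rstrip("/") + "/" + quote(notebook_path)
    if parameters:
        # Query arguments are passed to the notebook as parameters.
        url += "?" + urlencode(parameters)
    return url


print(build_report_url(
    "http://localhost:8000/services/report/",
    "subfolder/simple_execute.ipynb",
    {"msg": "hello"},
))
# http://localhost:8000/services/report/subfolder/simple_execute.ipynb?msg=hello
```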
The API is described there.
There are two features provided by this service:
- Listing all available report templates (and their parameters).
  The available templates are all notebook files existing within template_root_dir.
- Generating a report (i.e. executing a parametrized notebook and converting it to HTML).
  Parametrized notebooks are supported only for Python notebooks.
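The two features can be approximated with a few lines of Python. This is a minimal sketch and not the code of the service: list_templates and generate_report are hypothetical names, skipping .ipynb_checkpoints folders is an assumption, and the real service additionally impersonates the user and handles broken notebooks.

```python
from pathlib import Path


def list_templates(template_root_dir: str) -> list[str]:
    """Return the notebook paths found under the template root, sorted."""
    root = Path(template_root_dir)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*.ipynb")
        # Assumption: Jupyter autosave copies are not templates.
        if ".ipynb_checkpoints" not in path.parts
    )


def generate_report(template: str, output: str, parameters: dict) -> str:
    """Execute a parametrized notebook and return it rendered as HTML."""
    # Imported lazily so list_templates works without these packages.
    import papermill
    from nbconvert import HTMLExporter

    # papermill injects the parameters and runs every cell.
    papermill.execute_notebook(template, output, parameters=parameters)
    # nbconvert turns the executed notebook into a static HTML page.
    body, _resources = HTMLExporter().from_filename(output)
    return body
```

For example, list_templates("/opt/papermill_report") would list the templates of a default installation.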
The configurable settings for the service are:
- broken_reports_dir: Folder in which broken notebooks will be copied; it must be a subfolder of notebook_dir; default /home/USERNAME/broken_reports
- config_file: Configuration file name; default papermill_service_config
- git_auth: Git authentication (username:password); default None
- notebook_dir: Notebook server root directory, needed to build the link to a broken notebook; default /home/USERNAME
- port: Port of the service; default 8888
- template_root_dir: Folder containing the notebook templates on the server; default /opt/papermill_report
- template_dir: Folder of the Git repository containing the notebook templates; default "."
- template_git_url: Git repository URL source of the notebook templates; default None
- template_paths: Paths to search for service webpage jinja templates, before using the default templates; default None
The string USERNAME will be replaced with the user’s username if used in broken_reports_dir or notebook_dir.
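The substitution, together with the rule that broken_reports_dir must be a subfolder of notebook_dir, can be pictured as follows. This is a sketch only; resolve_dirs is a hypothetical helper and the service may implement the check differently.

```python
from pathlib import PurePosixPath


def resolve_dirs(notebook_dir: str, broken_reports_dir: str, username: str):
    """Expand USERNAME in both settings and check that they are nested."""
    notebook = PurePosixPath(notebook_dir.replace("USERNAME", username))
    broken = PurePosixPath(broken_reports_dir.replace("USERNAME", username))
    # The broken reports folder must live somewhere below the notebook root.
    if notebook not in broken.parents:
        raise ValueError(f"{broken} is not a subfolder of {notebook}")
    return notebook, broken


notebook, broken = resolve_dirs("/home/USERNAME", "/home/USERNAME/broken_reports", "marc")
print(notebook, broken)
# /home/marc /home/marc/broken_reports
```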
- On Unix platforms, the service must be run as root, because report processes are executed through the su <user> --login command to impersonate the authenticated user and set the environment variables afresh.
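To make the impersonation step concrete, the sketch below builds such a command line in Python. Only su <user> --login comes from this document; the -c option used to pass the command, the helper name and the example papermill invocation are assumptions.

```python
import shlex


def impersonation_command(username: str, command: list[str]) -> list[str]:
    """Wrap a command so that it runs in a fresh login shell of the user."""
    # shlex.join quotes each argument so the login shell sees it unchanged.
    return ["su", username, "--login", "-c", shlex.join(command)]


print(impersonation_command("marc", ["papermill", "in.ipynb", "out.ipynb"]))
# ['su', 'marc', '--login', '-c', 'papermill in.ipynb out.ipynb']
```

The resulting list could be handed to subprocess.run by a process running as root.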
This Python package is meant to be deployed as a JupyterHub Hub-Managed service.
The consequences are:
- The service runs its own tornado server. Requests will be forwarded to it by the JupyterHub internal proxy from the standard URL https://myhub.horse/services/my-service/ (pay attention to the required trailing /).
- Authentication is deferred to JupyterHub.
- As it is managed by JupyterHub, JupyterHub will check that the service is alive and restart it if it is not. Moreover, when JupyterHub is stopped gracefully, it will stop the service.
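A Hub-Managed service is declared in jupyterhub_config.py. The excerpt below is a hypothetical sketch: the service name report matches the URLs of the test environment and the port matches the default setting, but the launch command is an assumption, so check the jupyterhub_config.py shipped with this project for the actual values.

```python
# Hypothetical declaration of the service for JupyterHub.
report_service = {
    "name": "report",  # served under /services/report/
    "url": "http://127.0.0.1:8888",  # matches the default port setting
    # Assumed launch command; the real entry point may differ.
    "command": ["python", "-m", "papermill_report"],
}

# In jupyterhub_config.py, `c` is the config object provided by JupyterHub:
# c.JupyterHub.services = [report_service]
```

Because a command is given, JupyterHub starts, monitors and stops the process itself.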
There are two levels of tests in this service: conventional unit tests with pytest, and a Dockerfile to spin up an integrated environment starting the service as a JupyterHub service.
python -m pip install -r requirements.txt -r requirements_dev.txt
pytest papermill_report
To build and launch the integrated environment:
docker build -t papermill-report .
docker run -p 8000:8000 --rm papermill-report
The Hub is parameterized (see jupyterhub_config.py) with two users:
- jovyan: an administrator
- marc: a user
The accounts have no password.
The template folder is the examples folder of this project.
You can also test the service manually by visiting valid endpoints:
http://localhost:8000/services/report/
http://localhost:8000/services/report/broken_parameters.ipynb
http://localhost:8000/services/report/no_parameters.ipynb
http://localhost:8000/services/report/subfolder/simple_execute.ipynb?msg=hello
Integration tests can be executed automatically using that environment with the following command:
docker-compose -f e2e-tests/docker-compose.yml run e2e ./e2e-tests/run_e2e.sh
docker-compose -f e2e-tests/docker-compose.yml down