A simple tool to clean, test and fix notebooks for your repo
You can install it from PyPI:
pip install nb_helpers
or install the latest version from a local clone of this repo:
pip install -e .
This little library gives you command-line tools to clean, test and check your Jupyter notebooks.
- Clean: When you call `nb_helpers.clean_nbs`, it strips the metadata from your notebooks, which helps prevent git conflicts. You can also pass the `--clear_outs` flag to remove cell outputs as well.
$ nb_helpers.clean_nbs --help
usage: nb_helpers.clean_nbs [-h] [--path PATH] [--clear_outs] [--verbose]
Clean notebooks on `path` from useless metadata
options:
-h, --help show this help message and exit
--path PATH The path to notebooks (default: .)
--clear_outs Remove cell outputs (default: False)
--verbose Run on verbose mode (default: False)
You can run this command on this repo:
$ nb_helpers.clean_nbs
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Notebook Path ┃ Status ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ tests/data/dummy_folder/fail_nb.ipynb │ Ok✔ │
│ tests/data/dummy_folder/test_nb2.ipynb │ Ok✔ │
│ tests/data/dummy_folder/test_nb_all_slow.ipynb │ Ok✔ │
│ tests/data/dummy_folder/test_nb_some_slow.ipynb │ Ok✔ │
│ tests/data/features_nb.ipynb │ Ok✔ │
│ tests/data/test_nb.ipynb │ Ok✔ │
└─────────────────────────────────────────────────┴────────┘
- Run: One can run the notebooks in `path` and get info about the execution.
$ nb_helpers.run_nbs --help
usage: nb_helpers.run_nbs [-h] [--verbose] [--lib_name LIB_NAME] [--no_run] [--pip_install] [--github_issue] [--repo REPO] [--owner OWNER] [path]
positional arguments:
path A path to nb files (default: /Users/tcapelle/wandb/nb_helpers)
options:
-h, --help show this help message and exit
--verbose Print errors along the way (default: False)
--lib_name LIB_NAME Python lib names to filter, eg: tensorflow
--no_run Do not run any notebook (default: False)
--pip_install Run cells with !pip install (default: False)
--github_issue Create a github issue if notebook fails (default: False)
--repo REPO Github repo to create issue in (default: nb_helpers)
--owner OWNER Github owner to create issue in (default: wandb)
You can now post GitHub issues when a run fails. The cool thing is that the issue can be posted to a different repo than the one containing the notebooks: just pass the `--repo` name and the `--owner` (for example `wandb/other_cool_repo`).
You get the following output inside this repo:
$ nb_helpers.run_nbs
CONSOLE.is_terminal(): True
Writing output to run.csv
Notebook Path | Status | Run Time | colab |
---|---|---|---|
dev_nbs/search.ipynb | Fail | 1 s | open |
tests/data/dummy_folder/fail_nb.ipynb | Fail | 1 s | open |
tests/data/dummy_folder/test_nb2.ipynb | Ok | 0 s | open |
tests/data/dummy_folder/test_nb_all_slow.ipynb | Skipped | 0 s | open |
tests/data/dummy_folder/test_nb_some_slow.ipynb | Ok | 0 s | open |
tests/data/features_nb.ipynb | Ok | 0 s | open |
tests/data/test_nb.ipynb | Ok | 0 s | open |
- Summary: You can get a summary of the notebooks in your project with the `nb_helpers.summary_nbs` command.
$ nb_helpers.summary_nbs
CONSOLE.is_terminal(): True
Writing output to /Users/tcapelle/wandb/nb_helpers/logs/summary.csv
Reading 6 notebooks
┌───┬─────────────────────────────────────────────────┬────────────┬────────────────┬────────────────────────────────────────────────┬────────────┬───────┐
│ # │ nb name │ tracker │ wandb features │ python libs │ colab_cell │ colab │
├───┼─────────────────────────────────────────────────┼────────────┼────────────────┼────────────────────────────────────────────────┼────────────┼───────┤
│ 1 │ tests/data/dummy_folder/fail_nb.ipynb │ │ │ │ │ open │
│ 2 │ tests/data/dummy_folder/test_nb2.ipynb │ │ │ │ │ open │
│ 3 │ tests/data/dummy_folder/test_nb_all_slow.ipynb │ │ │ time │ │ open │
│ 4 │ tests/data/dummy_folder/test_nb_some_slow.ipynb │ │ │ time │ │ open │
│ 5 │ tests/data/features_nb.ipynb │ │ │ typing, itertools │ │ open │
│ 6 │ tests/data/test_nb.ipynb │ 0: tracker │ │ os, sys, logging, pathlib, fastcore, itertools │ 1 │ open │
└───┴─────────────────────────────────────────────────┴────────────┴────────────────┴────────────────────────────────────────────────┴────────────┴───────┘
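To make the Clean step above more concrete, here is a minimal sketch of what stripping a notebook can look like, using only the standard library. It is not the `nb_helpers` implementation: `strip_notebook` is a hypothetical name, and the choices to empty every `metadata` field and reset `execution_count` are assumptions made for illustration (the real tool may keep some keys).

```python
import json


def strip_notebook(nb: dict, clear_outs: bool = False) -> dict:
    """Return a cleaned copy of a notebook dict (hypothetical helper)."""
    # Notebooks are plain JSON, so a JSON round trip gives a cheap deep copy.
    cleaned = json.loads(json.dumps(nb))
    # Assumption: drop all notebook-level metadata.
    cleaned["metadata"] = {}
    for cell in cleaned.get("cells", []):
        # Assumption: drop all cell-level metadata as well.
        cell["metadata"] = {}
        if cell.get("cell_type") == "code":
            # Execution counters change on every run and cause noisy diffs.
            cell["execution_count"] = None
            if clear_outs:
                cell["outputs"] = []
    return cleaned


example = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {"scrolled": True},
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["hi\n"]}],
            "source": ["print('hi')"],
        }
    ],
    "metadata": {"kernelspec": {"name": "python3"}},
    "nbformat": 4,
    "nbformat_minor": 5,
}

cleaned = strip_notebook(example, clear_outs=True)
print(json.dumps(cleaned, indent=1))
```

Because the function works on a copy, the original dict is left untouched, which makes it easy to compare the notebook before and after cleaning.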
All these functions can also be used from Python:
from pathlib import Path
from nb_helpers.run import run_nbs
examples_path = Path("examples/colabs")
errors = run_nbs(path=examples_path, verbose=True, timeout=600)
The library also has many little helper functions to make your life easier inside the repo you are orchestrating:
from pathlib import Path
from nb_helpers.utils import *
from nb_helpers.colab import *
examples_path = Path("examples/colabs")
# get all nbs in the folder recursively, filtering out hidden and non-notebook files
nb_files = find_nbs(examples_path)
one_nb_path = nb_files[0]
notebook = read_nb(one_nb_path)
# get all libs imported
libs = detect_imported_libs(notebook)
# get remote github repo
github_repo = git_origin_repo(one_nb_path)
# detect whether the default branch is called main or master
master_name = git_main_name(one_nb_path)
# get colab link for that branch
colab_url = get_colab_url(one_nb_path, master_name)
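
For readers curious about what helpers like these might do under the hood, here is a rough standalone sketch that uses only the standard library. The names `imported_libs` and `build_colab_url` are hypothetical and are not part of `nb_helpers`; the library's own `detect_imported_libs` and `get_colab_url` may behave differently. The URL pattern is the usual "open a GitHub notebook in Colab" form.

```python
import ast
from urllib.parse import quote


def imported_libs(nb: dict) -> list:
    """Collect top-level module names imported in a notebook's code cells."""
    libs = set()
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # nbformat stores source as a list of lines
            source = "".join(source)
        # Drop shell escapes and magics, which are not valid Python syntax.
        code = "\n".join(
            line for line in source.splitlines()
            if not line.lstrip().startswith(("!", "%"))
        )
        try:
            tree = ast.parse(code)
        except SyntaxError:
            continue  # skip cells that still cannot be parsed
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                libs.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                libs.add(node.module.split(".")[0])
    return sorted(libs)


def build_colab_url(owner: str, repo: str, branch: str, nb_path: str) -> str:
    """Build a Colab link for a notebook hosted on GitHub."""
    return (
        "https://colab.research.google.com/github/"
        f"{owner}/{repo}/blob/{branch}/{quote(nb_path)}"
    )


demo_nb = {
    "cells": [
        {"cell_type": "code", "source": ["!pip install wandb\n", "import os, sys\n",
                                         "from pathlib import Path\n",
                                         "import numpy.linalg as la\n"]},
        {"cell_type": "markdown", "source": ["import fake"]},
    ]
}
print(imported_libs(demo_nb))
print(build_colab_url("wandb", "nb_helpers", "main", "tests/data/test_nb.ipynb"))
```

Parsing with `ast` rather than regular expressions avoids false positives from the word "import" appearing in strings, comments or markdown cells.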