-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Client hooks #188
Client hooks #188
Conversation
85ed4b0
to
4f46196
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for this PR @devintang3.
Would you like to add these new features to the documentation also?
nbclient/util.py
Outdated
loop = asyncio.get_event_loop() | ||
hook_with_kwargs = partial(hook, **kwargs) | ||
future = loop.run_in_executor(None, hook_with_kwargs) | ||
asyncio.ensure_future(future) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand, does it mean that hooks are going to be launched in the background? BTW, it would be nice to test async hooks.
I think run_hook
should be async
:
async def run_hook(hook: Optional[Callable], **kwargs) -> None:
await ensure_async(hook(**kwargs))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the intent is for the hook to be launched in the background.
4f46196
to
8b54421
Compare
Thanks for the suggestions. I've applied your recommendations as well as added more tests.
|
Could you rebase on master? It looks like you merged master into your branch instead. |
207cac2
to
1de1dbd
Compare
nbclient/client.py
Outdated
@@ -805,11 +865,13 @@ async def async_execute_cell( | |||
self.allow_errors or "raises-exception" in cell.metadata.get("tags", []) | |||
) | |||
|
|||
await run_hook(self.on_cell_start, cell=cell, cell_index=cell_index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while we are at it, perhaps there could be a hook at the very start of this function, before even skipping non-executing cells, like on_cell_visit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, that would allow pre-processing Markdown cells.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to just move on_cell_start
to the start instead of creating a new hook?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would make sense, and we know if the cell is a Markdown cell with cell.cell_type
.
But then maybe add on_cell_execute
right before the execution? It won't be called if the cell is not a code cell or if the cell is skipped because of a tag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved on_cell_start
towards the top so it'll execute for non-code cells and added a new hook called on_cell_execute
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a really nice extension for nbclient
- I have a few comments and suggestions, mostly around the API design and the documentation to make it clear how to use it
docs/client.rst
Outdated
In addition to the two above, we also support traitlets for hooks. They are as | ||
follows: ``on_execution_start``, ``on_cell_start``, ``on_cell_complete``, | ||
``on_cell_error``. These traitlets allow specifying a ``Callable`` function, | ||
which will run at certain points during the notebook execution and is executed asynchronously. | ||
``on_execution_start`` will run when the notebook client is kicked off. | ||
``on_cell_start`` will run right before each cell is executed. | ||
``on_cell_complete`` will run right after the cell is executed. | ||
``on_cell_error`` will run if there is an error in the cell. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to the two above, we also support traitlets for hooks. They are as | |
follows: ``on_execution_start``, ``on_cell_start``, ``on_cell_complete``, | |
``on_cell_error``. These traitlets allow specifying a ``Callable`` function, | |
which will run at certain points during the notebook execution and is executed asynchronously. | |
``on_execution_start`` will run when the notebook client is kicked off. | |
``on_cell_start`` will run right before each cell is executed. | |
``on_cell_complete`` will run right after the cell is executed. | |
``on_cell_error`` will run if there is an error in the cell. | |
Hooks before and after cell execution | |
~~~~~~~~~~~~~~~~~~~~~~~~ | |
There are several configurable hooks that allow the user to execute code before and | |
after a cell is executed. Each one is configured with a function that will be called in its | |
respective place in the cell execution pipeline. These function calls are **asynchronous**. | |
Each is described below: | |
**Notebook-level hooks** | |
These hooks are called with a single extra parameter: | |
- ``notebook=NotebookNode``: the current notebook being executed. | |
Here is the available hook: | |
- ``on_execution_start`` will run when the notebook client is initialized, before any execution has happened. | |
**Cell-level hooks** | |
These hooks are called with two parameters: | |
- ``cell=NotebookNode``: a reference to the current cell. | |
- ``cell_index=int``: the index of the cell in the current notebook's list of cells | |
Here are the available hooks: | |
- ``on_cell_start`` will run right before each cell is executed. | |
- ``on_cell_complete`` will run after execution, if the cell is executed with no errors. | |
- ``on_cell_error`` will run if there is an error during cell execution. | |
This fleshes out the documentation a bit so that it is clearer which hooks operate on which parts of the document, and so that it's clearer what arguments are passed to the hooks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe worth mentioning that the hooks can be async, but don't have to be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @choldgraf! Much appreciated.
I removed the line about functions can be async. If it can be async or sync, then I don't see much point in mentioning that
docs/client.rst
Outdated
@@ -96,6 +96,15 @@ on both versions. Here the traitlet ``kernel_name`` helps simplify and | |||
maintain consistency: we can just run a notebook twice, specifying first | |||
"python2" and then "python3" as the kernel name. | |||
|
|||
In addition to the two above, we also support traitlets for hooks. They are as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also document what kinds of arguments are passed to these functions, or what variables are available to them when they are called (e.g., if I wrote a function for this, how would I access the notebook? the current cell? the traceback of an error message if it failed to execute?)
nbclient/client.py
Outdated
@@ -426,6 +483,7 @@ async def async_start_new_kernel_client(self) -> KernelClient: | |||
await self._async_cleanup_kernel() | |||
raise | |||
self.kc.allow_stdin = False | |||
await run_hook(self.on_execution_start) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
await run_hook(self.on_execution_start) | |
await run_hook(self.on_execution_start, notebook=self.nb) |
What about something like this so that people have access to the NotebookNode before execution?
nbclient/client.py
Outdated
@@ -426,6 +483,7 @@ async def async_start_new_kernel_client(self) -> KernelClient: | |||
await self._async_cleanup_kernel() | |||
raise | |||
self.kc.allow_stdin = False | |||
await run_hook(self.on_execution_start) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For symmetry's sake, why not also include two extra notebook-level hooks: on_notebook_complete
and on_notebook_error
, that map on to the two cell execution hooks but at a notebook level?
That way you would have the same basic pattern for both the notebook, and for each cell in the notebook:
- hook for just before execution happens (
on_notebook_start
,on_cell_start
) - hook for when execution completes (
on_notebook_complete
,on_cell_complete
) - hook for when execution has an error (
on_notebook_error
,on_cell_error
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added on_notebook_error
, although I'm not sure if I did that part correctly. The only time I've personally seen the notebook fail is when there's a RuntimeError, so that's where I've placed the hook.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair enough - and feel free to push back if you think that this is "too many hooks all at once" 😅 on the one hand I like symmetry and intentional design for extension points. On the other hand, I also believe in not building things until somebody actively wants it to be built :-) so if you think this is too much complexity, another option could be to just add comments for where you think the extra hooks could go, and wait to build them until a user explicitly asks for it.
This will enable tracking of execution process without subclassing the way papermill does.
1de1dbd
to
3892c04
Compare
nbclient/tests/test_client.py
Outdated
] | ||
loop = asyncio.get_event_loop() | ||
loop.run_until_complete(asyncio.gather(*tasks)) | ||
[async_run_notebook(input_file.format(label=label), opts, res) for label in ("A", "B")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You removed the execution of these tasks, can you explain why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this was a mistake on my part when I was trying to investigate some failures. I've added those changes back in.
nbclient/tests/test_client.py
Outdated
tasks = [async_run_notebook(input_file, opts, res) for i in range(4)] | ||
loop = asyncio.get_event_loop() | ||
loop.run_until_complete(asyncio.gather(*tasks)) | ||
[async_run_notebook(input_file, opts, res) for i in range(4)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
3892c04
to
20f4289
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @devintang3, that looks good to me. A couple of notes, but I think this PR can be merged anyway:
- maybe use in the tests:
hooks = [MagicMock() for i in range(7)]
. - have all the hook help strings start with
A callable which executes...
.
Run_hook is now async and renamed util to test_util so it gets picked up by pytest. Also added new hooks: on_notebook_error, on_cell_execution Updated docs
20f4289
to
51c3ece
Compare
That makes sense to me. I've updated the tests to do a loop and updated the help strings as well |
Thanks @devintang3 ! |
🚀 |
Continuation of #79
I rebased with master and added some tests. I occasionally get errors on
test_many_parallel_notebooks
which I don't think is related, but I'll also get it ontest_error_cell_hooks
in py38. It's an intermittent error that gets resolved if I add a sleep, so I think it's something related to the async task not running right away. Any guidance here would be appreciated.I also removed py36 from tox since it looked like the minimum version is now Python3.7 and it was just hanging when I tried running it.