Streamline debugging documentation (#3608)

* Add first draft
  Signed-off-by: lrcouto <[email protected]>
* Remove outdated kedro jupyter convert docs
  Signed-off-by: Ahdra Merali <[email protected]>
* Suggestion: Review edits
  Signed-off-by: Ahdra Merali <[email protected]>
* Update FAQs
  Signed-off-by: Ahdra Merali <[email protected]>
* Edit jupyter ipython debug section
  Signed-off-by: lrcouto <[email protected]>
* Change link to section that does not exist anymore
  Signed-off-by: L. R. Couto <[email protected]>
* Change link to section that does not exist anymore
  Signed-off-by: L. R. Couto <[email protected]>
* Change wording and formatting
  Signed-off-by: lrcouto <[email protected]>
* Lint
  Signed-off-by: lrcouto <[email protected]>
* Update docs/source/notebooks_and_ipython/kedro_and_notebooks.md
  Co-authored-by: Jo Stichbury <[email protected]>
  Signed-off-by: L. R. Couto <[email protected]>
* Update docs/source/notebooks_and_ipython/kedro_and_notebooks.md
  Co-authored-by: Ahdra Merali <[email protected]>
  Signed-off-by: L. R. Couto <[email protected]>
* Changes to the wording, remove unnecessary section
  Signed-off-by: lrcouto <[email protected]>
* Move docs on debugging with hooks to hooks section
  Signed-off-by: Ahdra Merali <[email protected]>
* Add links to main debugging page
  Signed-off-by: Ahdra Merali <[email protected]>
* Make notebook debugging an independent section
  Signed-off-by: Ahdra Merali <[email protected]>
* Update link in FAQs
  Signed-off-by: Ahdra Merali <[email protected]>
* Apply suggestions from code review - adjust wording
  Co-authored-by: Jo Stichbury <[email protected]>
  Signed-off-by: Ahdra Merali <[email protected]>
* Capitalise Hooks
  Signed-off-by: Ahdra Merali <[email protected]>
* Reorder links on debugging page
  Signed-off-by: Ahdra Merali <[email protected]>
* Use markdown admonitions
  Signed-off-by: Ahdra Merali <[email protected]>
* Add short explanations to debugging page
  Signed-off-by: Ahdra Merali <[email protected]>

---------

Signed-off-by: lrcouto <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>
Signed-off-by: L. R. Couto <[email protected]>
Signed-off-by: L. R. Couto <[email protected]>
Co-authored-by: lrcouto <[email protected]>
Co-authored-by: L. R. Couto <[email protected]>
Co-authored-by: Jo Stichbury <[email protected]>
kedro-org · Feb 12, 2024 · 80ad182
1 parent f54c6fb
commit 80ad182
Showing 4 changed files with 90 additions and 87 deletions.
diff --git a/docs/source/development/debugging.md b/docs/source/development/debugging.md
@@ -1,83 +1,12 @@
 # Debugging
 
-## Introduction
+:::note
 
-If you're running your Kedro pipeline from the CLI or you can't/don't want to run Kedro from within your IDE debugging framework, it can be hard to debug your Kedro pipeline or nodes. This is particularly frustrating because:
+Our debugging documentation has moved. Please see our existing guides:
 
-* If you have long running nodes or pipelines, inserting `print` statements and running them multiple times quickly becomes time-consuming.
-* Debugging nodes outside the `run` session isn't very helpful because getting access to the local scope within the `node` can be hard, especially if you're dealing with large data or memory datasets, where you need to chain a few nodes together or re-run your pipeline to produce the data for debugging purposes.
+:::
 
-This guide provides examples on [how to instantiate a post-mortem debugging session](https://docs.python.org/3/library/pdb.html#pdb.post_mortem) with [`pdb`](https://docs.python.org/3/library/pdb.html) using [Kedro Hooks](../hooks/introduction.md) when an uncaught error occurs during a pipeline run. [ipdb](https://pypi.org/project/ipdb/) could be integrated in the same manner.
-
-For guides on how to set up debugging with IDEs, please visit the [guide for debugging in VSCode](./set_up_vscode.md#debugging) and the [guide for debugging in PyCharm](./set_up_pycharm.md#debugging).
-
-## Debugging a node
-
-To start a debugging session when an uncaught error is raised within your `node`, implement the `on_node_error` [Hook specification](/api/kedro.framework.hooks):
-
-```python
-import pdb
-import sys
-import traceback
-
-from kedro.framework.hooks import hook_impl
-
-
-class PDBNodeDebugHook:
-    """A hook class for creating a post mortem debugging with the PDB debugger
-    whenever an error is triggered within a node. The local scope from when the
-    exception occured is available within this debugging session.
-    """
-
-    @hook_impl
-    def on_node_error(self):
-        _, _, traceback_object = sys.exc_info()
-
-        #  Print the traceback information for debugging ease
-        traceback.print_tb(traceback_object)
-
-        # Drop you into a post mortem debugging session
-        pdb.post_mortem(traceback_object)
-```
-
-You can then register this `PDBNodeDebugHook` in your project's `settings.py`:
-
-```python
-HOOKS = (PDBNodeDebugHook(),)
-```
-
-## Debugging a pipeline
-
-To start a debugging session when an uncaught error is raised within your `pipeline`, implement the `on_pipeline_error` [Hook specification](/api/kedro.framework.hooks):
-
-```python
-import pdb
-import sys
-import traceback
-
-from kedro.framework.hooks import hook_impl
-
-
-class PDBPipelineDebugHook:
-    """A hook class for creating a post mortem debugging with the PDB debugger
-    whenever an error is triggered within a pipeline. The local scope from when the
-    exception occured is available within this debugging session.
-    """
-
-    @hook_impl
-    def on_pipeline_error(self):
-        # We don't need the actual exception since it is within this stack frame
-        _, _, traceback_object = sys.exc_info()
-
-        #  Print the traceback information for debugging ease
-        traceback.print_tb(traceback_object)
-
-        # Drop you into a post mortem debugging session
-        pdb.post_mortem(traceback_object)
-```
-
-You can then register this `PDBPipelineDebugHook` in your project's `settings.py`:
-
-```python
-HOOKS = (PDBPipelineDebugHook(),)
-```
+* [Debugging a Kedro project within a notebook](../notebooks_and_ipython/kedro_and_notebooks.md#debugging-a-kedro-project-within-a-notebook) for information on how to launch an interactive debugger in your notebook.
+* [Debugging in VSCode](./set_up_vscode.md#debugging) for information on how to set up VSCode's built-in debugger.
+* [Debugging in PyCharm](./set_up_pycharm.md#debugging) for information on using PyCharm's debugging tool.
+* [Debugging in the CLI with Kedro Hooks](../hooks/common_use_cases.md#use-hooks-to-debug-your-pipeline) for information on how to automatically launch an interactive debugger in the CLI when an error occurs in your pipeline run.
diff --git a/docs/source/faq/faq.md b/docs/source/faq/faq.md
@@ -14,7 +14,7 @@ This is a growing set of technical FAQs. The [product FAQs on the Kedro website]
 
 ## Working with Jupyter
 
-* [How can I debug a Kedro project in a Jupyter notebook](../notebooks_and_ipython/kedro_and_notebooks.md#debugging-with-debug-and-pdb)?
+* [How can I debug a Kedro project in a Jupyter notebook](../notebooks_and_ipython/kedro_and_notebooks.md#debugging-a-kedro-project-within-a-notebook)?
 * [How do I connect a Kedro project kernel to other Jupyter clients like JupyterLab](../notebooks_and_ipython/kedro_and_notebooks.md#ipython-jupyterlab-and-other-jupyter-clients)?
 
 ## Kedro project development

diff --git a/docs/source/hooks/common_use_cases.md b/docs/source/hooks/common_use_cases.md
@@ -201,7 +201,7 @@ HOOKS = (AzureSecretsHook(),)
 Note: `DefaultAzureCredential()` is Azure's recommended approach to authorise access to data in your storage accounts. For more information, consult the [documentation about how to authenticate to Azure and authorize access to blob data](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-python).
 ```
 
-## Use a Hook to read `metadata` from `DataCatalog`
+## Use Hooks to read `metadata` from `DataCatalog`
 Use the `after_catalog_created` Hook to access `metadata` to extend Kedro.
 
 ```python
@@ -214,3 +214,77 @@ class MetadataHook:
         for dataset_name, dataset in catalog.datasets.__dict__.items():
             print(f"{dataset_name} metadata: \n  {str(dataset.metadata)}")
 ```
+
+## Use Hooks to debug your pipeline
+You can use [Kedro Hooks](../hooks/introduction.md) to launch a [post-mortem debugging session](https://docs.python.org/3/library/pdb.html#pdb.post_mortem) with [`pdb`](https://docs.python.org/3/library/pdb.html) when an error occurs during a pipeline run. [ipdb](https://pypi.org/project/ipdb/) can be integrated in the same manner.
+
+### Debugging a node
+
+To start a debugging session when an uncaught error is raised within your `node`, implement the `on_node_error` [Hook specification](/api/kedro.framework.hooks):
+
+```python
+import pdb
+import sys
+import traceback
+
+from kedro.framework.hooks import hook_impl
+
+
+class PDBNodeDebugHook:
+    """A Hook class that starts a post-mortem debugging session with pdb
+    whenever an error is raised within a node. The local scope from when the
+    exception occurred is available within this debugging session.
+    """
+
+    @hook_impl
+    def on_node_error(self):
+        _, _, traceback_object = sys.exc_info()
+
+        # Print the traceback information for debugging ease
+        traceback.print_tb(traceback_object)
+
+        # Drop into a post-mortem debugging session
+        pdb.post_mortem(traceback_object)
+```
+
+You can then register this `PDBNodeDebugHook` in your project's `settings.py`:
+
+```python
+HOOKS = (PDBNodeDebugHook(),)
+```
+
+### Debugging a pipeline
+
+To start a debugging session when an uncaught error is raised within your `pipeline`, implement the `on_pipeline_error` [Hook specification](/api/kedro.framework.hooks):
+
+```python
+import pdb
+import sys
+import traceback
+
+from kedro.framework.hooks import hook_impl
+
+
+class PDBPipelineDebugHook:
+    """A Hook class that starts a post-mortem debugging session with pdb
+    whenever an error is raised within a pipeline. The local scope from when the
+    exception occurred is available within this debugging session.
+    """
+
+    @hook_impl
+    def on_pipeline_error(self):
+        # We don't need the actual exception since it is within this stack frame
+        _, _, traceback_object = sys.exc_info()
+
+        # Print the traceback information for debugging ease
+        traceback.print_tb(traceback_object)
+
+        # Drop into a post-mortem debugging session
+        pdb.post_mortem(traceback_object)
+```
+
+You can then register this `PDBPipelineDebugHook` in your project's `settings.py`:
+
+```python
+HOOKS = (PDBPipelineDebugHook(),)
+```
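
The section added above mentions that ipdb could be integrated in the same manner as `pdb`, without showing how. A minimal sketch of one way to do that follows. The names `PostMortemDebugHook`, `load_post_mortem` and the `post_mortem` constructor argument are hypothetical and not part of this commit or of Kedro, and the no-op `hook_impl` fallback is an assumption added only so the sketch can be imported outside a Kedro environment.

```python
import pdb
import sys
import traceback

try:
    from kedro.framework.hooks import hook_impl
except ImportError:
    # Assumption: a no-op stand-in so this sketch also runs without Kedro installed.
    def hook_impl(func):
        return func


def load_post_mortem():
    """Return ipdb.post_mortem when ipdb is installed, otherwise pdb.post_mortem."""
    try:
        import ipdb
    except ImportError:
        return pdb.post_mortem
    return ipdb.post_mortem


class PostMortemDebugHook:
    """Hypothetical Hook that opens a post-mortem session when a node fails."""

    def __init__(self, post_mortem=None):
        # Allow the debugger entry point to be injected (useful for testing);
        # by default prefer ipdb and fall back to pdb.
        self._post_mortem = post_mortem or load_post_mortem()

    @hook_impl
    def on_node_error(self):
        _, _, traceback_object = sys.exc_info()
        if traceback_object is None:
            # No exception is currently being handled, so there is nothing to debug.
            return

        # Print the traceback, then hand it to the chosen debugger.
        traceback.print_tb(traceback_object)
        self._post_mortem(traceback_object)
```

Under these assumptions it would be registered in `settings.py` the same way as the Hooks in the diff, for example `HOOKS = (PostMortemDebugHook(),)`. The same body could back an `on_pipeline_error` implementation.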
diff --git a/docs/source/notebooks_and_ipython/kedro_and_notebooks.md b/docs/source/notebooks_and_ipython/kedro_and_notebooks.md
@@ -209,14 +209,8 @@ You don't need to restart the kernel for the `catalog`, `context`, `pipelines` a
 
 For more details, run `%reload_kedro?`.
 
-## Useful to know (for advanced users)
-Each Kedro project has its own Jupyter kernel so you can switch between Kedro projects from a single Jupyter instance by selecting the appropriate kernel.
-
-To ensure that a Jupyter kernel always points to the correct Python executable, if one already exists with the same name `kedro_<package_name>`, then it is replaced.
-
-You can use the `jupyter kernelspec` set of commands to manage your Jupyter kernels. For example, to remove a kernel, run `jupyter kernelspec remove <kernel_name>`.
 
-### Debugging with %debug and %pdb
+## Debugging a Kedro project within a notebook
 
  You can use the `%debug` [line magic](https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-debug) to launch an interactive debugger in your Jupyter notebook. Declare it before a single-line statement to step through the execution in debug mode. You can use the argument `--breakpoint` or `-b` to provide a breakpoint.
 The following sequence occurs when `%debug` runs immediately after an error occurs:
@@ -264,6 +258,12 @@ Some examples of the possible commands that can be used to interact with the ipd
 
 For more information, use the `help` command in the debugger, or take a look at the [ipdb repository](https://github.com/gotcha/ipdb) for guidance.
 
+## Useful to know (for advanced users)
+Each Kedro project has its own Jupyter kernel so you can switch between Kedro projects from a single Jupyter instance by selecting the appropriate kernel.
+
+To ensure that a Jupyter kernel always points to the correct Python executable, if one already exists with the same name `kedro_<package_name>`, then it is replaced.
+
+You can use the `jupyter kernelspec` set of commands to manage your Jupyter kernels. For example, to remove a kernel, run `jupyter kernelspec remove <kernel_name>`.
 
 ### Managed services