Reorganise the documentation structure for Kedro / Databricks integration (#2442)

* Remove visualization docs from deployment section

Signed-off-by: Jannic Holzer <[email protected]>

* Add new directory and new visualization docs

Signed-off-by: Jannic Holzer <[email protected]>

* Remove old deployment docs

Signed-off-by: Jannic Holzer <[email protected]>

* Add deployment docs to new directory

Signed-off-by: Jannic Holzer <[email protected]>

* Modify spelling of visualize to British English 'visualise'

Signed-off-by: Jannic Holzer <[email protected]>

* Add new documentation to index

Signed-off-by: Jannic Holzer <[email protected]>

* Fix lint

Signed-off-by: Jannic Holzer <[email protected]>

* Modify title of deployment guide

Signed-off-by: Jannic Holzer <[email protected]>

* Remove spurious max depth to test if docs build

Signed-off-by: Jannic Holzer <[email protected]>

* Refactor index.rst to try to avoid build failing

Signed-off-by: Jannic Holzer <[email protected]>

* Modify call to sphinx-build to test if RTD will work

Signed-off-by: Jannic Holzer <[email protected]>

* Revise index.rst

Signed-off-by: Jo Stichbury <[email protected]>

* Lint and resolve

Signed-off-by: Jo Stichbury <[email protected]>
Signed-off-by: Jannic Holzer <[email protected]>

* Change title to include mention of Notebooks

Signed-off-by: Jannic Holzer <[email protected]>

* Remove verbosity from viz on Databricks intro

Signed-off-by: Jannic Holzer <[email protected]>

* Revert command modification

Signed-off-by: Jannic Holzer <[email protected]>

* Rename databricks visualisation docs

Signed-off-by: Jannic Holzer <[email protected]>

* Add a line between copy and code snippets for rendering

Co-authored-by: Jo Stichbury <[email protected]>
Signed-off-by: Jannic Holzer <[email protected]>

* Remove gerund

Co-authored-by: Jo Stichbury <[email protected]>
Signed-off-by: Jannic Holzer <[email protected]>

* Remove spurious 'i.e.'

Co-authored-by: Jo Stichbury <[email protected]>
Signed-off-by: Jannic Holzer <[email protected]>

* Rename workflow_integration to integrations

Signed-off-by: Jannic Holzer <[email protected]>

* Rename index entry to 'Integrations'

Signed-off-by: Jannic Holzer <[email protected]>

* Convert databricks.rst to MyST format

Signed-off-by: Jannic Holzer <[email protected]>

* Rename databricks.rst to databricks.md

Signed-off-by: Jannic Holzer <[email protected]>

* Remove spurious conflict messages

Signed-off-by: Jannic Holzer <[email protected]>

---------

Signed-off-by: Jannic Holzer <[email protected]>
Signed-off-by: Jo Stichbury <[email protected]>
Co-authored-by: Jo Stichbury <[email protected]>
jmholzer and stichbury authored Mar 30, 2023
1 parent cdcd665 commit c2968ba
Showing 7 changed files with 52 additions and 14 deletions.
1 change: 0 additions & 1 deletion docs/source/deployment/deployment_guide.md
@@ -15,7 +15,6 @@ We also provide information to help you deploy to the following:
* to [Prefect](prefect.md)
* to [Kubeflow Workflows](kubeflow.md)
* to [AWS Batch](aws_batch.md)
* to [Databricks](databricks.md)
* to [Dask](dask.md)

<!--- There has to be some non-link text in the bullets above, if it's just links, there's a Sphinx bug that fails the build process-->
23 changes: 21 additions & 2 deletions docs/source/index.rst
@@ -152,6 +152,13 @@ Welcome to Kedro's documentation!

logging/logging

.. toctree::
:maxdepth: 2
:caption: Integrations

integrations/databricks.rst
integrations/pyspark.rst

.. toctree::
:maxdepth: 2
:caption: Development
@@ -174,12 +181,17 @@ Welcome to Kedro's documentation!
deployment/prefect
deployment/kubeflow
deployment/aws_batch
deployment/databricks
deployment/aws_sagemaker
deployment/aws_step_functions
deployment/airflow_astronomer
deployment/dask

.. toctree::
:maxdepth: 2
:caption: Databricks integration

databricks_integration/visualisation

.. toctree::
:maxdepth: 2
:caption: PySpark integration
@@ -188,9 +200,16 @@ Welcome to Kedro's documentation!

.. toctree::
:maxdepth: 2
:caption: Resources
:caption: FAQs

faq/faq
faq/architecture_overview
faq/kedro_principles

.. toctree::
:maxdepth: 2
:caption: Resources

resources/glossary


9 changes: 9 additions & 0 deletions docs/source/integrations/databricks.md
@@ -0,0 +1,9 @@
# Databricks integration

```{toctree}
:caption: Databricks
:maxdepth: 2
databricks_workspace.md
visualisation.md
```
12 changes: 12 additions & 0 deletions docs/source/integrations/databricks_visualisation.md
@@ -0,0 +1,12 @@
# How to run Kedro-Viz on Databricks

[Kedro-Viz](../visualisation/kedro-viz_visualisation.md) is a tool that allows you to visualise your Kedro pipeline. It is a standalone web application that runs in a web browser and can be run either on a local machine or on Databricks itself.

For Kedro-Viz to run with your Kedro project, you need to ensure that both packages are installed in the same scope (notebook-scoped vs. cluster library). This means that if you `%pip install kedro` from inside your notebook, you should also `%pip install kedro-viz` from inside your notebook.
If your cluster already comes with Kedro installed as a library, you should also add Kedro-Viz as a [cluster library](https://docs.microsoft.com/en-us/azure/databricks/libraries/cluster-libraries).
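
For example, if Kedro was installed from inside the notebook, a matching notebook-scoped install of Kedro-Viz might look like this minimal sketch (cell numbers are illustrative):

```ipython
In [1]: %pip install kedro
In [2]: %pip install kedro-viz
```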

Kedro-Viz can then be launched in a new browser tab with the `%run_viz` line magic:

```ipython
In [2]: %run_viz
```
12 changes: 1 addition & 11 deletions docs/source/deployment/databricks.md → docs/source/integrations/databricks_workspace.md
@@ -1,4 +1,4 @@
# Deployment to a Databricks cluster
# Develop a project with Databricks Workspace and Notebooks

This tutorial uses the [PySpark Iris Kedro Starter](https://github.com/kedro-org/kedro-starters/tree/main/pyspark-iris) to illustrate how to bootstrap a Kedro project using Spark and deploy it to a [Databricks cluster on AWS](https://databricks.com/aws).

@@ -252,16 +252,6 @@ You must explicitly upgrade your `pip` version as follows:

After this, you can reload Kedro by running the line magic command `%reload_kedro <project_root>`.
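
For instance, a reload might look like the following sketch, where `/dbfs/projects/my-kedro-project` is a purely illustrative project root rather than a path from this guide:

```ipython
In [1]: %reload_kedro /dbfs/projects/my-kedro-project
```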

### 10. Running Kedro-Viz on Databricks

For Kedro-Viz to run with your Kedro project, you need to ensure that both the packages are installed in the same scope (notebook-scoped vs. cluster library). i.e. if you `%pip install kedro` from inside your notebook then you should also `%pip install kedro-viz` from inside your notebook.
If your cluster comes with Kedro installed on it as a library already then you should also add Kedro-Viz as a [cluster library](https://docs.microsoft.com/en-us/azure/databricks/libraries/cluster-libraries).

Kedro-Viz can then be launched in a new browser tab with the `%run_viz` line magic:
```ipython
In [2]: %run_viz
```

## How to use datasets stored on Databricks DBFS

DBFS is a distributed file system mounted into a Databricks workspace and accessible on a Databricks cluster. It maps cloud object storage URIs to relative paths to simplify the process of persisting files. With DBFS, libraries can read from or write to distributed storage as if it were a local file.
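
As a sketch of that "local file" behaviour, data stored in DBFS can be read on the driver with an ordinary file path; the filepath below is hypothetical:

```ipython
In [1]: import pandas as pd

In [2]: # A DBFS-backed file is visible under the /dbfs/ mount and reads like a local file
   ...: df = pd.read_csv("/dbfs/FileStore/iris.csv")
```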
9 changes: 9 additions & 0 deletions docs/source/integrations/pyspark.rst
@@ -0,0 +1,9 @@

PySpark integration
=============================================

.. toctree::
:maxdepth: 2
:caption: PySpark

pyspark_integration.md
File renamed without changes.
