diff --git a/.github/workflows/check-markdown-links.yml b/.github/workflows/check-markdown-links.yml new file mode 100644 index 00000000000..bdfdb7a5795 --- /dev/null +++ b/.github/workflows/check-markdown-links.yml @@ -0,0 +1,20 @@ +--- +name: Check Markdown Links +on: + pull_request: + types: [opened, synchronize, reopened] +jobs: + check-links: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + - name: Set up Python + uses: actions/setup-python@v4 + with: + python-version: 3.x + - name: Install dependencies + run: pip install PyGithub + - name: Run markdown link checker + run: ./scripts/check_and_comment.sh docs + env: + GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} diff --git a/docs/book/component-guide/artifact-stores/custom.md b/docs/book/component-guide/artifact-stores/custom.md index e1f7451b419..d9a8e854d22 100644 --- a/docs/book/component-guide/artifact-stores/custom.md +++ b/docs/book/component-guide/artifact-stores/custom.md @@ -156,7 +156,7 @@ zenml artifact-store flavor register flavors.my_flavor.MyArtifactStoreFlavor ``` {% hint style="warning" %} -ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](../../how-to/setting-up-a-project-repository/best-practices.md) of initializing zenml at the root of your repository. +ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](../../how-to/infrastructure-deployment/infrastructure-as-code/best-practices.md) of initializing zenml at the root of your repository. If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually, it's better to not have to rely on this mechanism and initialize zenml at the root. 
{% endhint %} diff --git a/docs/book/component-guide/container-registries/custom.md b/docs/book/component-guide/container-registries/custom.md index dbe47d97313..d1d5bc0c98f 100644 --- a/docs/book/component-guide/container-registries/custom.md +++ b/docs/book/component-guide/container-registries/custom.md @@ -98,7 +98,7 @@ zenml container-registry flavor register flavors.my_flavor.MyContainerRegistryFl ``` {% hint style="warning" %} -ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](../../how-to/setting-up-a-project-repository/best-practices.md) of initializing zenml at the root of your repository. +ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](../../how-to/infrastructure-deployment/infrastructure-as-code/best-practices.md) of initializing zenml at the root of your repository. If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually it's better to not have to rely on this mechanism, and initialize zenml at the root. {% endhint %} diff --git a/docs/book/component-guide/data-validators/custom.md b/docs/book/component-guide/data-validators/custom.md index b2e524301c2..64e1b09adb5 100644 --- a/docs/book/component-guide/data-validators/custom.md +++ b/docs/book/component-guide/data-validators/custom.md @@ -40,7 +40,7 @@ zenml data-validator flavor register flavors.my_flavor.MyDataValidatorFlavor ``` {% hint style="warning" %} -ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. 
Therefore, please ensure you follow [the best practice](../../how-to/setting-up-a-project-repository/best-practices.md) of initializing zenml at the root of your repository. +ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](../../how-to/infrastructure-deployment/infrastructure-as-code/best-practices.md) of initializing zenml at the root of your repository. If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually it's better to not have to rely on this mechanism, and initialize zenml at the root. {% endhint %} diff --git a/docs/book/component-guide/experiment-trackers/custom.md b/docs/book/component-guide/experiment-trackers/custom.md index 964c49ccfb1..01c1ff1bd17 100644 --- a/docs/book/component-guide/experiment-trackers/custom.md +++ b/docs/book/component-guide/experiment-trackers/custom.md @@ -37,7 +37,7 @@ zenml experiment-tracker flavor register flavors.my_flavor.MyExperimentTrackerFl ``` {% hint style="warning" %} -ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](../../how-to/setting-up-a-project-repository/best-practices.md) of initializing zenml at the root of your repository. +ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](../../how-to/infrastructure-deployment/infrastructure-as-code/best-practices.md) of initializing zenml at the root of your repository. 
If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually, it's better to not have to rely on this mechanism and initialize zenml at the root. {% endhint %} diff --git a/docs/book/component-guide/experiment-trackers/experiment-trackers.md b/docs/book/component-guide/experiment-trackers/experiment-trackers.md index b7cae215ce8..3edaf1a9d8a 100644 --- a/docs/book/component-guide/experiment-trackers/experiment-trackers.md +++ b/docs/book/component-guide/experiment-trackers/experiment-trackers.md @@ -13,7 +13,7 @@ through Experiment Tracker stack components. This establishes a clear link betwe Related concepts: * the Experiment Tracker is an optional type of Stack Component that needs to be registered as part of your - ZenML [Stack](/docs/book/user-guide/production-guide/understand-stacks.md). + ZenML [Stack](../../user-guide/production-guide/understand-stacks.md). * ZenML already provides versioning and tracking for the pipeline artifacts by storing artifacts in the [Artifact Store](../artifact-stores/artifact-stores.md). diff --git a/docs/book/component-guide/image-builders/custom.md b/docs/book/component-guide/image-builders/custom.md index c6d5a93e7ee..4727271dd47 100644 --- a/docs/book/component-guide/image-builders/custom.md +++ b/docs/book/component-guide/image-builders/custom.md @@ -88,7 +88,7 @@ zenml image-builder flavor register flavors.my_flavor.MyImageBuilderFlavor ``` {% hint style="warning" %} -ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](../../how-to/setting-up-a-project-repository/best-practices.md) of initializing zenml at the root of your repository. +ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. 
Therefore, please ensure you follow [the best practice](../../how-to/infrastructure-deployment/infrastructure-as-code/best-practices.md) of initializing zenml at the root of your repository. If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually it's better to not have to rely on this mechanism, and initialize zenml at the root. {% endhint %} diff --git a/docs/book/component-guide/image-builders/image-builders.md b/docs/book/component-guide/image-builders/image-builders.md index 1a20f8b113c..10194836d90 100644 --- a/docs/book/component-guide/image-builders/image-builders.md +++ b/docs/book/component-guide/image-builders/image-builders.md @@ -38,7 +38,7 @@ zenml image-builder flavor list ### How to use it You don't need to directly interact with any image builder in your code. As long as the image builder that you want to -use is part of your active [ZenML stack](/docs/book/user-guide/production-guide/understand-stacks.md), it will be used +use is part of your active [ZenML stack](../../user-guide/production-guide/understand-stacks.md), it will be used automatically by any component that needs to build container images. diff --git a/docs/book/component-guide/model-deployers/custom.md b/docs/book/component-guide/model-deployers/custom.md index 37fda8682db..e5764bfcea0 100644 --- a/docs/book/component-guide/model-deployers/custom.md +++ b/docs/book/component-guide/model-deployers/custom.md @@ -143,7 +143,7 @@ zenml model-deployer flavor register flavors.my_flavor.MyModelDeployerFlavor ``` {% hint style="warning" %} -ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](../../how-to/setting-up-a-project-repository/best-practices.md) of initializing zenml at the root of your repository. 
+ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](../../how-to/infrastructure-deployment/infrastructure-as-code/best-practices.md) of initializing zenml at the root of your repository. If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually, it's better to not have to rely on this mechanism and initialize zenml at the root. {% endhint %} diff --git a/docs/book/component-guide/orchestrators/airflow.md b/docs/book/component-guide/orchestrators/airflow.md index 56362facd10..70d68b64f83 100644 --- a/docs/book/component-guide/orchestrators/airflow.md +++ b/docs/book/component-guide/orchestrators/airflow.md @@ -159,7 +159,7 @@ of your Airflow deployment. {% hint style="info" %} ZenML will build a Docker image called `/zenml:` which includes your code and use it to run your pipeline steps in Airflow. Check -out [this page](/docs/book/how-to/customize-docker-builds/README.md) if you want to learn +out [this page](../../how-to/customize-docker-builds/README.md) if you want to learn more about how ZenML builds these images and how you can customize them. {% endhint %} @@ -204,13 +204,13 @@ The username will always be `admin`. For additional configuration of the Airflow orchestrator, you can pass `AirflowOrchestratorSettings` when defining or running your pipeline. 
Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration\_code\_docs/integrations-airflow/#zenml.integrations.airflow.flavors.airflow\_orchestrator\_flavor.AirflowOrchestratorSettings) -for a full list of available attributes and [this docs page](/docs/book/how-to/pipeline-development/use-configuration-files/README.md) for +for a full list of available attributes and [this docs page](../../how-to/pipeline-development/use-configuration-files/README.md) for more information on how to specify settings. #### Enabling CUDA for GPU-backed hardware Note that if you wish to use this orchestrator to run steps on a GPU, you will need to -follow [the instructions on this page](/docs/book/how-to/pipeline-development/training-with-gpus/README.md) to ensure that it +follow [the instructions on this page](../../how-to/pipeline-development/training-with-gpus/README.md) to ensure that it works. It requires adding some extra settings customization and is essential to enable CUDA for the GPU to give its full acceleration. @@ -297,7 +297,7 @@ the [original module](https://github.com/zenml-io/zenml/blob/main/src/zenml/inte . For this reason, we suggest starting by copying the original and modifying it according to your needs. Check out our docs on how to apply settings to your -pipelines [here](/docs/book/how-to/pipeline-development/use-configuration-files/README.md). +pipelines [here](../../how-to/pipeline-development/use-configuration-files/README.md). For more information and a full list of configurable attributes of the Airflow orchestrator, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration_code_docs/integrations-airflow/#zenml.integrations.airflow.orchestrators.airflow_orchestrator.AirflowOrchestrator) . 
diff --git a/docs/book/component-guide/orchestrators/azureml.md b/docs/book/component-guide/orchestrators/azureml.md index 0cce7d75b0a..e39538c1f55 100644 --- a/docs/book/component-guide/orchestrators/azureml.md +++ b/docs/book/component-guide/orchestrators/azureml.md @@ -42,7 +42,7 @@ In order to use an AzureML orchestrator, you need to first deploy [ZenML to the cloud](../../getting-started/deploying-zenml/README.md). It would be recommended to deploy ZenML in the same region as you plan on using for AzureML, but it is not necessary to do so. You must ensure that -you are [connected to the remote ZenML server](../../how-to/connecting-to-zenml/connect-in-with-your-user-interactive.md) +you are [connected to the remote ZenML server](../../how-to/manage-zenml-server/connecting-to-zenml/connect-in-with-your-user-interactive.md) before using this stack component. ## How to use it diff --git a/docs/book/component-guide/orchestrators/custom.md b/docs/book/component-guide/orchestrators/custom.md index 539aecdd6bf..d99b9ccf6c9 100644 --- a/docs/book/component-guide/orchestrators/custom.md +++ b/docs/book/component-guide/orchestrators/custom.md @@ -100,7 +100,7 @@ zenml orchestrator flavor register flavors.my_flavor.MyOrchestratorFlavor ``` {% hint style="warning" %} -ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](../../how-to/setting-up-a-project-repository/best-practices.md) of initializing zenml at the root of your repository. +ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](../../how-to/infrastructure-deployment/infrastructure-as-code/best-practices.md) of initializing zenml at the root of your repository. 
If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually, it's better to not have to rely on this mechanism and initialize zenml at the root. {% endhint %} diff --git a/docs/book/component-guide/orchestrators/orchestrators.md b/docs/book/component-guide/orchestrators/orchestrators.md index d5e34cec84b..b217e86aff5 100644 --- a/docs/book/component-guide/orchestrators/orchestrators.md +++ b/docs/book/component-guide/orchestrators/orchestrators.md @@ -52,9 +52,9 @@ zenml orchestrator flavor list ### How to use it You don't need to directly interact with any ZenML orchestrator in your code. As long as the orchestrator that you want -to use is part of your active [ZenML stack](/docs/book/user-guide/production-guide/understand-stacks.md), using the +to use is part of your active [ZenML stack](../../user-guide/production-guide/understand-stacks.md), using the orchestrator is as simple as executing a Python file -that [runs a ZenML pipeline](/docs/book/user-guide/starter-guide/starter-guide.md): +that [runs a ZenML pipeline](../../user-guide/starter-guide/starter-project.md): ```shell python file_that_runs_a_zenml_pipeline.py diff --git a/docs/book/component-guide/step-operators/custom.md b/docs/book/component-guide/step-operators/custom.md index 7328d9314a5..280bc5c3200 100644 --- a/docs/book/component-guide/step-operators/custom.md +++ b/docs/book/component-guide/step-operators/custom.md @@ -97,7 +97,7 @@ zenml step-operator flavor register flavors.my_flavor.MyStepOperatorFlavor ``` {% hint style="warning" %} -ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](../../how-to/setting-up-a-project-repository/best-practices.md) of initializing zenml at the root of your repository. 
+ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](../../how-to/infrastructure-deployment/infrastructure-as-code/best-practices.md) of initializing zenml at the root of your repository. If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually, it's better to not have to rely on this mechanism and initialize zenml at the root. {% endhint %} diff --git a/docs/book/component-guide/step-operators/modal.md b/docs/book/component-guide/step-operators/modal.md index 4492152050d..d21b73a6151 100644 --- a/docs/book/component-guide/step-operators/modal.md +++ b/docs/book/component-guide/step-operators/modal.md @@ -63,7 +63,7 @@ ZenML will build a Docker image which includes your code and use it to run your #### Additional configuration You can specify the hardware requirements for each step using the -`ResourceSettings` class as described in our documentation on [resource settings](../../how-to/training-with-gpus/training-with-gpus.md): +`ResourceSettings` class as described in our documentation on [resource settings](../../how-to/pipeline-development/training-with-gpus/README.md): ```python from zenml.config import ResourceSettings diff --git a/docs/book/component-guide/step-operators/step-operators.md b/docs/book/component-guide/step-operators/step-operators.md index 146e91eb91b..581348f9499 100644 --- a/docs/book/component-guide/step-operators/step-operators.md +++ b/docs/book/component-guide/step-operators/step-operators.md @@ -48,7 +48,7 @@ zenml step-operator flavor list ### How to use it You don't need to directly interact with any ZenML step operator in your code. 
As long as the step operator that you -want to use is part of your active [ZenML stack](/docs/book/user-guide/production-guide/understand-stacks.md), you can simply +want to use is part of your active [ZenML stack](../../user-guide/production-guide/understand-stacks.md), you can simply specify it in the `@step` decorator of your step. ```python diff --git a/docs/book/getting-started/deploying-zenml/custom-secret-stores.md b/docs/book/getting-started/deploying-zenml/custom-secret-stores.md index dee059b8bd9..5fbd0344019 100644 --- a/docs/book/getting-started/deploying-zenml/custom-secret-stores.md +++ b/docs/book/getting-started/deploying-zenml/custom-secret-stores.md @@ -97,6 +97,6 @@ If you want to create your own custom secrets store implementation, you can foll 1. Create a class that inherits from the `zenml.zen_stores.secrets_stores.base_secrets_store.BaseSecretsStore` base class and implements the `abstractmethod`s shown in the interface above. Use `SecretsStoreType.CUSTOM` as the `TYPE` value for your secrets store class. 2. If you need to provide any configuration, create a class that inherits from the `SecretsStoreConfiguration` class and add your configuration parameters there. Use that as the `CONFIG_TYPE` value for your secrets store class. -3. To configure the ZenML server to use your custom secrets store, make sure your code is available in the container image that is used to run the ZenML server. Then, use environment variables or helm chart values to configure the ZenML server to use your custom secrets store, as covered in the [deployment guide](../README.md). +3. To configure the ZenML server to use your custom secrets store, make sure your code is available in the container image that is used to run the ZenML server. Then, use environment variables or helm chart values to configure the ZenML server to use your custom secrets store, as covered in the [deployment guide](./README.md).
diff --git a/docs/book/getting-started/deploying-zenml/deploy-with-custom-image.md b/docs/book/getting-started/deploying-zenml/deploy-with-custom-image.md index 945adde3f21..b6242cacb9e 100644 --- a/docs/book/getting-started/deploying-zenml/deploy-with-custom-image.md +++ b/docs/book/getting-started/deploying-zenml/deploy-with-custom-image.md @@ -6,7 +6,7 @@ description: Deploying ZenML with custom Docker images. In most cases, deploying ZenML with the default `zenmlhub/zenml-server` Docker image should work just fine. However, there are some scenarios when you might need to deploy ZenML with a custom Docker image: -* You have implemented a custom artifact store for which you want to enable [artifact visualizations](../../how-to/handle-data-artifacts/visualize-artifacts.md) or [step logs](../../how-to/setting-up-a-project-repository/best-practices.md#logging) in your dashboard. +* You have implemented a custom artifact store for which you want to enable [artifact visualizations](../../how-to/data-artifact-management/visualize-artifacts/README.md) or [step logs](../../how-to/setting-up-a-project-repository/best-practices.md#logging) in your dashboard. * You have forked the ZenML repository and want to deploy a ZenML server based on your own fork because you made changes to the server / database logic. {% hint style="warning" %} diff --git a/docs/book/getting-started/deploying-zenml/deploy-with-docker.md b/docs/book/getting-started/deploying-zenml/deploy-with-docker.md index 17e73e7b227..f3161bf529c 100644 --- a/docs/book/getting-started/deploying-zenml/deploy-with-docker.md +++ b/docs/book/getting-started/deploying-zenml/deploy-with-docker.md @@ -199,7 +199,7 @@ These configuration options are only relevant if you're using Hashicorp Vault as {% endtab %} {% tab title="Custom" %} -These configuration options are only relevant if you're using a custom secrets store backend implementation. 
For this to work, you must have [a custom implementation of the secrets store API](manage-the-deployed-services/custom-secret-stores.md) in the form of a class derived from `zenml.zen_stores.secrets_stores.base_secrets_store.BaseSecretsStore`. This class must be importable from within the ZenML server container, which means you most likely need to mount the directory containing the class into the container or build a custom container image that contains the class. +These configuration options are only relevant if you're using a custom secrets store backend implementation. For this to work, you must have [a custom implementation of the secrets store API](custom-secret-stores.md) in the form of a class derived from `zenml.zen_stores.secrets_stores.base_secrets_store.BaseSecretsStore`. This class must be importable from within the ZenML server container, which means you most likely need to mount the directory containing the class into the container or build a custom container image that contains the class. The following configuration option is required: diff --git a/docs/book/how-to/contribute-to-zenml/implement-a-custom-integration.md b/docs/book/how-to/contribute-to-zenml/implement-a-custom-integration.md index d23466ac7c7..dd0a9020c8a 100644 --- a/docs/book/how-to/contribute-to-zenml/implement-a-custom-integration.md +++ b/docs/book/how-to/contribute-to-zenml/implement-a-custom-integration.md @@ -6,13 +6,13 @@ description: Creating an external integration and contributing to ZenML ![ZenML integrates with a number of tools from the MLOps landscape](../../../.gitbook/assets/sam-side-by-side-full-text.png) -One of the main goals of ZenML is to find some semblance of order in the ever-growing MLOps landscape. 
ZenML already provides [numerous integrations](https://zenml.io/integrations) into many popular tools, and allows you to come up with ways to [implement your own stack component flavors](implement-a-custom-stack-component.md) in order to fill in any gaps that are remaining. +One of the main goals of ZenML is to find some semblance of order in the ever-growing MLOps landscape. ZenML already provides [numerous integrations](https://zenml.io/integrations) into many popular tools, and allows you to come up with ways to [implement your own stack component flavors](../infrastructure-deployment/stack-deployment/implement-a-custom-stack-component.md) in order to fill in any gaps that are remaining. _However, what if you want to make your extension of ZenML part of the main codebase, to share it with others?_ If you are such a person, e.g., a tooling provider in the ML/MLOps space, or just want to contribute a tooling integration to ZenML, this guide is intended for you. ### Step 1: Plan out your integration -In [the previous page](implement-a-custom-stack-component.md), we looked at the categories and abstractions that core ZenML defines. In order to create a new integration into ZenML, you would need to first find the categories that your integration belongs to. The list of categories can be found [here](../../component-guide/README.md) as well. +In [the previous page](../infrastructure-deployment/stack-deployment/implement-a-custom-stack-component.md), we looked at the categories and abstractions that core ZenML defines. In order to create a new integration into ZenML, you would need to first find the categories that your integration belongs to. The list of categories can be found [here](../../component-guide/README.md) as well. 
Note that one integration may belong to different categories: For example, the cloud integrations (AWS/GCP/Azure) contain [container registries](../../component-guide/container-registries/container-registries.md), [artifact stores](../../component-guide/artifact-stores/artifact-stores.md) etc. diff --git a/docs/book/how-to/control-logging/view-logs-on-the-dasbhoard.md b/docs/book/how-to/control-logging/view-logs-on-the-dasbhoard.md index 2b803a6d4f5..9698fffabda 100644 --- a/docs/book/how-to/control-logging/view-logs-on-the-dasbhoard.md +++ b/docs/book/how-to/control-logging/view-logs-on-the-dasbhoard.md @@ -19,7 +19,7 @@ These logs are stored within the respective artifact store of your stack. This m * In case of a local ZenML server (via `zenml login --local`), both local and remote artifact stores may be accessible, depending on configuration of the client. * In case of a deployed ZenML server, logs for runs on a [local artifact store](../../component-guide/artifact-stores/local.md) will not be accessible. Logs for runs using a [remote artifact store](../../user-guide/production-guide/remote-storage.md) **may be** accessible, if the artifact store has been configured -with a [service connector](../../infrastructure-deployment/auth-management/service-connectors-guide.md). Please read [this chapter](../../user-guide/production-guide/remote-storage.md) of +with a [service connector](../../how-to/infrastructure-deployment/auth-management/service-connectors-guide.md). Please read [this chapter](../../user-guide/production-guide/remote-storage.md) of the production guide to learn how to configure a remote artifact store with a service connector. 
If configured correctly, the logs are displayed in the dashboard as follows: diff --git a/docs/book/how-to/customize-docker-builds/define-where-an-image-is-built.md b/docs/book/how-to/customize-docker-builds/define-where-an-image-is-built.md index 552af1fc612..e712f69c27c 100644 --- a/docs/book/how-to/customize-docker-builds/define-where-an-image-is-built.md +++ b/docs/book/how-to/customize-docker-builds/define-where-an-image-is-built.md @@ -8,10 +8,10 @@ ZenML executes pipeline steps sequentially in the active Python environment when By default, execution environments are created locally in the client environment using the local Docker client. However, this requires Docker installation and permissions. ZenML offers [image builders](../../component-guide/image-builders/image-builders.md), a special [stack component](../../component-guide/README.md), allowing users to build and push Docker images in a different specialized _image builder environment_. -Note that even if you don't configure an image builder in your stack, ZenML still uses the [local image builder](../../../component-guide/image-builders/local.md) to retain consistency across all builds. In this case, the image builder environment is the same as the [client environment](../pipeline-development/configure-python-environments/README.md#client-environment-or-the-runner-environment). +Note that even if you don't configure an image builder in your stack, ZenML still uses the [local image builder](../../component-guide/image-builders/local.md) to retain consistency across all builds. In this case, the image builder environment is the same as the [client environment](../pipeline-development/configure-python-environments/README.md#client-environment-or-the-runner-environment). You don't need to directly interact with any image builder in your code. 
As long as the image builder that you want to -use is part of your active [ZenML stack](/docs/book/user-guide/production-guide/understand-stacks.md), it will be used +use is part of your active [ZenML stack](../../user-guide/production-guide/understand-stacks.md), it will be used automatically by any component that needs to build container images.
diff --git a/docs/book/how-to/data-artifact-management/complex-usecases/datasets.md b/docs/book/how-to/data-artifact-management/complex-usecases/datasets.md index 32fc8ef2b99..665f8e12de3 100644 --- a/docs/book/how-to/data-artifact-management/complex-usecases/datasets.md +++ b/docs/book/how-to/data-artifact-management/complex-usecases/datasets.md @@ -64,7 +64,7 @@ class BigQueryDataset(Dataset): ## Creating Custom Materializers -[Materializers](./handle-custom-data-types.md) in ZenML handle the serialization and deserialization of artifacts. Custom Materializers are essential for working with custom Dataset classes: +[Materializers](../../../how-to/data-artifact-management/handle-data-artifacts/handle-custom-data-types.md) in ZenML handle the serialization and deserialization of artifacts. Custom Materializers are essential for working with custom Dataset classes: ```python from typing import Type diff --git a/docs/book/how-to/data-artifact-management/complex-usecases/passing-artifacts-between-pipelines.md b/docs/book/how-to/data-artifact-management/complex-usecases/passing-artifacts-between-pipelines.md index a8b40e4f6c5..fbcd1418f6a 100644 --- a/docs/book/how-to/data-artifact-management/complex-usecases/passing-artifacts-between-pipelines.md +++ b/docs/book/how-to/data-artifact-management/complex-usecases/passing-artifacts-between-pipelines.md @@ -47,7 +47,7 @@ def training_pipeline(): ``` {% hint style="info" %} -Note that in the above example, the `train_data` and `test_data` artifacts are not [materialized](artifact-versioning.md) in memory in the `@pipeline` function, but rather the `train_data` and `test_data` objects are simply references to where this data is stored in the artifact store. Therefore, one cannot use any logic regarding the nature of this data itself during compilation time (i.e. in the `@pipeline` function). 
+Note that in the above example, the `train_data` and `test_data` artifacts are not [materialized](../../../how-to/data-artifact-management/handle-data-artifacts/artifact-versioning.md) in memory in the `@pipeline` function, but rather the `train_data` and `test_data` objects are simply references to where this data is stored in the artifact store. Therefore, one cannot use any logic regarding the nature of this data itself during compilation time (i.e. in the `@pipeline` function). {% endhint %} ## Pattern 2: Artifact exchange between pipelines through a `Model` diff --git a/docs/book/how-to/data-artifact-management/complex-usecases/unmaterialized-artifacts.md b/docs/book/how-to/data-artifact-management/complex-usecases/unmaterialized-artifacts.md index a8d6db9a9f3..c328d44f769 100644 --- a/docs/book/how-to/data-artifact-management/complex-usecases/unmaterialized-artifacts.md +++ b/docs/book/how-to/data-artifact-management/complex-usecases/unmaterialized-artifacts.md @@ -6,7 +6,7 @@ description: Skip materialization of artifacts. A ZenML pipeline is built in a data-centric way. The outputs and inputs of steps define how steps are connected and the order in which they are executed. Each step should be considered as its very own process that reads and writes its inputs and outputs from and to the [artifact store](../../../component-guide/artifact-stores/artifact-stores.md). This is where **materializers** come into play. -A materializer dictates how a given artifact can be written to and retrieved from the artifact store and also contains all serialization and deserialization logic. Whenever you pass artifacts as outputs from one pipeline step to other steps as inputs, the corresponding materializer for the respective data type defines how this artifact is first serialized and written to the artifact store, and then deserialized and read in the next step. Read more about this [here](handle-custom-data-types.md). 
+A materializer dictates how a given artifact can be written to and retrieved from the artifact store and also contains all serialization and deserialization logic. Whenever you pass artifacts as outputs from one pipeline step to other steps as inputs, the corresponding materializer for the respective data type defines how this artifact is first serialized and written to the artifact store, and then deserialized and read in the next step. Read more about this [here](../../../how-to/data-artifact-management/handle-data-artifacts/handle-custom-data-types.md). However, there are instances where you might **not** want to materialize an artifact in a step, but rather use a reference to it instead. This is where skipping materialization comes in. diff --git a/docs/book/how-to/handle-data-artifacts/visualize-artifacts.md b/docs/book/how-to/handle-data-artifacts/visualize-artifacts.md deleted file mode 100644 index 1c4f3ef5a71..00000000000 --- a/docs/book/how-to/handle-data-artifacts/visualize-artifacts.md +++ /dev/null @@ -1,153 +0,0 @@ ---- -description: Configuring ZenML to display data visualizations in the dashboard. ---- - -# Visualize artifacts - -ZenML automatically saves visualizations of many common data types and allows you to view these visualizations in the ZenML dashboard: - -![ZenML Artifact Visualizations](../../.gitbook/assets/artifact_visualization_dashboard.png) - -Alternatively, any of these visualizations can also be displayed in Jupyter notebooks using the `artifact.visualize()` method: - -![output.visualize() Output](../../.gitbook/assets/artifact_visualization_evidently.png) - -Currently, the following visualization types are supported: - -* **HTML:** Embedded HTML visualizations such as data validation reports, -* **Image:** Visualizations of image data such as Pillow images or certain numeric numpy arrays, -* **CSV:** Tables, such as the pandas DataFrame `.describe()` output, -* **Markdown:** Markdown strings or pages. 
- -## Giving the ZenML Server Access to Visualizations - -In order for the visualizations to show up on the dashboard, the following must be true: - -### Configuring a Service Connector - -Visualizations are usually stored alongside the artifact, in the [artifact store](../../component-guide/artifact-stores/artifact-stores.md). Therefore, if a user would like to see the visualization displayed on the ZenML dashboard, they must give access to the server to connect to the artifact store. - -The [service connector](../auth-management/) documentation goes deeper into the concept of service connectors and how they can be configured to give the server permission to access the artifact store. For a concrete example, see the [AWS S3](../../component-guide/artifact-stores/s3.md) artifact store documentation. - -{% hint style="info" %} -When using the default/local artifact store with a deployed ZenML, the server naturally does not have access to your local files. In this case, the visualizations are also not displayed on the dashboard. - -Please use a service connector enabled and remote artifact store alongside a deployed ZenML to view visualizations. -{% endhint %} - -### Configuring Artifact Stores - -If all visualizations of a certain pipeline run are not showing up in the dashboard, it might be that your ZenML server does not have the required dependencies or permissions to access that artifact store. See the [custom artifact store docs page](../../component-guide/artifact-stores/custom.md#enabling-artifact-visualizations-with-custom-artifact-stores) for more information. - -## Creating Custom Visualizations - -There are two ways how you can add custom visualizations to the dashboard: - -* If you are already handling HTML, Markdown, or CSV data in one of your steps, you can have them visualized in just a few lines of code by casting them to a [special class](visualize-artifacts.md#visualization-via-special-return-types) inside your step. 
-* If you want to automatically extract visualizations for all artifacts of a certain data type, you can define type-specific visualization logic by [building a custom materializer](visualize-artifacts.md#visualization-via-materializers). -* If you want to create any other custom visualizations, you can [create a custom return type class with corresponding materializer](visualize-artifacts.md#visualization-via-custom-return-type-and-materializer) and build and return this custom return type from one of your steps. - -### Visualization via Special Return Types - -If you already have HTML, Markdown, or CSV data available as a string inside your step, you can simply cast them to one of the following types and return them from your step: - -* `zenml.types.HTMLString` for strings in HTML format, e.g., `"
<h1>Header</h1>
Some text"`, -* `zenml.types.MarkdownString` for strings in Markdown format, e.g., `"# Header\nSome text"`, -* `zenml.types.CSVString` for strings in CSV format, e.g., `"a,b,c\n1,2,3"`. - -#### Example: - -```python -from zenml.types import CSVString - -@step -def my_step() -> CSVString: - some_csv = "a,b,c\n1,2,3" - return CSVString(some_csv) -``` - -This would create the following visualization in the dashboard: - -![CSV Visualization Example](../../.gitbook/assets/artifact\_visualization\_csv.png) - -### Visualization via Materializers - -If you want to automatically extract visualizations for all artifacts of a certain data type, you can do so by overriding the `save_visualizations()` method of the corresponding materializer. See the [materializer docs page](handle-custom-data-types.md#optional-how-to-visualize-the-artifact) for more information on how to create custom materializers that do this. - -### Visualization via Custom Return Type and Materializer - -By combining the ideas behind the above two visualization approaches, you can visualize virtually anything you want inside your ZenML dashboard in three simple steps: - -1. Create a **custom class** that will hold the data that you want to visualize. -2. [Build a custom **materializer**](handle-custom-data-types.md#custom-materializers) for this custom class with the visualization logic implemented in the `save_visualizations()` method. -3. Return your custom class from any of your ZenML steps. - -#### Example: Facets Data Skew Visualization - -As an example, have a look at the models, materializers, and steps of the [Facets Integration](https://sdkdocs.zenml.io/latest/integration\_code\_docs/integrations-facets), which can be used to visualize the data skew between multiple Pandas DataFrames: - -![Facets Visualization](../../.gitbook/assets/facets-visualization.png) - -**1. 
Custom Class** The [FacetsComparison](https://sdkdocs.zenml.io/0.42.0/integration\_code\_docs/integrations-facets/#zenml.integrations.facets.models.FacetsComparison) is the custom class that holds the data required for the visualization. - -```python -class FacetsComparison(BaseModel): - datasets: List[Dict[str, Union[str, pd.DataFrame]]] -``` - -**2. Materializer** The [FacetsMaterializer](https://sdkdocs.zenml.io/0.42.0/integration\_code\_docs/integrations-facets/#zenml.integrations.facets.materializers.facets\_materializer.FacetsMaterializer) is a custom materializer that only handles this custom class and contains the corresponding visualization logic. - -```python -class FacetsMaterializer(BaseMaterializer): - - ASSOCIATED_TYPES = (FacetsComparison,) - ASSOCIATED_ARTIFACT_TYPE = ArtifactType.DATA_ANALYSIS - - def save_visualizations( - self, data: FacetsComparison - ) -> Dict[str, VisualizationType]: - html = ... # Create a visualization for the custom type - visualization_path = os.path.join(self.uri, VISUALIZATION_FILENAME) - with fileio.open(visualization_path, "w") as f: - f.write(html) - return {visualization_path: VisualizationType.HTML} -``` - -**3. Step** There are three different steps in the `facets` integration that can be used to create `FacetsComparison`s for different sets of inputs. E.g., the `facets_visualization_step` below takes two DataFrames as inputs and builds a `FacetsComparison` object out of them: - -```python -@step -def facets_visualization_step( - reference: pd.DataFrame, comparison: pd.DataFrame -) -> FacetsComparison: # Return the custom type from your step - return FacetsComparison( - datasets=[ - {"name": "reference", "table": reference}, - {"name": "comparison", "table": comparison}, - ] - ) -``` - -{% hint style="info" %} -This is what happens now under the hood when you add the `facets_visualization_step` into your pipeline: - -1. The step creates and returns a `FacetsComparison`. -2. 
When the step finishes, ZenML will search for a materializer class that can handle this type, finds the `FacetsMaterializer`, and calls the `save_visualizations()` method which creates the visualization and saves it into your artifact store as an HTML file. -3. When you open your dashboard and click on the artifact inside the run DAG, the visualization HTML file is loaded from the artifact store and displayed. -{% endhint %} - -## Disabling Visualizations - -If you would like to disable artifact visualization altogether, you can set `enable_artifact_visualization` at either pipeline or step level: - -```python -@step(enable_artifact_visualization=False) -def my_step(): - ... - -@pipeline(enable_artifact_visualization=False) -def my_pipeline(): - ... -``` - -
diff --git a/docs/book/how-to/infrastructure-deployment/infrastructure-as-code/terraform-stack-management.md b/docs/book/how-to/infrastructure-deployment/infrastructure-as-code/terraform-stack-management.md index fb2f6f402fb..74c88300f3a 100644 --- a/docs/book/how-to/infrastructure-deployment/infrastructure-as-code/terraform-stack-management.md +++ b/docs/book/how-to/infrastructure-deployment/infrastructure-as-code/terraform-stack-management.md @@ -71,12 +71,12 @@ zenml service-account create ``` You can learn more about how to generate a `ZENML_API_KEY` via service accounts -[here](../../project-setup-and-management/connecting-to-zenml/connect-with-a-service-account.md). +[here](../../../how-to/manage-zenml-server/connecting-to-zenml/connect-with-a-service-account.md). ### Create the service connectors The key to successful registration is proper authentication between the components. -[Service connectors](../auth-management/README.md) are ZenML's way of managing this: +[Service connectors](../../../how-to/infrastructure-deployment/auth-management/README.md) are ZenML's way of managing this: ```hcl # First, create a service connector diff --git a/docs/book/how-to/infrastructure-deployment/stack-deployment/deploy-a-cloud-stack-with-terraform.md b/docs/book/how-to/infrastructure-deployment/stack-deployment/deploy-a-cloud-stack-with-terraform.md index c9734c349e4..a9ea42082e3 100644 --- a/docs/book/how-to/infrastructure-deployment/stack-deployment/deploy-a-cloud-stack-with-terraform.md +++ b/docs/book/how-to/infrastructure-deployment/stack-deployment/deploy-a-cloud-stack-with-terraform.md @@ -42,7 +42,7 @@ If you prefer to host your own, you can learn about self-hosting a ZenML server Once you are connected to your deployed ZenML server, you need to create a service account and an API key for it. You will use the API key to give the Terraform module programmatic access to your ZenML server. 
You can find more -about service accounts and API keys [here](../../project-setup-and-management/connecting-to-zenml/connect-with-a-service-account.md). +about service accounts and API keys [here](../../../how-to/manage-zenml-server/connecting-to-zenml/connect-with-a-service-account.md). but the process is as simple as running the following CLI command while connected to your ZenML server: diff --git a/docs/book/how-to/infrastructure-deployment/stack-deployment/implement-a-custom-stack-component.md b/docs/book/how-to/infrastructure-deployment/stack-deployment/implement-a-custom-stack-component.md index f70d20ecf98..0fb9cb584c2 100644 --- a/docs/book/how-to/infrastructure-deployment/stack-deployment/implement-a-custom-stack-component.md +++ b/docs/book/how-to/infrastructure-deployment/stack-deployment/implement-a-custom-stack-component.md @@ -145,7 +145,7 @@ class MyS3ArtifactStoreConfig(BaseArtifactStoreConfig): ``` {% hint style="info" %} -You can pass sensitive configuration values as [secrets](../../interact-with-secrets.md) by defining them as type `SecretField` in the configuration class. +You can pass sensitive configuration values as [secrets](../../../how-to/project-setup-and-management/interact-with-secrets.md) by defining them as type `SecretField` in the configuration class. 
{% endhint %} With the configuration defined, we can move on to the implementation class, which will use the S3 file system to implement the abstract methods of the `BaseArtifactStore`: diff --git a/docs/book/how-to/infrastructure-deployment/stack-deployment/reference-secrets-in-stack-configuration.md b/docs/book/how-to/infrastructure-deployment/stack-deployment/reference-secrets-in-stack-configuration.md index 9f61610e177..b17109c63e4 100644 --- a/docs/book/how-to/infrastructure-deployment/stack-deployment/reference-secrets-in-stack-configuration.md +++ b/docs/book/how-to/infrastructure-deployment/stack-deployment/reference-secrets-in-stack-configuration.md @@ -42,7 +42,7 @@ You can use the environment variable `ZENML_SECRET_VALIDATION_LEVEL` to disable ### Fetch secret values in a step -If you are using [centralized secrets management](../../interact-with-secrets.md), you can access secrets directly from within your steps through the ZenML `Client` API. This allows you to use your secrets for querying APIs from within your step without hard-coding your access keys: +If you are using [centralized secrets management](../../../how-to/project-setup-and-management/interact-with-secrets.md), you can access secrets directly from within your steps through the ZenML `Client` API. This allows you to use your secrets for querying APIs from within your step without hard-coding your access keys: ```python from zenml import step @@ -66,7 +66,7 @@ def secret_loader() -> None: ## See Also -- [Interact with secrets](../../interact-with-secrets.md): Learn how to create, +- [Interact with secrets](../../../how-to/project-setup-and-management/interact-with-secrets.md): Learn how to create, list, and delete secrets using the ZenML CLI and Python SDK.
diff --git a/docs/book/how-to/manage-the-zenml-server/migration-guide/README.md b/docs/book/how-to/manage-the-zenml-server/migration-guide/README.md deleted file mode 100644 index 0559eded96c..00000000000 --- a/docs/book/how-to/manage-the-zenml-server/migration-guide/README.md +++ /dev/null @@ -1,27 +0,0 @@ ---- -description: How to migrate your ZenML code to the newest version. ---- - -# ♻ Migration guide - -Migrations are necessary for ZenML releases that include breaking changes, which are currently all releases that increment the minor version of the release, e.g., `0.X` -> `0.Y`. Furthermore, all releases that increment the first non-zero digit of the version contain major breaking changes or paradigm shifts that are explained in separate migration guides below. - -## Release Type Examples - -* `0.40.2` to `0.40.3` contains _no breaking changes_ and requires no migration whatsoever, -* `0.40.3` to `0.41.0` contains _minor breaking changes_ that need to be taken into account when upgrading ZenML, -* `0.39.1` to `0.40.0` contains _major breaking changes_ that introduce major shifts in how ZenML code is written or used. - -## Major Migration Guides - -The following guides contain detailed instructions on how to migrate between ZenML versions that introduced major breaking changes or paradigm shifts. The migration guides are sequential, meaning if there is more than one migration guide between your current version and the latest release, follow each guide in order. - -* [Migration guide 0.13.2 → 0.20.0](migration-zero-twenty.md) -* [Migration guide 0.23.0 → 0.30.0](migration-zero-thirty.md) -* [Migration guide 0.39.1 → 0.41.0](migration-zero-forty.md) - -## Release Notes - -For releases with minor breaking changes, e.g., `0.40.3` to `0.41.0`, check out the official [ZenML Release Notes](https://github.com/zenml-io/zenml/releases) to see which breaking changes were introduced. - -
diff --git a/docs/book/how-to/manage-zenml-server/migration-guide/migration-zero-twenty.md b/docs/book/how-to/manage-zenml-server/migration-guide/migration-zero-twenty.md index e44d4a54a6c..9b0b1ad3987 100644 --- a/docs/book/how-to/manage-zenml-server/migration-guide/migration-zero-twenty.md +++ b/docs/book/how-to/manage-zenml-server/migration-guide/migration-zero-twenty.md @@ -28,7 +28,7 @@ ZenML can now run [as a server](../../../getting-started/core-concepts.md#zenml- The release introduces a series of commands to facilitate managing the lifecycle of the ZenML server and to access the pipeline and pipeline run information: -* `zenml connect / disconnect / down / up / logs / status` can be used to configure your client to connect to a ZenML server, to start a local ZenML Dashboard or to deploy a ZenML server to a cloud environment. For more information on how to use these commands, see [the ZenML deployment documentation](../../../../user-guide/production-guide/deploying-zenml.md). +* `zenml connect / disconnect / down / up / logs / status` can be used to configure your client to connect to a ZenML server, to start a local ZenML Dashboard or to deploy a ZenML server to a cloud environment. For more information on how to use these commands, see [the ZenML deployment documentation](../../../getting-started/deploying-zenml/README.md). * `zenml pipeline list / runs / delete` can be used to display information and about and manage your pipelines and pipeline runs. In ZenML 0.13.2 and earlier versions, information about pipelines and pipeline runs used to be stored in a separate stack component called the Metadata Store. Starting with 0.20.0, the role of the Metadata Store is now taken over by ZenML itself. This means that the Metadata Store is no longer a separate component in the ZenML architecture, but rather a part of the ZenML core, located wherever ZenML is deployed: locally on your machine or running remotely as a server. 
@@ -47,9 +47,9 @@ If you're already using ZenML, aside from the above limitation, this change will * if you're using the default `sqlite` Metadata Store flavor in your stacks, you don't need to do anything. ZenML will automatically switch to using its local database instead of your `sqlite` Metadata Stores when you update to 0.20.0 (also see how to [migrate your stacks](migration-zero-twenty.md#-how-to-migrate-your-profiles)). * if you're using the `kubeflow` Metadata Store flavor _only as a way to connect to the local Kubeflow Metadata Service_ (i.e. the one installed by the `kubeflow` Orchestrator in a local k3d Kubernetes cluster), you also don't need to do anything explicitly. When you [migrate your stacks](migration-zero-twenty.md#-how-to-migrate-your-profiles) to ZenML 0.20.0, ZenML will automatically switch to using its local database. -* if you're using the `kubeflow` Metadata Store flavor to connect to a remote Kubeflow Metadata Service such as those provided by a Kubeflow installation running in AWS, Google or Azure, there is currently no equivalent in ZenML 0.20.0. You'll need to [deploy a ZenML Server](../../user-guide/getting-started/deploying-zenml/deploying-zenml.md) instance close to where your Kubeflow service is running (e.g. in the same cloud region). -* if you're using the `mysql` Metadata Store flavor to connect to a remote MySQL database service (e.g. a managed AWS, GCP or Azure MySQL service), you'll have to [deploy a ZenML Server](../../user-guide/getting-started/deploying-zenml/deploying-zenml.md) instance connected to that same database. -* if you deployed a `kubernetes` Metadata Store flavor (i.e. a MySQL database service deployed in Kubernetes), you can [deploy a ZenML Server](../../user-guide/getting-started/deploying-zenml/deploying-zenml.md) in the same Kubernetes cluster and connect it to that same database. 
However, ZenML will no longer provide the `kubernetes` Metadata Store flavor and you'll have to manage the Kubernetes MySQL database service deployment yourself going forward. +* if you're using the `kubeflow` Metadata Store flavor to connect to a remote Kubeflow Metadata Service such as those provided by a Kubeflow installation running in AWS, Google or Azure, there is currently no equivalent in ZenML 0.20.0. You'll need to [deploy a ZenML Server](../../../getting-started/deploying-zenml/README.md) instance close to where your Kubeflow service is running (e.g. in the same cloud region). +* if you're using the `mysql` Metadata Store flavor to connect to a remote MySQL database service (e.g. a managed AWS, GCP or Azure MySQL service), you'll have to [deploy a ZenML Server](../../../getting-started/deploying-zenml/README.md) instance connected to that same database. +* if you deployed a `kubernetes` Metadata Store flavor (i.e. a MySQL database service deployed in Kubernetes), you can [deploy a ZenML Server](../../../getting-started/deploying-zenml/README.md) in the same Kubernetes cluster and connect it to that same database. However, ZenML will no longer provide the `kubernetes` Metadata Store flavor and you'll have to manage the Kubernetes MySQL database service deployment yourself going forward. {% hint style="info" %} The ZenML Server inherits the same limitations that the Metadata Store had prior to ZenML 0.20.0: @@ -69,7 +69,7 @@ The `zenml pipeline runs migrate` CLI command is only available under ZenML vers To migrate the pipeline run information already stored in an existing metadata store to the new ZenML paradigm, you can use the `zenml pipeline runs migrate` CLI command. 1. Before upgrading ZenML, make a backup of all metadata stores you want to migrate, then upgrade ZenML. -2. Decide the ZenML deployment model that you want to follow for your projects. 
See the [ZenML deployment documentation](../../user-guide/getting-started/deploying-zenml/deploying-zenml.md) for available deployment scenarios. If you decide on using a local or remote ZenML server to manage your pipelines, make sure that you first connect your client to it by running `zenml connect`. +2. Decide the ZenML deployment model that you want to follow for your projects. See the [ZenML deployment documentation](../../../getting-started/deploying-zenml/README.md) for available deployment scenarios. If you decide on using a local or remote ZenML server to manage your pipelines, make sure that you first connect your client to it by running `zenml connect`. 3. Use the `zenml pipeline runs migrate` CLI command to migrate your old pipeline runs: * If you want to migrate from a local SQLite metadata store, you only need to pass the path to the metadata store to the command, e.g.: @@ -126,7 +126,7 @@ The Dashboard will be available at `http://localhost:8237` by default: ![ZenML Dashboard Preview](../../user-guide/assets/migration/zenml-dashboard.png) -For more details on other possible deployment options, see the [ZenML deployment documentation](../../user-guide/getting-started/deploying-zenml/deploying-zenml.md), and/or follow the [starter guide](../../user-guide/starter-guide/pipelines/pipelines.md) to learn more. +For more details on other possible deployment options, see the [ZenML deployment documentation](../../../getting-started/deploying-zenml/README.md), and/or follow the [starter guide](../../../user-guide/starter-guide/README.md) to learn more. ## Removal of Profiles and the local YAML database @@ -143,7 +143,7 @@ Since the local YAML database is no longer used by ZenML 0.20.0, you will lose a If you're already using ZenML, you can migrate your existing Profiles to the new ZenML 0.20.0 paradigm by following these steps: 1. first, update ZenML to 0.20.0. This will automatically invalidate all your existing Profiles. -2. 
decide the ZenML deployment model that you want to follow for your projects. See the [ZenML deployment documentation](../../user-guide/getting-started/deploying-zenml/deploying-zenml.md) for available deployment scenarios. If you decide on using a local or remote ZenML server to manage your pipelines, make sure that you first connect your client to it by running `zenml connect`. +2. decide the ZenML deployment model that you want to follow for your projects. See the [ZenML deployment documentation](../../../getting-started/deploying-zenml/README.md) for available deployment scenarios. If you decide on using a local or remote ZenML server to manage your pipelines, make sure that you first connect your client to it by running `zenml connect`. 3. use the `zenml profile list` and `zenml profile migrate` CLI commands to import the Stacks and Stack Components from your Profiles into your new ZenML deployment. If you have multiple Profiles that you would like to migrate, you can either use a prefix for the names of your imported Stacks and Stack Components, or you can use a different ZenML Project for each Profile. {% hint style="warning" %} @@ -281,7 +281,7 @@ The `zenml profile migrate` CLI command also provides command line flags for cas Stack components can now be registered without having the required integrations installed. As part of this change, we split all existing stack component definitions into three classes: an implementation class that defines the logic of the stack component, a config class that defines the attributes and performs input validations, and a flavor class that links implementation and config classes together. See [**component flavor models #895**](https://github.com/zenml-io/zenml/pull/895) for more details. -If you are only using stack component flavors that are shipped with the zenml Python distribution, this change has no impact on the configuration of your existing stacks. 
However, if you are currently using custom stack component implementations, you will need to update them to the new format. See the [documentation on writing custom stack component flavors](../../how-to/infrastructure-deployment/stack-deployment/implement-a-custom-stack-component.md) for updated information on how to do this. +If you are only using stack component flavors that are shipped with the zenml Python distribution, this change has no impact on the configuration of your existing stacks. However, if you are currently using custom stack component implementations, you will need to update them to the new format. See the [documentation on writing custom stack component flavors](../../../how-to/infrastructure-deployment/stack-deployment/implement-a-custom-stack-component.md) for updated information on how to do this. ## Shared ZenML Stacks and Stack Components @@ -339,7 +339,7 @@ With ZenML 0.20.0, we introduce the `BaseSettings` class, a broad class that ser Pipelines and steps now allow all configurations on their decorators as well as the `.configure(...)` method. This includes configurations for stack components that are not infrastructure-related which was previously done using the `@enable_xxx` decorators). The same configurations can also be defined in a YAML file. -Read more about this paradigm in the [new docs section about settings](../../how-to/pipeline-development/use-configuration-files/what-can-be-configured.md). +Read more about this paradigm in the [new docs section about settings](../../../how-to/pipeline-development/use-configuration-files/what-can-be-configured.md). Here is a list of changes that are the most obvious in consequence of the above code. Please note that this list is not exhaustive, and if we have missed something let us know via [Slack](https://zenml.io/slack). @@ -389,7 +389,7 @@ def my_step() -> None: With this change, all stack components (e.g. 
Orchestrators and Step Operators) that accepted a `docker_parent_image` as part of its Stack Configuration should now pass it through the `DockerSettings` object. -Read more [here](../../user-guide/starter-guide/production-fundamentals/containerization.md). +Read more [here](../../../how-to/customize-docker-builds/docker-settings-on-a-pipeline.md). **`ResourceConfiguration` is now renamed to `ResourceSettings`** @@ -417,23 +417,23 @@ def my_step() -> None: ... ``` -Read more [here](../../user-guide/starter-guide/production-fundamentals/containerization.md). +Read more [here](../../../how-to/customize-docker-builds/README.md). **A new pipeline intermediate representation** All the aforementioned configurations as well as additional information required to run a ZenML pipelines are now combined into an intermediate representation called `PipelineDeployment`. Instead of the user-facing `BaseStep` and `BasePipeline` classes, all the ZenML orchestrators and step operators now use this intermediate representation to run pipelines and steps. -**How to migrate**: If you have written a [custom orchestrator](../../user-guide/component-gallery/orchestrators/custom.md) or [step operator](../../user-guide/component-gallery/step-operators/custom.md), then you should see the new base abstractions (seen in the links). You can adjust your stack component implementations accordingly. +**How to migrate**: If you have written a [custom orchestrator](../../../how-to/infrastructure-deployment/stack-deployment/implement-a-custom-stack-component.md) or [step operator](../../../how-to/infrastructure-deployment/stack-deployment/implement-a-custom-stack-component.md), then you should see the new base abstractions (seen in the links). You can adjust your stack component implementations accordingly. ### `PipelineSpec` now uniquely defines pipelines Once a pipeline has been executed, it is represented by a `PipelineSpec` that uniquely identifies it. 
Therefore, users are no longer able to edit a pipeline once it has been run once. There are now three options to get around this: -* Pipeline runs can be created without being associated with a pipeline explicitly: We call these `unlisted` runs. Read more about unlisted runs [here](../../user-guide/starter-guide/pipelines/pipelines.md#unlisted-runs). +* Pipeline runs can be created without being associated with a pipeline explicitly: We call these `unlisted` runs. Read more about unlisted runs [here](../../../how-to/pipeline-development/parameters-and-caching.md#unlisted-runs). * Pipelines can be deleted and created again. * Pipelines can be given unique names each time they are run to uniquely identify them. -**How to migrate**: No code changes, but rather keep in mind the behavior (e.g. in a notebook setting) when quickly [iterating over pipelines as experiments](../../user-guide/starter-guide/pipelines/parameters-and-caching.md). +**How to migrate**: No code changes, but rather keep in mind the behavior (e.g. in a notebook setting) when quickly [iterating over pipelines as experiments](../../../how-to/pipeline-development/build-pipelines/use-pipeline-step-parameters.md). ### New post-execution workflow @@ -447,7 +447,7 @@ from zenml.post_execution import get_pipelines, get_pipeline * New methods to directly get a run have been introduced: `get_run` and `get_unlisted_runs` method has been introduced to get unlisted runs. -Usage remains largely similar. Please read the [new docs for post-execution](../../user-guide/starter-guide/pipelines/fetching-pipelines.md) to inform yourself of what further has changed. +Usage remains largely similar. Please read the [new docs for post-execution](../../../how-to/pipeline-development/build-pipelines/fetching-pipelines.md) to inform yourself of what further has changed. 
**How to migrate**: Replace all post-execution workflows from the paradigm of `Repository.get_pipelines` or `Repository.get_pipeline_run` to the corresponding post\_execution methods. diff --git a/docs/book/how-to/pipeline-development/build-pipelines/fetching-pipelines.md b/docs/book/how-to/pipeline-development/build-pipelines/fetching-pipelines.md index f4fb261eba4..5a57783dcd7 100644 --- a/docs/book/how-to/pipeline-development/build-pipelines/fetching-pipelines.md +++ b/docs/book/how-to/pipeline-development/build-pipelines/fetching-pipelines.md @@ -260,7 +260,7 @@ output.visualize() ![output.visualize() Output](../../../.gitbook/assets/artifact\_visualization\_evidently.png) {% hint style="info" %} -If you're not in a Jupyter notebook, you can simply view the visualizations in the ZenML dashboard by running `zenml login --local` and clicking on the respective artifact in the pipeline run DAG instead. Check out the [artifact visualization page](../../handle-data-artifacts/visualize-artifacts.md) to learn more about how to build and view artifact visualizations in ZenML! +If you're not in a Jupyter notebook, you can simply view the visualizations in the ZenML dashboard by running `zenml login --local` and clicking on the respective artifact in the pipeline run DAG instead. Check out the [artifact visualization page](../../../how-to/data-artifact-management/visualize-artifacts/README.md) to learn more about how to build and view artifact visualizations in ZenML! 
{% endhint %} ## Fetching information during run execution diff --git a/docs/book/how-to/pipeline-development/build-pipelines/hyper-parameter-tuning.md b/docs/book/how-to/pipeline-development/build-pipelines/hyper-parameter-tuning.md index 49f8ae72a3e..89d1d699bb9 100644 --- a/docs/book/how-to/pipeline-development/build-pipelines/hyper-parameter-tuning.md +++ b/docs/book/how-to/pipeline-development/build-pipelines/hyper-parameter-tuning.md @@ -27,7 +27,7 @@ This is an implementation of a basic grid search (across a single dimension) tha See it in action with the E2E example -_To set up the local environment used below, follow the recommendations from the_ [_Project templates_](../../project-setup-and-management/setting-up-a-project-repository/using-project-templates.md)_._ +_To set up the local environment used below, follow the recommendations from the_ [_Project templates_](../../../how-to/project-setup-and-management/collaborate-with-team/project-templates/README.md)_._ In [`pipelines/training.py`](../../../../examples/e2e/pipelines/training.py), you will find a training pipeline with a `Hyperparameter tuning stage` section. It contains a `for` loop that runs the `hp_tuning_single_search` over the configured model search spaces, followed by the `hp_tuning_select_best_model` being executed after all search steps are completed. As a result, we are getting `best_model_config` to be used to train the best possible model later on. 
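The fan-out/fan-in pattern this hunk describes (a `for` loop of single-search steps followed by a select-best step) can be sketched in plain Python. This is a conceptual illustration only — the function names mirror `hp_tuning_single_search` and `hp_tuning_select_best_model`, but the learning-rate search space and the toy scoring metric are invented for the example, not taken from the E2E project:

```python
# Minimal sketch of a single-dimension grid search with a final
# "select best" stage, mirroring the fan-out/fan-in pattern above.

def single_search(learning_rate: float) -> dict:
    # Stand-in for `hp_tuning_single_search`: train/evaluate one
    # configuration and report a metric (here a toy score that
    # peaks at learning_rate == 0.01).
    score = 1.0 - abs(learning_rate - 0.01) * 10
    return {"learning_rate": learning_rate, "score": score}

def select_best_model(results: list) -> dict:
    # Stand-in for `hp_tuning_select_best_model`: pick the config
    # with the highest reported metric.
    return max(results, key=lambda r: r["score"])

# Fan out over the search space, then fan in to one best config.
search_space = [0.001, 0.01, 0.1]
results = [single_search(lr) for lr in search_space]
best_model_config = select_best_model(results)
print(best_model_config["learning_rate"])  # → 0.01
```

In the E2E example the same shape appears inside a ZenML pipeline, where each `single_search` call is a step invocation and `best_model_config` feeds the later training stage.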
@@ -88,7 +88,7 @@ def select_model_step(): See it in action with the E2E example -_To set up the local environment used below, follow the recommendations from the_ [_Project templates_](../../project-setup-and-management/setting-up-a-project-repository/using-project-templates.md)_._ +_To set up the local environment used below, follow the recommendations from the_ [_Project templates_](../../../how-to/project-setup-and-management/collaborate-with-team/project-templates/README.md)_._ In the `steps/hp_tuning` folder, you will find two step files, which can be used as a starting point for building your own hyperparameter search tailored specifically to your use case: diff --git a/docs/book/how-to/pipeline-development/build-pipelines/schedule-a-pipeline.md b/docs/book/how-to/pipeline-development/build-pipelines/schedule-a-pipeline.md index f922339393e..88e50151864 100644 --- a/docs/book/how-to/pipeline-development/build-pipelines/schedule-a-pipeline.md +++ b/docs/book/how-to/pipeline-development/build-pipelines/schedule-a-pipeline.md @@ -13,7 +13,7 @@ Schedules don't work for all orchestrators. 
Here is a list of all supported orch | [AirflowOrchestrator](../../../component-guide/orchestrators/airflow.md) | ✅ | | [AzureMLOrchestrator](../../../component-guide/orchestrators/azureml.md) | ✅ | | [DatabricksOrchestrator](../../../component-guide/orchestrators/databricks.md) | ✅ | -| [HyperAIOrchestrator](../../component-guide/orchestrators/hyperai.md) | ✅ | +| [HyperAIOrchestrator](../../../component-guide/orchestrators/hyperai.md) | ✅ | | [KubeflowOrchestrator](../../../component-guide/orchestrators/kubeflow.md) | ✅ | | [KubernetesOrchestrator](../../../component-guide/orchestrators/kubernetes.md) | ✅ | | [LocalOrchestrator](../../../component-guide/orchestrators/local.md) | ⛔️ | diff --git a/docs/book/how-to/pipeline-development/build-pipelines/use-failure-success-hooks.md b/docs/book/how-to/pipeline-development/build-pipelines/use-failure-success-hooks.md index 8e0be6e961c..875eba5ab75 100644 --- a/docs/book/how-to/pipeline-development/build-pipelines/use-failure-success-hooks.md +++ b/docs/book/how-to/pipeline-development/build-pipelines/use-failure-success-hooks.md @@ -68,7 +68,7 @@ Note, that **step-level** defined hooks take **precedence** over **pipeline-leve See it in action with the E2E example -_To set up the local environment used below, follow the recommendations from the_ [_Project templates_](../../project-setup-and-management/setting-up-a-project-repository/using-project-templates.md)_._ +_To set up the local environment used below, follow the recommendations from the_ [_Project templates_](../../../how-to/project-setup-and-management/collaborate-with-team/project-templates/README.md)_._ In [`steps/alerts/notify_on.py`](../../../../examples/e2e/steps/alerts/notify_on.py), you will find a step to notify the user about success and a function used to notify the user about step failure using the [Alerter](../../../component-guide/alerters/alerters.md) from the active stack. 
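The hook mechanism this hunk documents can be made concrete with a plain-Python decorator. This is a hedged sketch of *how* `on_success`/`on_failure` callbacks fire around a step, not ZenML's actual implementation — the `events` list and the `notify_*` hook names are invented for the illustration:

```python
# Conceptual sketch: a step decorator that invokes a success or
# failure hook, analogous to ZenML's step-level hooks.
from functools import wraps

events = []

def notify_on_success():
    events.append("success")

def notify_on_failure():
    events.append("failure")

def step(on_success=None, on_failure=None):
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                result = fn(*args, **kwargs)
            except Exception:
                if on_failure:
                    on_failure()  # fires inside the failing step's environment
                raise             # the failure still propagates
            if on_success:
                on_success()
            return result
        return wrapper
    return decorator

@step(on_success=notify_on_success, on_failure=notify_on_failure)
def my_step(some_parameter: int = 1) -> int:
    if some_parameter < 0:
        raise ValueError("negative parameter")
    return some_parameter

my_step(1)        # triggers the success hook
try:
    my_step(-1)   # triggers the failure hook, then re-raises
except ValueError:
    pass
print(events)  # → ['success', 'failure']
```

In ZenML the hooks would typically post to an [Alerter](../../../component-guide/alerters/alerters.md) rather than append to a list, and step-level hooks take precedence over pipeline-level ones as noted above.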
@@ -121,7 +121,7 @@ def my_step(some_parameter: int = 1) See it in action with the E2E example -_To set up the local environment used below, follow the recommendations from the_ [_Project templates_](../../project-setup-and-management/setting-up-a-project-repository/using-project-templates.md)_._ +_To set up the local environment used below, follow the recommendations from the_ [_Project templates_](../../../how-to/project-setup-and-management/collaborate-with-team/project-templates/README.md)_._ In [`steps/alerts/notify_on.py`](../../../../examples/e2e/steps/alerts/notify_on.py), you will find a step to notify the user about success and a function used to notify the user about step failure using the [Alerter](../../../component-guide/alerters/alerters.md) from the active stack. @@ -188,7 +188,7 @@ def my_step(...): See it in action with the E2E example -_To set up the local environment used below, follow the recommendations from the_ [_Project templates_](../../project-setup-and-management/setting-up-a-project-repository/using-project-templates.md)_._ +_To set up the local environment used below, follow the recommendations from the_ [_Project templates_](../../../how-to/project-setup-and-management/collaborate-with-team/project-templates/README.md)_._ In [`steps/alerts/notify_on.py`](../../../../examples/e2e/steps/alerts/notify_on.py), you will find a step to notify the user about success and a function used to notify the user about step failure using the [Alerter](../../../component-guide/alerters/alerters.md) from the active stack. 
diff --git a/docs/book/how-to/pipeline-development/configure-python-environments/README.md b/docs/book/how-to/pipeline-development/configure-python-environments/README.md index bb4c019b919..a5942052629 100644 --- a/docs/book/how-to/pipeline-development/configure-python-environments/README.md +++ b/docs/book/how-to/pipeline-development/configure-python-environments/README.md @@ -20,13 +20,13 @@ The client environment (sometimes known as the runner environment) is where the * A [ZenML Pro](https://zenml.io/pro) runner. * A `runner` image orchestrated by the ZenML server to start pipelines. -In all the environments, you should use your preferred package manager (e.g., `pip` or `poetry`) to manage dependencies. Ensure you install the ZenML package and any required [integrations](../../component-guide/README.md). +In all the environments, you should use your preferred package manager (e.g., `pip` or `poetry`) to manage dependencies. Ensure you install the ZenML package and any required [integrations](../../../component-guide/README.md). The client environment typically follows these key steps when starting a pipeline: 1. Compiling an intermediate pipeline representation via the `@pipeline` function. -2. Creating or triggering [pipeline and step build environments](../../component-guide/image-builders/image-builders.md) if running remotely. -3. Triggering a run in the [orchestrator](../../component-guide/orchestrators/orchestrators.md). +2. Creating or triggering [pipeline and step build environments](../../../component-guide/image-builders/image-builders.md) if running remotely. +3. Triggering a run in the [orchestrator](../../../component-guide/orchestrators/orchestrators.md). Please note that the `@pipeline` function in your code is **only ever called** in this environment. Therefore, any computational logic that is executed in the pipeline function needs to be relevant to this so-called _compile time_, rather than at _execution_ time, which happens later. 
@@ -40,7 +40,7 @@ See also [here](./configure-the-server-environment.md) for more on [configuring When running locally, there is no real concept of an `execution` environment as the client, server, and execution environment are all the same. However, when running a pipeline remotely, ZenML needs to transfer your code and environment over to the remote [orchestrator](../../../component-guide/orchestrators/orchestrators.md). In order to achieve this, ZenML builds Docker images known as `execution environments`. -ZenML handles the Docker image configuration, creation, and pushing, starting with a [base image](https://hub.docker.com/r/zenmldocker/zenml) containing ZenML and Python, then adding pipeline dependencies. To manage the Docker image configuration, follow the steps in the [containerize your pipeline](../../infrastructure-deployment/customize-docker-builds/README.md) guide, including specifying additional pip dependencies, using a custom parent image, and customizing the build process. +ZenML handles the Docker image configuration, creation, and pushing, starting with a [base image](https://hub.docker.com/r/zenmldocker/zenml) containing ZenML and Python, then adding pipeline dependencies. To manage the Docker image configuration, follow the steps in the [containerize your pipeline](../../../how-to/customize-docker-builds/README.md) guide, including specifying additional pip dependencies, using a custom parent image, and customizing the build process. 
## Image Builder Environment diff --git a/docs/book/how-to/pipeline-development/configure-python-environments/handling-dependencies.md b/docs/book/how-to/pipeline-development/configure-python-environments/handling-dependencies.md index 8b44961a48c..d81f2d55866 100644 --- a/docs/book/how-to/pipeline-development/configure-python-environments/handling-dependencies.md +++ b/docs/book/how-to/pipeline-development/configure-python-environments/handling-dependencies.md @@ -57,7 +57,7 @@ zenml integration export-requirements INTEGRATION_NAME You can then amend and tweak those requirements as you see fit. Note that if you are using a remote orchestrator, you would then have to place the updated versions for the dependencies in a `DockerSettings` object (described in detail -[here](../../infrastructure-deployment/customize-docker-builds/docker-settings-on-a-pipeline.md)) +[here](../../../how-to/customize-docker-builds/docker-settings-on-a-pipeline.md)) which will then make sure everything is working as you need.
diff --git a/docs/book/how-to/pipeline-development/develop-locally/keep-your-dashboard-server-clean.md b/docs/book/how-to/pipeline-development/develop-locally/keep-your-dashboard-server-clean.md index 1f1f09dff78..3c5660f8613 100644 --- a/docs/book/how-to/pipeline-development/develop-locally/keep-your-dashboard-server-clean.md +++ b/docs/book/how-to/pipeline-development/develop-locally/keep-your-dashboard-server-clean.md @@ -109,7 +109,7 @@ training_pipeline() ``` Note that pipeline names must be unique. For more information on this feature, -see the [documentation on naming pipeline runs](../../pipeline-development/build-pipelines/name-your-pipeline-and-runs.md). +see the [documentation on naming pipeline runs](../../../how-to/pipeline-development/build-pipelines/name-your-pipeline-runs.md). ## Models diff --git a/docs/book/how-to/pipeline-development/training-with-gpus/README.md b/docs/book/how-to/pipeline-development/training-with-gpus/README.md index 81ed950c3ef..ec2d0be87cf 100644 --- a/docs/book/how-to/pipeline-development/training-with-gpus/README.md +++ b/docs/book/how-to/pipeline-development/training-with-gpus/README.md @@ -41,7 +41,7 @@ def training_step(...) -> ...: Please refer to the source code and documentation of each orchestrator to find out which orchestrator supports specifying resources in what way. {% hint style="info" %} -If you're using an orchestrator which does not support this feature or its underlying infrastructure does not cover your requirements, you can also take a look at [step operators](../../component-guide/step-operators/step-operators.md) which allow you to execute individual steps of your pipeline in environments independent of your orchestrator. 
+If you're using an orchestrator which does not support this feature or its underlying infrastructure does not cover your requirements, you can also take a look at [step operators](../../../component-guide/step-operators/step-operators.md) which allow you to execute individual steps of your pipeline in environments independent of your orchestrator. {% endhint %} ### Ensure your container is CUDA-enabled @@ -56,7 +56,7 @@ All steps running on GPU-backed hardware will be executed within a containerized #### 1. **Specify a CUDA-enabled parent image in your `DockerSettings`** -For complete details, refer to the [containerization page](../../customize-docker-builds/README.md) that explains how to do this. As an example, if you want to use the latest CUDA-enabled official PyTorch image for your entire pipeline run, you can include the following code: +For complete details, refer to the [containerization page](../../../how-to/customize-docker-builds/README.md) that explains how to do this. As an example, if you want to use the latest CUDA-enabled official PyTorch image for your entire pipeline run, you can include the following code: ```python from zenml import pipeline diff --git a/docs/book/how-to/pipeline-development/use-configuration-files/runtime-configuration.md b/docs/book/how-to/pipeline-development/use-configuration-files/runtime-configuration.md index d32aab76652..1c9f0cd4bfe 100644 --- a/docs/book/how-to/pipeline-development/use-configuration-files/runtime-configuration.md +++ b/docs/book/how-to/pipeline-development/use-configuration-files/runtime-configuration.md @@ -11,7 +11,7 @@ Stack Component Config vs Settings in ZenML Part of the configuration of a pipeline are its `Settings`. These allow you to configure runtime configurations for stack components and pipelines. 
Concretely, they allow you to configure: * The [resources](../../advanced-topics/training-with-gpus/training-with-gpus.md#specify-resource-requirements-for-steps) required for a step -* Configuring the [containerization](../../infrastructure-deployment/customize-docker-builds/README.md) process of a pipeline (e.g. What requirements get installed in the Docker image) +* Configuring the [containerization](../../../how-to/customize-docker-builds/README.md) process of a pipeline (e.g., what requirements get installed in the Docker image) * Stack component-specific configuration, e.g., if you have an experiment tracker passing in the name of the experiment at runtime You will learn about all of the above in more detail later, but for now, let's try to understand that all of this configuration flows through one central concept called `BaseSettings`. (From here on, we use `settings` and `BaseSettings` as analogous in this guide). @@ -21,8 +21,8 @@ You will learn about all of the above in more detail later, but for now, let's t Settings are categorized into two types: * **General settings** that can be used on all ZenML pipelines. Examples of these are: - * [`DockerSettings`](../customize-docker-builds/README.md) to specify Docker settings. - * [`ResourceSettings`](../training-with-gpus/training-with-gpus.md) to specify resource settings. + * [`DockerSettings`](../../../how-to/customize-docker-builds/README.md) to specify Docker settings. + * [`ResourceSettings`](../../../how-to/pipeline-development/training-with-gpus/README.md) to specify resource settings. * **Stack-component-specific settings**: These can be used to supply runtime configurations to certain stack components (the key should be `` or `.`). Settings for components not in the active stack will be ignored.
Examples of these are: * [`SkypilotAWSOrchestratorSettings`](https://sdkdocs.zenml.io/latest/integration_code_docs/integrations-skypilot_aws/#zenml.integrations.skypilot_aws.flavors.skypilot_orchestrator_aws_vm_flavor.SkypilotAWSOrchestratorSettings) to specify Skypilot settings (works for `SkypilotGCPOrchestratorSettings` and `SkypilotAzureOrchestratorSettings` as well). * [`KubeflowOrchestratorSettings`](https://sdkdocs.zenml.io/latest/integration_code_docs/integrations-kubeflow/#zenml.integrations.kubeflow.flavors.kubeflow_orchestrator_flavor.KubeflowOrchestratorSettings) to specify Kubeflow settings. diff --git a/docs/book/how-to/pipeline-development/use-configuration-files/what-can-be-configured.md b/docs/book/how-to/pipeline-development/use-configuration-files/what-can-be-configured.md index 5ec7c57f782..deab253dcb6 100644 --- a/docs/book/how-to/pipeline-development/use-configuration-files/what-can-be-configured.md +++ b/docs/book/how-to/pipeline-development/use-configuration-files/what-can-be-configured.md @@ -121,7 +121,7 @@ enable_step_logs: True ### `build` ID -The UUID of the [`build`](../../infrastructure-deployment/customize-docker-builds/README.md) to use for this pipeline. If specified, Docker image building is skipped for remote orchestrators, and the Docker image specified in this build is used. +The UUID of the [`build`](../../../how-to/customize-docker-builds/README.md) to use for this pipeline. If specified, Docker image building is skipped for remote orchestrators, and the Docker image specified in this build is used. ```yaml build: @@ -207,7 +207,7 @@ settings: ``` {% hint style="info" %} -Find a complete list of all Docker Settings [here](https://sdkdocs.zenml.io/latest/core\_code\_docs/core-config/#zenml.config.docker\_settings.DockerSettings). To learn more about pipeline containerization consult our documentation on this [here](../../infrastructure-deployment/customize-docker-builds/README.md). 
+Find a complete list of all Docker Settings [here](https://sdkdocs.zenml.io/latest/core\_code\_docs/core-config/#zenml.config.docker\_settings.DockerSettings). To learn more about pipeline containerization consult our documentation on this [here](../../../how-to/customize-docker-builds/README.md). {% endhint %} ### Resource Settings diff --git a/docs/book/how-to/popular-integrations/aws-guide.md b/docs/book/how-to/popular-integrations/aws-guide.md index c6750d63756..3512f192b20 100644 --- a/docs/book/how-to/popular-integrations/aws-guide.md +++ b/docs/book/how-to/popular-integrations/aws-guide.md @@ -10,9 +10,9 @@ This page aims to quickly set up a minimal production stack on AWS. With just a Would you like to skip ahead and deploy a full AWS ZenML cloud stack already? Check out the -[in-browser stack deployment wizard](../../infrastructure-deployment/stack-deployment/deploy-a-cloud-stack.md), -the [stack registration wizard](../../infrastructure-deployment/stack-deployment/register-a-cloud-stack.md), -or [the ZenML AWS Terraform module](../../infrastructure-deployment/stack-deployment/deploy-a-cloud-stack-with-terraform.md) +[in-browser stack deployment wizard](../../how-to/infrastructure-deployment/stack-deployment/deploy-a-cloud-stack.md), +the [stack registration wizard](../../how-to/infrastructure-deployment/stack-deployment/register-a-cloud-stack.md), +or [the ZenML AWS Terraform module](../../how-to/infrastructure-deployment/stack-deployment/deploy-a-cloud-stack-with-terraform.md) for a shortcut on how to deploy & register this stack. {% endhint %} @@ -22,7 +22,7 @@ for a shortcut on how to deploy & register this stack. To follow this guide, you need: * An active AWS account with necessary permissions for AWS S3, SageMaker, ECR, and ECS. -* ZenML [installed](../../../getting-started/installation.md) +* ZenML [installed](../../getting-started/installation.md) * AWS CLI installed and configured with your AWS credentials. 
You can follow the instructions [here](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html). Once ready, navigate to the AWS console: @@ -110,7 +110,7 @@ Replace `` with the ARN of the IAM role you created in the previous st ### Artifact Store (S3) -An [artifact store](../../../user-guide/production-guide/remote-storage.md) is used for storing and versioning data flowing through your pipelines. +An [artifact store](../../user-guide/production-guide/remote-storage.md) is used for storing and versioning data flowing through your pipelines. 1. Before you run anything within the ZenML CLI, create an AWS S3 bucket. If you already have one, you can skip this step. (Note: the bucket name should be unique, so you might need to try a few times to find a unique name.) @@ -126,11 +126,11 @@ Once this is done, you can create the ZenML stack component as follows: zenml artifact-store register cloud_artifact_store -f s3 --path=s3://bucket-name --connector aws_connector ``` -More details [here](../../../component-guide/artifact-stores/s3.md). +More details [here](../../component-guide/artifact-stores/s3.md). ### Orchestrator (SageMaker Pipelines) -An [orchestrator](../../../user-guide/production-guide/cloud-orchestration.md) is the compute backend to run your pipelines. +An [orchestrator](../../user-guide/production-guide/cloud-orchestration.md) is the compute backend to run your pipelines. 1. Before you run anything within the ZenML CLI, head on over to AWS and create a SageMaker domain (Skip this if you already have one). The instructions for creating a domain can be found [in the AWS core documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/onboard-quick-start.html). 
@@ -154,11 +154,11 @@ zenml orchestrator register sagemaker-orchestrator --flavor=sagemaker --region=< **Note**: The SageMaker orchestrator utilizes the AWS configuration for operation and does not require direct connection via a service connector for authentication, as it relies on your AWS CLI configurations or environment variables. -More details [here](../../../component-guide/orchestrators/sagemaker.md). +More details [here](../../component-guide/orchestrators/sagemaker.md). ### Container Registry (ECR) -A [container registry](../../../component-guide/container-registries/container-registries.md) is used to store Docker images for your pipelines. +A [container registry](../../component-guide/container-registries/container-registries.md) is used to store Docker images for your pipelines. 1. You'll need to create a repository in ECR. If you already have one, you can skip this step. @@ -174,7 +174,7 @@ Once this is done, you can create the ZenML stack component as follows: zenml container-registry register ecr-registry --flavor=aws --uri=.dkr.ecr..amazonaws.com --connector aws-connector ``` -More details [here](../../../component-guide/container-registries/aws.md). +More details [here](../../component-guide/container-registries/aws.md). ## 4) Create stack @@ -226,7 +226,7 @@ python run.py

Sequence of events that happen when running a pipeline on a remote stack with a code repository

-Read more in the [production guide](../../../user-guide/production-guide/production-guide.md). +Read more in the [production guide](../../user-guide/production-guide/README.md). ## Cleanup diff --git a/docs/book/how-to/popular-integrations/azure-guide.md b/docs/book/how-to/popular-integrations/azure-guide.md index 8fdc64f82d8..dee02ba747b 100644 --- a/docs/book/how-to/popular-integrations/azure-guide.md +++ b/docs/book/how-to/popular-integrations/azure-guide.md @@ -12,9 +12,9 @@ with correct permissions and the relevant ZenML stack and components. Would you like to skip ahead and deploy a full Azure ZenML cloud stack already? Check out the -[in-browser stack deployment wizard](../../infrastructure-deployment/stack-deployment/deploy-a-cloud-stack.md), -the [stack registration wizard](../../infrastructure-deployment/stack-deployment/register-a-cloud-stack.md), -or [the ZenML Azure Terraform module](../../infrastructure-deployment/stack-deployment/deploy-a-cloud-stack-with-terraform.md) +[in-browser stack deployment wizard](../../how-to/infrastructure-deployment/stack-deployment/deploy-a-cloud-stack.md), +the [stack registration wizard](../../how-to/infrastructure-deployment/stack-deployment/register-a-cloud-stack.md), +or [the ZenML Azure Terraform module](../../how-to/infrastructure-deployment/stack-deployment/deploy-a-cloud-stack-with-terraform.md) for a shortcut on how to deploy & register this stack. {% endhint %} @@ -91,8 +91,7 @@ assign the role accordingly. ## 4. Create a service connector -Now you have everything set up, you can go ahead and create [a ZenML Azure -Service Connector](../../infrastructure-deployment/auth-management/azure-service-connector.md). +Now you have everything set up, you can go ahead and create [a ZenML Azure Service Connector](../../how-to/infrastructure-deployment/auth-management/azure-service-connector.md). 
```bash zenml service-connector register azure_connector --type azure \ diff --git a/docs/book/how-to/popular-integrations/gcp-guide.md b/docs/book/how-to/popular-integrations/gcp-guide.md index 2634d2dc368..463d52e1660 100644 --- a/docs/book/how-to/popular-integrations/gcp-guide.md +++ b/docs/book/how-to/popular-integrations/gcp-guide.md @@ -10,9 +10,9 @@ This page aims to quickly set up a minimal production stack on GCP. With just a Would you like to skip ahead and deploy a full GCP ZenML cloud stack already? Check out the -[in-browser stack deployment wizard](../../infrastructure-deployment/stack-deployment/deploy-a-cloud-stack.md), -the [stack registration wizard](../../infrastructure-deployment/stack-deployment/register-a-cloud-stack.md), -or [the ZenML GCP Terraform module](../../infrastructure-deployment/stack-deployment/deploy-a-cloud-stack-with-terraform.md) +[in-browser stack deployment wizard](../../how-to/infrastructure-deployment/stack-deployment/deploy-a-cloud-stack.md), +the [stack registration wizard](../../how-to/infrastructure-deployment/stack-deployment/register-a-cloud-stack.md), +or [the ZenML GCP Terraform module](../../how-to/infrastructure-deployment/stack-deployment/deploy-a-cloud-stack-with-terraform.md) for a shortcut on how to deploy & register this stack. {% endhint %} diff --git a/docs/book/how-to/popular-integrations/kubeflow.md b/docs/book/how-to/popular-integrations/kubeflow.md index 07dde1868c9..f1679caa781 100644 --- a/docs/book/how-to/popular-integrations/kubeflow.md +++ b/docs/book/how-to/popular-integrations/kubeflow.md @@ -22,7 +22,7 @@ To use the Kubeflow Orchestrator, you'll need: There are two ways to configure the orchestrator: -1. Using a [Service Connector](../../infrastructure-deployment/auth-management/service-connectors-guide.md) to connect to the remote cluster (recommended for cloud-managed clusters). No local `kubectl` context needed. +1. 
Using a [Service Connector](../../how-to/infrastructure-deployment/auth-management/service-connectors-guide.md) to connect to the remote cluster (recommended for cloud-managed clusters). No local `kubectl` context needed. ```bash zenml orchestrator register --flavor kubeflow diff --git a/docs/book/how-to/popular-integrations/kubernetes.md b/docs/book/how-to/popular-integrations/kubernetes.md index 4ca1b125429..737eddab752 100644 --- a/docs/book/how-to/popular-integrations/kubernetes.md +++ b/docs/book/how-to/popular-integrations/kubernetes.md @@ -29,7 +29,7 @@ The Kubernetes orchestrator requires a Kubernetes cluster in order to run. There There are two ways to configure the orchestrator: -1. Using a [Service Connector](../../infrastructure-deployment/auth-management/service-connectors-guide.md) to connect to the remote cluster. This is the recommended approach, especially for cloud-managed clusters. No local `kubectl` context is needed. +1. Using a [Service Connector](../../how-to/infrastructure-deployment/auth-management/service-connectors-guide.md) to connect to the remote cluster. This is the recommended approach, especially for cloud-managed clusters. No local `kubectl` context is needed. ```bash zenml orchestrator register --flavor kubernetes diff --git a/docs/book/how-to/project-setup-and-management/collaborate-with-team/access-management.md b/docs/book/how-to/project-setup-and-management/collaborate-with-team/access-management.md index 1c88dcb964a..c58899c9c55 100644 --- a/docs/book/how-to/project-setup-and-management/collaborate-with-team/access-management.md +++ b/docs/book/how-to/project-setup-and-management/collaborate-with-team/access-management.md @@ -59,7 +59,7 @@ Performing the upgrade itself is a task that typically falls on the MLOps Platfo - ensure that all data is backed up before performing the upgrade - no service disruption or downtime happens during the upgrade -and more. 
Read in detail about the best practices for upgrading your ZenML server in the [Best Practices for Upgrading ZenML Servers](../../advanced-topics/manage-zenml-server/best-practices-upgrading-zenml.md) guide. +and more. Read in detail about the best practices for upgrading your ZenML server in the [Best Practices for Upgrading ZenML Servers](../../../how-to/manage-zenml-server/best-practices-upgrading-zenml.md) guide. ## Who is responsible for migrating and maintaining pipelines? @@ -68,7 +68,7 @@ When you upgrade to a new version of ZenML, you might have to test if your code The pipeline code itself is typically owned by the Data Scientist, but the Platform Engineer is responsible for making sure that new changes can be tested in a safe environment without impacting existing workflows. This involves setting up a new server and doing a staged upgrade and other strategies. -The Data Scientist should also check out the release notes, and the migration guide where applicable when upgrading the code. Read more about the best practices for upgrading your ZenML server and your code in the [Best Practices for Upgrading ZenML Servers](../../advanced-topics/manage-zenml-server/best-practices-upgrading-zenml.md) guide. +The Data Scientist should also check out the release notes, and the migration guide where applicable when upgrading the code. Read more about the best practices for upgrading your ZenML server and your code in the [Best Practices for Upgrading ZenML Servers](../../../how-to/manage-zenml-server/best-practices-upgrading-zenml.md) guide. 
## Best Practices for Access Management diff --git a/docs/book/how-to/project-setup-and-management/collaborate-with-team/shared-components-for-teams.md b/docs/book/how-to/project-setup-and-management/collaborate-with-team/shared-components-for-teams.md index 8db521db120..89046e40bf0 100644 --- a/docs/book/how-to/project-setup-and-management/collaborate-with-team/shared-components-for-teams.md +++ b/docs/book/how-to/project-setup-and-management/collaborate-with-team/shared-components-for-teams.md @@ -104,7 +104,7 @@ way, for example: my-simple-package==0.1.0 ``` -For information on using private PyPI repositories to share your code, see our [documentation on how to use a private PyPI repository](../customize-docker-builds/how-to-use-a-private-pypi-repository.md). +For information on using private PyPI repositories to share your code, see our [documentation on how to use a private PyPI repository](../../../how-to/customize-docker-builds/how-to-use-a-private-pypi-repository.md). ## Best Practices diff --git a/docs/book/how-to/project-setup-and-management/collaborate-with-team/stacks-pipelines-models.md b/docs/book/how-to/project-setup-and-management/collaborate-with-team/stacks-pipelines-models.md index 40577a85aa7..9e444bb74ed 100644 --- a/docs/book/how-to/project-setup-and-management/collaborate-with-team/stacks-pipelines-models.md +++ b/docs/book/how-to/project-setup-and-management/collaborate-with-team/stacks-pipelines-models.md @@ -41,7 +41,7 @@ preparation, model training, and evaluation. It's a good practice to have a separate pipeline for different tasks like training and inference. This makes your pipelines more modular and easier to manage. Here's some of the benefits: -- Separation of pipelines by the nature of the task allows you to [run them independently as needed](../develop-locally/local-prod-pipeline-variants.md). For example, you might train a model in a training pipeline only once a week but run inference on new data every day. 
+- Separation of pipelines by the nature of the task allows you to [run them independently as needed](../../../how-to/pipeline-development/develop-locally/local-prod-pipeline-variants.md). For example, you might train a model in a training pipeline only once a week but run inference on new data every day.
 - It becomes easier to manage and update your code as your project grows more complex.
 - Different people can work on the code for the pipelines without interfering with each other.
 - It helps you organize your runs better.
@@ -69,7 +69,7 @@ Here's how the workflow would look like with ZenML:
 - They create three pipelines: one for feature engineering, one for training the model, and one for producing predictions.
 - They set up a [repository for their project](../../project-setup-and-management/setting-up-a-project-repository/README.md) and start building their pipelines collaboratively. Let's assume Bob builds the feature engineering and training pipeline and Alice builds the inference pipeline.
 - To test their pipelines locally, they both have a `default` stack with a local orchestrator and a local artifact store. This allows them to quickly iterate on their code without deploying any infrastructure or incurring any costs.
-- While building the inference pipeline, Alice needs to make sure that the preprocessing step in her pipeline is the same as the one used while training. It might even involve the use of libraries that are not publicily available and she follows the [Shared Libraries and Logic for Teams](./shared_components_for_teams.md) guide to help with this.
+- While building the inference pipeline, Alice needs to make sure that the preprocessing step in her pipeline is the same as the one used while training. It might even involve the use of libraries that are not publicly available, and she follows the [Shared Libraries and Logic for Teams](./shared-components-for-teams.md) guide to help with this.
- Bob's training pipeline produces a model artifact, which Alice's inference pipeline requires as input. It also produces other artifacts such as metrics and a model checkpoint that are logged as artifacts in the pipeline run. - To allow easy access to model and data artifacts, they [use a ZenML Model](../../model-management-metrics/model-control-plane/associate-a-pipeline-with-a-model.md) which ties the pipelines, models and artifacts together. Now Alice can just [reference the right model name and find the model artifact she needs.](../../model-management-metrics/model-control-plane/load-artifacts-from-model.md) - It is also critical that the right model version from the training pipeline is used in the inference pipeline. The [Model Control Plane](../../model-management-metrics/model-control-plane/README.md) helps Bob to keep track of the different versions and to easily compare them. Bob can then [promote the best performing model version to the `production` stage](../../model-management-metrics/model-control-plane/promote-a-model.md) which Alice's pipeline can then consume. diff --git a/docs/book/how-to/project-setup-and-management/setting-up-a-project-repository/README.md b/docs/book/how-to/project-setup-and-management/setting-up-a-project-repository/README.md index 7b9cb33a580..310a276e814 100644 --- a/docs/book/how-to/project-setup-and-management/setting-up-a-project-repository/README.md +++ b/docs/book/how-to/project-setup-and-management/setting-up-a-project-repository/README.md @@ -21,7 +21,7 @@ A clean and organized repository structure is essential for any ZenML project. T - Clear separation of concerns between different components - Consistent naming conventions -Learn more about setting up your repository in the [Set up repository guide](./best-practices.md). +Learn more about setting up your repository in the [Set up repository guide](./set-up-repository.md). 
### Version Control and Collaboration @@ -31,7 +31,7 @@ Integrating your ZenML project with version control systems like Git is crucial - Easy tracking of changes - Collaboration among team members -Discover how to connect your Git repository in the [Set up a repository guide](./best-practices.md). +Discover how to connect your Git repository in the [Set up a repository guide](./set-up-repository.md). ### Stacks, Pipelines, Models, and Artifacts @@ -42,7 +42,7 @@ Understanding the relationship between stacks, models, and pipelines is key to d - Pipelines: Encapsulate your ML workflows - Artifacts: Track your data and model outputs -Learn about organizing these components in the [Organizing Stacks, Pipelines, Models, and Artifacts guide](./stacks-pipelines-models.md). +Learn about organizing these components in the [Organizing Stacks, Pipelines, Models, and Artifacts guide](../collaborate-with-team/stacks-pipelines-models.md). ### Access Management and Roles @@ -53,7 +53,7 @@ Proper access management ensures that team members have the right permissions an - Establish processes for pipeline maintenance and server upgrades - Leverage [Teams in ZenML Pro](../../../getting-started/zenml-pro/teams.md) to assign roles and permissions to a group of users, to mimic your real-world team roles. -Explore access management strategies in the [Access Management and Roles guide](./access-management-and-roles.md). +Explore access management strategies in the [Access Management and Roles guide](../collaborate-with-team/access-management.md). ### Shared Components and Libraries @@ -63,7 +63,7 @@ Leverage shared components and libraries to promote code reuse and standardizati - Shared private wheels for internal distribution - Handling authentication for specific libraries -Find out more about sharing code in the [Shared Libraries and Logic for Teams guide](./shared_components_for_teams.md). 
+Find out more about sharing code in the [Shared Libraries and Logic for Teams guide](../collaborate-with-team/shared-components-for-teams.md). ### Project Templates @@ -72,7 +72,7 @@ Utilize project templates to kickstart your ZenML projects and ensure consistenc - Use pre-made templates for common use cases - Create custom templates tailored to your team's needs -Learn about using and creating project templates in the [Project Templates guide](./project-templates.md). +Learn about using and creating project templates in the [Project Templates guide](../collaborate-with-team/project-templates/README.md). ### Migration and Maintenance diff --git a/docs/book/how-to/project-setup-and-management/setting-up-a-project-repository/connect-your-git-repository.md b/docs/book/how-to/project-setup-and-management/setting-up-a-project-repository/connect-your-git-repository.md index d2e82e82a58..b80b6961f4f 100644 --- a/docs/book/how-to/project-setup-and-management/setting-up-a-project-repository/connect-your-git-repository.md +++ b/docs/book/how-to/project-setup-and-management/setting-up-a-project-repository/connect-your-git-repository.md @@ -10,9 +10,9 @@ A code repository in ZenML refers to a remote storage location for your code. So

A visual representation of how the code repository fits into the general ZenML architecture.

-Code repositories enable ZenML to keep track of the code version that you use for your pipeline runs. Additionally, running a pipeline that is tracked in a registered code repository can [speed up the Docker image building for containerized stack components](../../infrastructure-deployment/customize-docker-builds/use-code-repositories-to-speed-up-docker-build-times.md) by eliminating the need to rebuild Docker images each time you change one of your source code files. +Code repositories enable ZenML to keep track of the code version that you use for your pipeline runs. Additionally, running a pipeline that is tracked in a registered code repository can [speed up the Docker image building for containerized stack components](../../../how-to/customize-docker-builds/how-to-reuse-builds.md) by eliminating the need to rebuild Docker images each time you change one of your source code files. -Learn more about how code repositories benefit development [here](../../infrastructure-deployment/customize-docker-builds/use-code-repositories-to-speed-up-docker-build-times.md). +Learn more about how code repositories benefit development [here](../../../how-to/customize-docker-builds/how-to-reuse-builds.md). ## Registering a code repository diff --git a/docs/book/how-to/project-setup-and-management/setting-up-a-project-repository/set-up-repository.md b/docs/book/how-to/project-setup-and-management/setting-up-a-project-repository/set-up-repository.md index 5c082b24dd4..672451675f9 100644 --- a/docs/book/how-to/project-setup-and-management/setting-up-a-project-repository/set-up-repository.md +++ b/docs/book/how-to/project-setup-and-management/setting-up-a-project-repository/set-up-repository.md @@ -90,7 +90,7 @@ Containerized orchestrators and step operators load your complete project files #### Dockerfile (optional) -By default, ZenML uses the official [zenml Docker image](https://hub.docker.com/r/zenmldocker/zenml) as a base for all pipeline and step builds. 
You can use your own `Dockerfile` to overwrite this behavior. Learn more [here](../../infrastructure-deployment/customize-docker-builds/README.md). +By default, ZenML uses the official [zenml Docker image](https://hub.docker.com/r/zenmldocker/zenml) as a base for all pipeline and step builds. You can use your own `Dockerfile` to overwrite this behavior. Learn more [here](../../../how-to/customize-docker-builds/README.md). #### Notebooks diff --git a/docs/book/reference/api-reference.md b/docs/book/reference/api-reference.md index 99169e0a6db..a297dce29c0 100644 --- a/docs/book/reference/api-reference.md +++ b/docs/book/reference/api-reference.md @@ -22,7 +22,7 @@ If you are using the ZenML server API using the above pages, it is enough to be account in the same browser session. However, in order to do this programmatically, the following steps need to be followed: -1. Create a [service account](../how-to/connecting-to-zenml/connect-with-a-service-account.md): +1. Create a [service account](../how-to/manage-zenml-server/connecting-to-zenml/connect-with-a-service-account.md): ```shell zenml service-account create myserviceaccount diff --git a/docs/book/reference/how-do-i.md b/docs/book/reference/how-do-i.md index 4ac076dd435..c278b6c0d5d 100644 --- a/docs/book/reference/how-do-i.md +++ b/docs/book/reference/how-do-i.md @@ -45,11 +45,11 @@ Please read our [general information on how to compose steps + pipelines togethe * **templates**: using starter code with ZenML? -[Project templates](../how-to/setting-up-a-project-repository/using-project-templates.md) allow you to get going quickly with ZenML. We recommend the Starter template (`starter`) for most use cases which gives you a basic scaffold and structure around which you can write your own code. You can also build templates for others inside a Git repository and use them with ZenML's templates functionality. 
+[Project templates](../how-to/project-setup-and-management/collaborate-with-team/project-templates/README.md) allow you to get going quickly with ZenML. We recommend the Starter template (`starter`) for most use cases which gives you a basic scaffold and structure around which you can write your own code. You can also build templates for others inside a Git repository and use them with ZenML's templates functionality. * **upgrade** my ZenML client and/or server? -Upgrading your ZenML client package is as simple as running `pip install --upgrade zenml` in your terminal. For upgrading your ZenML server, please refer to [the dedicated documentation section](../getting-started/deploying-zenml/manage-the-deployed-services/upgrade-the-version-of-the-zenml-server.md) which covers most of the ways you might do this as well as common troubleshooting steps. +Upgrading your ZenML client package is as simple as running `pip install --upgrade zenml` in your terminal. For upgrading your ZenML server, please refer to [the dedicated documentation section](../how-to/manage-zenml-server/upgrade-zenml-server.md) which covers most of the ways you might do this as well as common troubleshooting steps. * use a \ stack component? diff --git a/docs/book/user-guide/llmops-guide/finetuning-llms/finetuning-with-accelerate.md b/docs/book/user-guide/llmops-guide/finetuning-llms/finetuning-with-accelerate.md index def093ac5ae..f927b37884b 100644 --- a/docs/book/user-guide/llmops-guide/finetuning-llms/finetuning-with-accelerate.md +++ b/docs/book/user-guide/llmops-guide/finetuning-llms/finetuning-with-accelerate.md @@ -37,7 +37,7 @@ steps: - **finetune**: We finetune the model on the Viggo dataset. - **evaluate_base**: We evaluate the base model (i.e. the model before finetuning) on the Viggo dataset. - **evaluate_finetuned**: We evaluate the finetuned model on the Viggo dataset. 
-- **promote**: We promote the best performing model to "staging" in the [Model Control Plane](../../../how-to/use-the-model-control-plane/README.md). +- **promote**: We promote the best performing model to "staging" in the [Model Control Plane](../../../how-to/model-management-metrics/model-control-plane/README.md). If you adapt the code to your own use case, the specific logic in each step might differ but the overall structure should remain the same. When you're diff --git a/docs/book/user-guide/llmops-guide/rag-with-zenml/storing-embeddings-in-a-vector-database.md b/docs/book/user-guide/llmops-guide/rag-with-zenml/storing-embeddings-in-a-vector-database.md index c3d9ab9fbcc..70148b1e0b9 100644 --- a/docs/book/user-guide/llmops-guide/rag-with-zenml/storing-embeddings-in-a-vector-database.md +++ b/docs/book/user-guide/llmops-guide/rag-with-zenml/storing-embeddings-in-a-vector-database.md @@ -116,7 +116,7 @@ Note that we're inserting the documents into the embeddings table as well as the Deciding when to update your embeddings is a separate discussion and depends on the specific use case. If your data is frequently changing, and the changes are significant, you might want to fully reset the embeddings with each update. In other cases, you might just want to add new documents and embeddings into the database because the changes are minor or infrequent. In the code above, we choose to only add new embeddings if they don't already exist in the database. {% hint style="info" %} -Depending on the size of your dataset and the number of embeddings you're storing, you might find that running this step on a CPU is too slow. In that case, you should ensure that this step runs on a GPU-enabled machine to speed up the process. You can do this with ZenML by using a step operator that runs on a GPU-enabled machine. See [the docs here](../../../component-guide/step-operators/README.md) for more on how to set this up. 
+Depending on the size of your dataset and the number of embeddings you're storing, you might find that running this step on a CPU is too slow. In that case, you should ensure that this step runs on a GPU-enabled machine to speed up the process. You can do this with ZenML by using a step operator that runs on a GPU-enabled machine. See [the docs here](../../../component-guide/step-operators/step-operators.md) for more on how to set this up. {% endhint %} We also generate an index for the embeddings using the `ivfflat` method with the `vector_cosine_ops` operator. This is a common method for indexing high-dimensional vectors in PostgreSQL and is well-suited for similarity search using cosine distance. The number of lists is calculated based on the number of records in the table, with a minimum of 10 lists and a maximum of the square root of the number of records. This is a good starting point for tuning the index parameters, but you might want to experiment with different values to see how they affect the performance of your RAG pipeline. diff --git a/docs/book/user-guide/production-guide/ci-cd.md b/docs/book/user-guide/production-guide/ci-cd.md index eee740d49a9..d92d6e88959 100644 --- a/docs/book/user-guide/production-guide/ci-cd.md +++ b/docs/book/user-guide/production-guide/ci-cd.md @@ -30,7 +30,7 @@ template: you can fork it and easily adapt it to your own MLOps stack, infrastru ### Configure an API Key in ZenML In order to facilitate machine-to-machine connection you need to create an API key within ZenML. Learn more about those -[here](../../how-to/connecting-to-zenml/connect-with-a-service-account.md). +[here](../../how-to/manage-zenml-server/connecting-to-zenml/connect-with-a-service-account.md). 
```bash zenml service-account create github_action_api_key diff --git a/docs/book/user-guide/production-guide/cloud-orchestration.md b/docs/book/user-guide/production-guide/cloud-orchestration.md index 107d5e9b625..d3aab11e46d 100644 --- a/docs/book/user-guide/production-guide/cloud-orchestration.md +++ b/docs/book/user-guide/production-guide/cloud-orchestration.md @@ -27,7 +27,7 @@ for a shortcut on how to deploy & register a cloud stack. The easiest cloud orchestrator to start with is the [Skypilot](https://skypilot.readthedocs.io/) orchestrator running on a public cloud. The advantage of Skypilot is that it simply provisions a VM to execute the pipeline on your cloud provider. -Coupled with Skypilot, we need a mechanism to package your code and ship it to the cloud for Skypilot to do its thing. ZenML uses [Docker](https://www.docker.com/) to achieve this. Every time you run a pipeline with a remote orchestrator, [ZenML builds an image](../../how-to/setting-up-a-project-repository/connect-your-git-repository.md) for the entire pipeline (and optionally each step of a pipeline depending on your [configuration](../../how-to/customize-docker-builds/README.md)). This image contains the code, requirements, and everything else needed to run the steps of the pipeline in any environment. ZenML then pushes this image to the container registry configured in your stack, and the orchestrator pulls the image when it's ready to execute a step. +Coupled with Skypilot, we need a mechanism to package your code and ship it to the cloud for Skypilot to do its thing. ZenML uses [Docker](https://www.docker.com/) to achieve this. Every time you run a pipeline with a remote orchestrator, [ZenML builds an image](../../how-to/project-setup-and-management/setting-up-a-project-repository/connect-your-git-repository.md) for the entire pipeline (and optionally each step of a pipeline depending on your [configuration](../../how-to/customize-docker-builds/README.md)). 
This image contains the code, requirements, and everything else needed to run the steps of the pipeline in any environment. ZenML then pushes this image to the container registry configured in your stack, and the orchestrator pulls the image when it's ready to execute a step. To summarize, here is the broad sequence of events that happen when you run a pipeline with such a cloud stack: diff --git a/docs/book/user-guide/production-guide/connect-code-repository.md b/docs/book/user-guide/production-guide/connect-code-repository.md index 5e272a6f437..8153141d234 100644 --- a/docs/book/user-guide/production-guide/connect-code-repository.md +++ b/docs/book/user-guide/production-guide/connect-code-repository.md @@ -25,7 +25,7 @@ By connecting a Git repository, you avoid redundant builds and make your MLOps p ## Creating a GitHub Repository -While ZenML supports [many different flavors of git repositories](../../how-to/setting-up-a-project-repository/connect-your-git-repository.md), this guide will focus on [GitHub](https://github.com). To create a repository on GitHub: +While ZenML supports [many different flavors of git repositories](../../how-to/project-setup-and-management/setting-up-a-project-repository/connect-your-git-repository.md), this guide will focus on [GitHub](https://github.com). To create a repository on GitHub: 1. Sign in to [GitHub](https://github.com/). 2. Click the "+" icon and select "New repository." @@ -101,6 +101,6 @@ python run.py --training-pipeline python run.py --training-pipeline ``` -You can read more about [the ZenML Git Integration here](../../how-to/setting-up-a-project-repository/connect-your-git-repository.md). +You can read more about [the ZenML Git Integration here](../../how-to/project-setup-and-management/setting-up-a-project-repository/connect-your-git-repository.md).
diff --git a/docs/book/user-guide/production-guide/deploying-zenml.md b/docs/book/user-guide/production-guide/deploying-zenml.md index 7414141c371..047ce5ed356 100644 --- a/docs/book/user-guide/production-guide/deploying-zenml.md +++ b/docs/book/user-guide/production-guide/deploying-zenml.md @@ -47,7 +47,7 @@ zenml login ``` {% hint style="info" %} -Having trouble connecting with a browser? There are other ways to connect. Read [here](../../how-to/connecting-to-zenml/README.md) for more details. +Having trouble connecting with a browser? There are other ways to connect. Read [here](../../how-to/manage-zenml-server/connecting-to-zenml/README.md) for more details. {% endhint %} This command will start a series of steps to validate the device from where you are connecting that will happen in your browser. After that, you're now locally connected to a remote ZenML. Nothing of your experience changes, except that all metadata that you produce will be tracked centrally in one place from now on. 
diff --git a/docs/book/user-guide/production-guide/end-to-end.md b/docs/book/user-guide/production-guide/end-to-end.md index 70e68eb5445..e1349a6c144 100644 --- a/docs/book/user-guide/production-guide/end-to-end.md +++ b/docs/book/user-guide/production-guide/end-to-end.md @@ -24,7 +24,7 @@ pip install "zenml[templates,server]" notebook zenml integration install sklearn -y ``` -We will then use [ZenML templates](../../how-to/setting-up-a-project-repository/using-project-templates.md) to help us get the code we need for the project: +We will then use [ZenML templates](../../how-to/project-setup-and-management/collaborate-with-team/project-templates/README.md) to help us get the code we need for the project: ```bash mkdir zenml_batch_e2e diff --git a/docs/book/user-guide/starter-guide/starter-project.md b/docs/book/user-guide/starter-guide/starter-project.md index f5a95fb9b37..0b29979f5fe 100644 --- a/docs/book/user-guide/starter-guide/starter-project.md +++ b/docs/book/user-guide/starter-guide/starter-project.md @@ -21,7 +21,7 @@ pip install "zenml[templates,server]" notebook zenml integration install sklearn -y ``` -We will then use [ZenML templates](../../how-to/setting-up-a-project-repository/using-project-templates.md) to help us get the code we need for the project: +We will then use [ZenML templates](../../how-to/project-setup-and-management/collaborate-with-team/project-templates/README.md) to help us get the code we need for the project: ```bash mkdir zenml_starter diff --git a/scripts/check_and_comment.py b/scripts/check_and_comment.py new file mode 100644 index 00000000000..b96a7ddeb7e --- /dev/null +++ b/scripts/check_and_comment.py @@ -0,0 +1,208 @@ +# Copyright (c) ZenML GmbH 2025. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at:
+#
+#       https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+# or implied. See the License for the specific language governing
+# permissions and limitations under the License.
+"""Checks for broken markdown links in a directory and comments on a PR if found."""
+
+import json
+import os
+import re
+import sys
+from pathlib import Path
+
+
+def format_path_for_display(path):
+    """Convert absolute path to relative path from repo root."""
+    try:
+        # Get the repo root (parent of scripts directory)
+        repo_root = Path(__file__).parent.parent
+        # First resolve the path to remove any ../ components
+        full_path = Path(path).resolve()
+        return str(full_path.relative_to(repo_root))
+    except ValueError:
+        # If path is not relative to repo root, return as is
+        return str(path)
+
+
+def find_markdown_files(directory):
+    """Recursively find all markdown files in a directory."""
+    return list(Path(directory).rglob("*.md"))
+
+
+def extract_relative_links(content):
+    """Extract all relative markdown links from content."""
+    # Match [text](path.md) or [text](../path.md) patterns
+    # Excluding URLs (http:// or https://)
+    pattern = r"\[([^\]]+)\]\((?!http[s]?://)(.[^\)]+\.md)\)"
+    matches = re.finditer(pattern, content)
+    return [(m.group(1), m.group(2)) for m in matches]
+
+
+def validate_link(source_file, target_path):
+    """Validate if a relative link is valid."""
+    try:
+        # Convert source file and target path to Path objects
+        source_dir = Path(source_file).parent
+        # Resolve the target path relative to the source file's directory
+        full_path = (source_dir / target_path).resolve()
+        return full_path.exists()
+    except Exception:
+        return False
+
+
+def check_markdown_links(directory):
+    """Check all markdown files in directory for broken relative links."""
+    broken_links = []
+    markdown_files = find_markdown_files(directory)
+
+    for file_path in markdown_files:
+        try:
+            with open(file_path, "r", encoding="utf-8") as f:
+                content = f.read()
+
+            relative_links = extract_relative_links(content)
+
+            for link_text, link_path in relative_links:
+                if not validate_link(file_path, link_path):
+                    broken_links.append(
+                        {
+                            "source_file": str(file_path),
+                            "link_text": link_text,
+                            "broken_path": link_path,
+                        }
+                    )
+        except Exception as e:
+            print(f"Error processing {file_path}: {str(e)}")
+
+    return broken_links
+
+
+def create_comment_body(broken_links):
+    if not broken_links:
+        return "✅ No broken markdown links found!"
+
+    # Calculate statistics
+    total_files = len({link["source_file"] for link in broken_links})
+    total_broken = len(broken_links)
+
+    body = [
+        "# 🔍 Broken Links Report",
+        "",
+        "### Summary",
+        f"- 📁 Files with broken links: **{total_files}**",
+        f"- 🔗 Total broken links: **{total_broken}**",
+        "",
+        "### Details",
+        "| File | Link Text | Broken Path |",
+        "|------|-----------|-------------|",
+    ]
+
+    # Add each broken link as a table row
+    for link in broken_links:
+        # Get parent folder and file name
+        path = Path(link["source_file"])
+        parent = path.parent.name
+        file_name = path.name
+        display_name = (
+            f"{parent}/{file_name}"  # Combine parent folder and filename
+        )
+
+        body.append(
+            f"| `{display_name}` | \"{link['link_text']}\" | `{link['broken_path']}` |"
+        )
+
+    body.append("")
+    body.append("<details><summary>📂 Full file paths</summary>")
+    body.append("")
+    for link in broken_links:
+        body.append(f"- `{link['source_file']}`")
+    body.append("")
+    body.append("</details>")
+
+    return "\n".join(body)
+
+
+def main():
+    # Get the directory to check from command line argument
+    if len(sys.argv) != 2:
+        print("Usage: python check_and_comment.py <directory>")
+        sys.exit(1)
+
+    directory = sys.argv[1]
+    if not os.path.isdir(directory):
+        print(f"Error: {directory} is not a valid directory")
+        sys.exit(1)
+
+    print(f"Checking markdown links in {directory}...")
+    broken_links = check_markdown_links(directory)
+
+    # If running in GitHub Actions, handle PR comment
+    if "GITHUB_TOKEN" in os.environ:
+        # Only import github when needed
+        from github import Github
+
+        token = os.environ.get("GITHUB_TOKEN")
+        if not token:
+            print("Error: GITHUB_TOKEN not set")
+            sys.exit(1)
+
+        with open(os.environ["GITHUB_EVENT_PATH"]) as f:
+            event = json.load(f)
+
+        repo_name = event["repository"]["full_name"]
+        pr_number = event["pull_request"]["number"]
+
+        g = Github(token)
+        repo = g.get_repo(repo_name)
+        pr = repo.get_pull(pr_number)
+
+        comment_body = create_comment_body(broken_links)
+
+        # Find existing comment by looking for our specific header
+        existing_comment = None
+        for comment in pr.get_issue_comments():
+            if (
+                "# 🔍 Broken Links Report" in comment.body
+                or "✅ No broken markdown links found!" in comment.body
+            ):
+                existing_comment = comment
+                break
+
+        # Update existing comment or create new one
+        if existing_comment:
+            existing_comment.edit(comment_body)
+            print("Updated existing broken links report comment")
+        else:
+            pr.create_issue_comment(comment_body)
+            print("Created new broken links report comment")
+
+        # In GitHub Actions, always exit with 0 after commenting
+        sys.exit(0)
+
+    # For local runs, print results and exit with appropriate code
+    if not broken_links:
+        print("✅ No broken links found!")
+        sys.exit(0)
+
+    print("\n🔍 Broken links found:")
+    for link in broken_links:
+        relative_path = format_path_for_display(link["source_file"])
+        print(f"\n📄 File: {relative_path}")
+        print(f"📝 Link text: \"{link['link_text']}\"")
+        print(f"❌ Broken path: {link['broken_path']}")
+
+    # Only exit with error code in local mode
+    sys.exit(1)
+
+
+if __name__ == "__main__":
+    main()
diff --git a/scripts/check_and_comment.sh b/scripts/check_and_comment.sh
new file mode 100755
index 00000000000..ca3b5123532
--- /dev/null
+++ b/scripts/check_and_comment.sh
@@ -0,0 +1,29 @@
+#!/usr/bin/env bash
+
+# Exit on error
+set -e
+
+# Get the directory containing this script
+SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
+
+# Default to docs directory if no argument provided
+CHECK_DIR="${1:-docs/book}"
+
+# Convert to absolute path if relative path provided
+if [[ ! "$CHECK_DIR" = /* ]]; then
+    CHECK_DIR="$SCRIPT_DIR/../$CHECK_DIR"
+fi
+
+# Ensure the directory exists
+if [ ! -d "$CHECK_DIR" ]; then
+    echo "Error: Directory '$CHECK_DIR' does not exist"
+    exit 1
+fi
+
+# Only install PyGithub if we're running in GitHub Actions
+if [ -n "$GITHUB_TOKEN" ]; then
+    pip install PyGithub
+fi
+
+# Run the Python script
+python "$SCRIPT_DIR/check_and_comment.py" "$CHECK_DIR"
\ No newline at end of file
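As a quick local sanity check on the link checker added in `scripts/check_and_comment.py`, its extraction and validation logic can be exercised in isolation. This is a minimal sketch, not part of the PR; the file names and directory layout below are hypothetical, but the regex is the one used by `extract_relative_links`:

```python
import re
import tempfile
from pathlib import Path

# The pattern used by the script: relative links ending in .md,
# with a negative lookahead that skips absolute http(s) URLs.
PATTERN = r"\[([^\]]+)\]\((?!http[s]?://)(.[^\)]+\.md)\)"

content = (
    "See the [setup guide](./set-up-repository.md) and the "
    "[ZenML docs](https://docs.zenml.io/index.md) for details."
)
links = [(m.group(1), m.group(2)) for m in re.finditer(PATTERN, content)]
print(links)  # only the relative link survives the lookahead

# Validate a link the way the script does: resolve it relative to the
# directory of the markdown file that contains it, then check existence.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "set-up-repository.md").write_text("# target")
    source = root / "README.md"
    source.write_text(content)

    for _text, target in links:
        exists = (source.parent / target).resolve().exists()
        print(target, "->", "ok" if exists else "broken")
```

Note that the pattern only covers inline-style markdown links; reference-style links (`[text][ref]`) and links inside HTML tags would pass through unchecked.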