fix typos and improve package #125

Merged: 1 commit, Jul 8, 2024
30 changes: 18 additions & 12 deletions package/alert/{{ cookiecutter.package_name }}/README.md
@@ -44,8 +44,8 @@ The package code is separated over 3 directories, each with their specific purpose
- `pkgs`: this directory contains utility functions that you can call from your Airflow dags
- `src`: this directory contains more complex source code that you want to use across projects.
This code is packaged in a docker container and can be triggered in Airflow by defining a custom task or by adding a failure callback to your DAG.
-- `dags`: this directory is actually not used for Conveyor packages but rather for Conveyor projects. Adding this in the same directory allows you to easily develop your package.
-Working with both a project and a package in the same directory is described in more detail [here](https://docs.conveyordata.com/how-to-guides/conveyor-packages/best-practices).
+- `dags`: this directory is actually *not* used for Conveyor packages but rather for Conveyor projects. Adding this in the same directory allows you to easily develop your package.
+Working with both a project and a package in the same directory is described in more detail [here](https://docs.conveyordata.com/how-to-guides/conveyor-packages/best-practices). You can remove the dags directory if you do not need it.

## Getting started
Start using this template as follows:
@@ -59,19 +59,23 @@ Start using this template as follows:
- name: <packagename>
  versions: [0.0.1]
```
-- build and deploy your project to an environment: `conveyor project build && conveyor project deploy --environment <some-environment>`.
+- build and deploy your project to an environment: `conveyor project build && conveyor project deploy --env <some-environment>`.
If you are developing Airflow tasks in your package, you can also use `conveyor run` to test them.
- trigger the `example-dag-alert-simple-callback` or the `example-dag-alert-complex-callback` dag in Airflow
- make changes to your package and run `conveyor package trial --version 0.0.1` to update the version in Airflow
- when you are happy with the current version, you can create a release of your package using: `conveyor package release --version 0.0.1`

## Concepts
-This template package is created to show how packages can be used to create common code that can be used
-by many projects within your organisation. To illustrate this we show one of the most common usecases, namely: adding alerts to dags.
+This template package is created to show how packages can be used to create reusable code across
+projects within your organisation. One of the most popular use cases is adding alerts to DAGs.

In a package there are two places to add common code: the `pkgs` directory and the `src` directory.
In the next sections, we provide guidance on when to use each.

### When to add functions to the `pkgs` directory
-Add functions to the pkgs directory when they are simple and can execute within an Airflow worker.
-Typical usecases are wrapper functions/operators that abstract away some custom logic within your organisation.
-If you need additional python packages to execute your logic, this approach will not work as you cannot customize the Airflow python environment.
+Add functions to the `pkgs` directory when they are simple and can execute within an Airflow worker.
+Typical use cases are wrapper functions/operators that abstract away some Airflow logic.
+If you need additional Python dependencies to execute your logic, this approach will not work, as you cannot customize the Airflow Python environment.
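As an illustration of such a wrapper, a `pkgs`-style failure callback might look like the sketch below. The function name, message format, and the `context` keys used are hypothetical simplifications of what Airflow actually passes to `on_failure_callback`; only the standard library is used, so it can run inside an Airflow worker.

```python
def simple_failure_alert(context):
    """Hypothetical `pkgs` helper: format an alert from an Airflow failure context.

    Assumes `context` carries `dag_id`, `task_id` and `execution_date`
    entries; the real Airflow callback context is richer than this.
    """
    message = (
        f"Task {context['task_id']} in DAG {context['dag_id']} failed "
        f"(run date: {context['execution_date']})."
    )
    # A real implementation would post `message` to Slack, email, etc.
    return message
```

Projects would then reference this function as the `on_failure_callback` in their `default_args`, as the DAG code further down in this diff does.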

#### Steps
- Create Python functions in your package that Airflow will execute
@@ -80,11 +84,13 @@

### Creating a Docker image with common functionality
For more advanced use cases, you might need to run your code in a custom container.
-Here you have full flexibility on which python packages that you want to use.
+Here you have full flexibility on which Python dependencies you want to use.
A typical use case here is adding code for loading data from your data warehouse.
Using this common logic, every project can create a task to copy the necessary data as a starting point of the pipeline.

#### Steps
- Write the necessary source code for your package
- Make sure you have a `Dockerfile` and package your `src` code in the Docker image
-- Write a python function in the `pkgs` directory. This will run a container to execute your source code using image: `packages.image()`.
+- Write a Python function in the `pkgs` directory. This will run a container to execute your package source code using the Docker image returned by `packages.image()`.
- trial/release your package
-- Refer to your package in project dag code using: `common_alert = packages.load("common.alert", trial=True)` and trigger one of your common functions
+- Refer to your package in project DAG code using: `common_alert = packages.load("common.alert", version=1.0.0, trial=True)` and trigger one of your common functions
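The load-and-call pattern from the last two steps might look roughly like the sketch below. The real `packages` helper ships with Conveyor; only the two calls shown in this diff (`packages.image()` and `packages.load(...)`) are mimicked here by a stand-in stub, so the image tag, the stub's behaviour, and the returned attribute are purely illustrative.

```python
class _PackagesStub:
    """Stand-in for Conveyor's `packages` helper (illustration only)."""

    def image(self):
        # Conveyor would resolve this to the package's real image reference.
        return "registry.example.com/common-alert:0.0.1"  # hypothetical tag

    def load(self, name, version=None, trial=False):
        # Conveyor would import the requested pkgs module; we return a dummy
        # namespace exposing one alert callback for demonstration.
        class _Alert:
            @staticmethod
            def simple_slack_alert(context):
                return f"alert for {context.get('dag_id', 'unknown')}"

        return _Alert


packages = _PackagesStub()

# Project DAG code would then do something like:
common_alert = packages.load("common.alert", version="1.0.0", trial=True)
on_failure = common_alert.simple_slack_alert  # pass as on_failure_callback
```

With `trial=True`, the project picks up the trial version pushed by `conveyor package trial`; dropping it (after `conveyor package release`) pins the released version instead.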
@@ -19,7 +19,7 @@
}

with DAG(
-"example-dag-alert-complex-callback",
+"{{ cookiecutter.package_name }}",
default_args={
**default_args,
"on_failure_callback": complex_alert.complex_failure_alert,
@@ -34,7 +34,7 @@
)

with DAG(
-"example-dag-alert-simple-callback",
+"{{ cookiecutter.package_name }}",
default_args={
**default_args,
"on_failure_callback": simple_alert.simple_slack_alert,