Temporal workflows #61

Merged
merged 13 commits into from
Apr 16, 2024
5 changes: 5 additions & 0 deletions .github/workflows/test_oonipipeline.yml
@@ -44,6 +44,11 @@ jobs:
          sudo apt-get update
          sudo apt-get install -y clickhouse-server clickhouse-client

      - name: Install temporal
        run: |
          curl -sSf https://temporal.download/cli.sh | sh
          echo "$HOME/.temporalio/bin" >> $GITHUB_PATH

      - name: Run all tests
        run: hatch run cov
        working-directory: ./oonipipeline/
1 change: 1 addition & 0 deletions .gitignore
@@ -9,3 +9,4 @@ coverage.xml
/output
/attic
/prof
/clickhouse-data
39 changes: 39 additions & 0 deletions oonipipeline/Readme.md
@@ -0,0 +1,39 @@
# OONI Pipeline v5

This is the fifth major iteration of the OONI Data Pipeline.

For historical context, these are the major revisions:
* `v0` - The "pipeline" simply wrote the raw JSON files into a public `www` directory. Used until ~2013.
* `v1` - OONI Pipeline based on custom CLI scripts using MongoDB as a backend. Used until ~2015.
* `v2` - OONI Pipeline based on [luigi](https://luigi.readthedocs.io/en/stable/). Used until ~2017.
* `v3` - OONI Pipeline based on [airflow](https://airflow.apache.org/). Used until ~2020.
* `v4` - OONI Pipeline based on custom scripts and systemd units (aka fastpath). Currently in use in production.
* `v5` - Next-generation OONI Pipeline, which this readme documents. Expected to be in production by Q4 2024.

## Setup

To run the pipeline, set up the following dependencies:
* [Temporal for python](https://learn.temporal.io/getting_started/python/dev_environment/)
* [Clickhouse](https://clickhouse.com/docs/en/install)
* [hatch](https://hatch.pypa.io/1.9/install/)


### Quick start

Start the Temporal dev server:
```
temporal server start-dev
```

Start the ClickHouse server:
```
mkdir -p clickhouse-data
clickhouse server
```
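
Before starting a workflow, it can help to confirm that both servers are actually accepting connections. The following is a minimal sketch, not part of the pipeline: `port_open`, `check_services`, and `DEV_SERVICES` are hypothetical names, and the port numbers are the usual defaults for the Temporal dev server and ClickHouse (only 8233 appears in this readme), so adjust them if your setup differs.

```python
import socket

# Assumed default ports; only 8233 is mentioned in this readme.
DEV_SERVICES = {
    "temporal-frontend": 7233,  # gRPC endpoint of `temporal server start-dev`
    "temporal-ui": 8233,        # Temporal Web UI
    "clickhouse-http": 8123,    # ClickHouse HTTP interface
    "clickhouse-native": 9000,  # ClickHouse native protocol
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def check_services(host: str = "localhost") -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in DEV_SERVICES.items()}


if __name__ == "__main__":
    for name, is_up in check_services().items():
        print(f"{name:20s} {'up' if is_up else 'DOWN'}")
```

An open port only shows that something is listening there; it does not prove the service is healthy or correctly configured.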

You can then start the desired workflow, for example to create signal observations for the US:
```
hatch run oonipipeline mkobs --probe-cc US --test-name signal --start-day 2024-01-01 --end-day 2024-01-02
```
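
When launching the same workflow for several countries or date ranges, the command line can be assembled programmatically. This is a sketch under stated assumptions: `mkobs_command` is a hypothetical helper, it uses only the four flags shown above, and the check that the start day is not after the end day is a guess at sensible input rather than documented CLI behavior.

```python
import shlex
from datetime import date


def mkobs_command(probe_cc: str, test_name: str, start_day: str, end_day: str) -> list[str]:
    """Build the argv for `hatch run oonipipeline mkobs`.

    Raises ValueError if a day is not a valid ISO date or the range is inverted.
    """
    # date.fromisoformat raises ValueError on malformed or impossible dates.
    start = date.fromisoformat(start_day)
    end = date.fromisoformat(end_day)
    if start > end:
        # Assumption: the CLI expects start-day to be no later than end-day.
        raise ValueError(f"start day {start} is after end day {end}")
    return [
        "hatch", "run", "oonipipeline", "mkobs",
        "--probe-cc", probe_cc,
        "--test-name", test_name,
        "--start-day", start.isoformat(),
        "--end-day", end.isoformat(),
    ]


if __name__ == "__main__":
    cmd = mkobs_command("US", "signal", "2024-01-01", "2024-01-02")
    # Prints the same command line as the quick start example.
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the directory containing `pyproject.toml`; building a list rather than a string avoids shell quoting problems.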

Monitor the running workflow in the Temporal Web UI at http://localhost:8233/
12 changes: 0 additions & 12 deletions oonipipeline/debug-temporal.sh

This file was deleted.

1 change: 1 addition & 0 deletions oonipipeline/pyproject.toml
@@ -63,6 +63,7 @@ path = ".venv/"
path = "src/oonipipeline/__about__.py"

[tool.hatch.envs.default.scripts]
oonipipeline = "python -m oonipipeline.main {args}"
test = "pytest {args:tests}"
test-cov = "pytest -s --full-trace --log-level=INFO --log-cli-level=INFO -v --setup-show --cov=./ --cov-report=xml --cov-report=html --cov-report=term {args:tests}"
cov-report = ["coverage report"]
Empty file.