feat: add rq worker #1011

Merged
merged 31 commits into from
Dec 9, 2022
31 commits
5513f7a
feat: only accept predictions from one product during insight import
raphael0202 Dec 1, 2022
727d561
feat: use rq to send tasks to workers
raphael0202 Dec 1, 2022
61e77b0
feat: add redis lock to avoid concurrent insight import for the same …
raphael0202 Dec 1, 2022
4b7cb33
style: fix flake8 and isort errors
raphael0202 Dec 5, 2022
ed663f5
feat: add a lock during product update jobs
raphael0202 Dec 5, 2022
7617476
fix: add default timeout during TF serving request
raphael0202 Dec 5, 2022
34aaa4e
fix: add default TF_SERVING_HOST in .env file
raphael0202 Dec 5, 2022
1b2564d
chore: fix config discrepancy between ML local/prod
raphael0202 Dec 6, 2022
bdcb4ef
feat: simplify image import process
raphael0202 Dec 6, 2022
f153652
feat: create more atomic tasks during image import
raphael0202 Dec 7, 2022
4c8fd58
fix: use autoconnect=False during DB connection
raphael0202 Dec 7, 2022
2696288
fix: fix DB connection in workers
raphael0202 Dec 7, 2022
a349543
feat: make object detection jobs idempotent
raphael0202 Dec 8, 2022
2341127
feat: add more logs during resource loading
raphael0202 Dec 8, 2022
82630ee
fix: switch from redis-stack to classic redis
raphael0202 Dec 8, 2022
838f9e1
feat: refresh cache when performing worker maintenance tasks
raphael0202 Dec 8, 2022
fa3e2ef
fix: disable deepsource autofix
raphael0202 Dec 8, 2022
31d5b84
fix: update poetry lock file
raphael0202 Dec 8, 2022
143cf94
feat: use queues for refresh_insight job on all DB
raphael0202 Dec 8, 2022
1ea52de
fix: use db.connection_context in with_db decorator
raphael0202 Dec 8, 2022
cbb970e
feat: add new CLI commands to launch background tasks
raphael0202 Dec 8, 2022
8cc2b22
feat: add two kind of workers: worker_high and worker_low
raphael0202 Dec 8, 2022
2809881
fix: fix flake8 and isort errors
raphael0202 Dec 8, 2022
09b5fdd
feat: use batches in refresh_insights job
raphael0202 Dec 8, 2022
81357b7
feat: remove orphan containers when doing make up
raphael0202 Dec 8, 2022
761030a
fix: no need to use db.atomic() when db context manager is used
raphael0202 Dec 8, 2022
06bf2e1
feat: improve CLI import commands
raphael0202 Dec 8, 2022
fc1f2a4
docs: improve documentation
raphael0202 Dec 8, 2022
e1c3132
fix: fix integration tests after adding autoconnect=False
raphael0202 Dec 9, 2022
5a5194a
fix: open DB connection when needed in scheduler
raphael0202 Dec 9, 2022
a7eb556
fix: add with_db decorator to generate_fiber_quality_facet
raphael0202 Dec 9, 2022
4 changes: 2 additions & 2 deletions .deepsource.toml
@@ -12,8 +12,8 @@ runtime_version = "3.x.x"

[[transformers]]
name = "black"
enabled = true
enabled = false

[[transformers]]
name = "isort"
enabled = true
enabled = false
14 changes: 6 additions & 8 deletions .env
@@ -46,8 +46,9 @@ POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
POSTGRES_EXPOSE=127.0.0.1:5432

# Triton ML inference server
# Triton ML inference server & TF Serving
TRITON_HOST=triton
TF_SERVING_HOST=tf_serving

# InfluxDB
INFLUXDB_HOST=
@@ -62,15 +63,12 @@ INFLUXDB_AUTH_TOKEN=
# MONGO_URI=mongodb://mongodb.po_default:27017
MONGO_URI=mongodb://mongodb:27017

# Redis
REDIS_HOST=redis

# OpenFoodFacts API
OFF_PASSWORD=
OFF_USER=

# Utils
SENTRY_DSN=

# Workers
IPC_AUTHKEY=ipc
IPC_HOST=workers
IPC_PORT=6650
WORKER_COUNT=8
SENTRY_DSN=
2 changes: 1 addition & 1 deletion .github/workflows/container-deploy-ml.yml
@@ -71,7 +71,7 @@ jobs:
echo "COMPOSE_HTTP_TIMEOUT=120" >> .env
echo "COMPOSE_PATH_SEPARATOR=;" >> .env
echo "COMPOSE_PROJECT_NAME=robotoff-ml" >> .env
echo "COMPOSE_FILE=docker/ml.yml" >> .env
echo "COMPOSE_FILE=docker-compose.yml;docker/ml.yml" >> .env
echo "RESTART_POLICY=always" >> .env
echo "TRITON_EXPOSE_HTTP=8003" >> .env

5 changes: 1 addition & 4 deletions .github/workflows/container-deploy.yml
@@ -111,10 +111,7 @@ jobs:
# Set app variables
echo "ROBOTOFF_INSTANCE=${{ env.ROBOTOFF_INSTANCE }}" >> .env
echo "ROBOTOFF_DOMAIN=${{ env.ROBOTOFF_DOMAIN }}" >> .env
echo "IPC_AUTHKEY=${{ secrets.IPC_AUTHKEY }}" >> .env
echo "IPC_HOST=0.0.0.0" >> .env
echo "IPC_PORT=6650" >> .env
echo "WORKER_COUNT=8" >> .env
echo "REDIS_HOST=redis" >> .env
echo "POSTGRES_HOST=postgres" >> .env
echo "POSTGRES_DB=postgres" >> .env
echo "POSTGRES_USER=postgres" >> .env
10 changes: 5 additions & 5 deletions Makefile
@@ -55,7 +55,7 @@ up:
@echo "🥫 Building and starting containers …"
docker network create po_default || true
ifdef service
${DOCKER_COMPOSE} up -d ${service} 2>&1
${DOCKER_COMPOSE} up --remove-orphans -d ${service} 2>&1
else
${DOCKER_COMPOSE} up -d 2>&1
endif
@@ -173,26 +173,26 @@ health:
i18n-compile:
@echo "🥫 Compiling translations …"
# Note it's important to have --no-deps, to avoid launching a concurrent postgres instance
${DOCKER_COMPOSE} run --rm --entrypoint bash --no-deps workers -c "cd i18n && . compile.sh"
${DOCKER_COMPOSE} run --rm --entrypoint bash --no-deps worker_high -c "cd i18n && . compile.sh"

unit-tests:
@echo "🥫 Running tests …"
# run tests in worker to have more memory
# also, change project name to run in isolation
${DOCKER_COMPOSE_TEST} run --rm workers poetry run pytest --cov-report xml --cov=robotoff tests/unit
${DOCKER_COMPOSE_TEST} run --rm worker_high poetry run pytest --cov-report xml --cov=robotoff tests/unit

integration-tests:
@echo "🥫 Running integration tests …"
# run tests in worker to have more memory
# also, change project name to run in isolation
${DOCKER_COMPOSE_TEST} run --rm workers poetry run pytest -vv --cov-report xml --cov=robotoff --cov-append tests/integration
${DOCKER_COMPOSE_TEST} run --rm worker_high poetry run pytest -vv --cov-report xml --cov=robotoff --cov-append tests/integration
( ${DOCKER_COMPOSE_TEST} down -v || true )

# interactive testings
# usage: make pytest args='test/unit/my-test.py --pdb'
pytest: guard-args
@echo "🥫 Running test: ${args} …"
${DOCKER_COMPOSE_TEST} run --rm workers poetry run pytest ${args}
${DOCKER_COMPOSE_TEST} run --rm worker_high poetry run pytest ${args}

#------------#
# Production #
2 changes: 1 addition & 1 deletion doc/how-to-guides/deployment/maintenance.md
@@ -47,7 +47,7 @@ robotoff_api_1 /bin/sh -c /docker-entrypo ... Up 0.0.0.0:5500->55
/tcp
robotoff_postgres_1 docker-entrypoint.sh postg ... Up 127.0.0.1:5432->5432/tcp
robotoff_scheduler_1 /bin/sh -c /docker-entrypo ... Up
robotoff_workers_1 /bin/sh -c /docker-entrypo ... Up
robotoff_worker_low_1 /bin/sh -c /docker-entrypo ... Up
```

## Database backup and restore
8 changes: 4 additions & 4 deletions doc/introduction/architecture.md
@@ -7,12 +7,12 @@ Robotoff is made of several services:
- the public _API_ service
- the _scheduler_, responsible for launching recurrent tasks (downloading new dataset, processing insights automatically,...) [^scheduler]
- the _workers_, responsible for all long-lasting tasks
- a _redis_ instance

Communication between API and Workers happens through ipc events. [^ipc_events]
Communication between API and workers happens through Redis DB using [rq](https://python-rq.org). [^worker_job]

[^scheduler]: See `scheduler.run`

[^ipc_events]: See `robotoff.workers.client` and `robotoff.workers.listener`
[^worker_job]: See `robotoff.workers.queues` and `robotoff.workers.tasks`

Robotoff allows to predict many information (also called _insights_), mostly from the product images or OCR.

@@ -58,7 +58,7 @@ Some insights with high confidence are applied automatically, 10 minutes after i

Robotoff is also notified by Product Opener every time a product is updated or deleted [^product_update]. This is used to delete insights associated with deleted products, or to update them accordingly.

[^product_update]: see `workers.tasks.product_updated` and `workers.tasks.delete_product_insights`
[^product_update]: see `workers.tasks.product_updated` and `workers.tasks.delete_product_insights_job`
[^annotate]: see `robotoff.insights.annotate`
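
As a rough illustration of the job flow this documentation change describes (the API puts a job on a queue, a worker picks it up and runs it), here is a minimal in-memory sketch. It is not Robotoff's code and it does not use rq: `fake_redis`, `enqueue` and `work_one` are hypothetical names, a dict of deques stands in for Redis, JSON stands in for rq's own job serialization, and `statistics.mean` is only a placeholder for a real task function.

```python
import importlib
import json
from collections import deque

# Hypothetical in-memory stand-in for the Redis lists that back each queue.
# Assumption: a real deployment stores jobs in Redis through rq instead.
fake_redis = {}


def enqueue(queue_name, func_path, **kwargs):
    """API side: record which function to call and with which arguments."""
    payload = json.dumps({"func": func_path, "kwargs": kwargs})
    fake_redis.setdefault(queue_name, deque()).append(payload)


def work_one(queue_name):
    """Worker side: pop the oldest job, import its function, and run it."""
    pending = fake_redis.get(queue_name)
    if not pending:
        return None  # nothing to do on this queue
    job = json.loads(pending.popleft())
    module_name, _, func_name = job["func"].rpartition(".")
    func = getattr(importlib.import_module(module_name), func_name)
    return func(**job["kwargs"])


# Placeholder job: any importable function referenced by dotted path works.
enqueue("robotoff-high", "statistics.mean", data=[1, 2, 3])
print(work_one("robotoff-high"))  # prints 2
```

The point of the sketch is the decoupling: the producer only writes a description of the work, so the API and the workers can run in separate containers and share nothing but the queue store. The project's actual helpers are the ones the footnote points to, `robotoff.workers.queues` and `robotoff.workers.tasks`.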


57 changes: 42 additions & 15 deletions docker-compose.yml
@@ -1,25 +1,23 @@
version: "3.9"


x-robotoff-base: &robotoff-base
x-robotoff-base:
&robotoff-base
restart: $RESTART_POLICY
image: ghcr.io/openfoodfacts/robotoff:${TAG}
volumes:
- ./datasets:/opt/robotoff/datasets
- ./tf_models:/opt/robotoff/tf_models
- ./models:/opt/robotoff/models

x-robotoff-base-env: &robotoff-base-env
x-robotoff-base-env:
&robotoff-base-env
ROBOTOFF_INSTANCE:
ROBOTOFF_DOMAIN:
ROBOTOFF_SCHEME:
STATIC_OFF_DOMAIN:
GUNICORN_NUM_WORKERS:
IPC_AUTHKEY:
IPC_HOST: workers
IPC_PORT:
WORKER_COUNT:
ROBOTOFF_UPDATED_PRODUCT_WAIT:
REDIS_HOST:
POSTGRES_HOST:
POSTGRES_DB:
POSTGRES_USER:
@@ -43,19 +41,33 @@ services:
<<: *robotoff-base
environment: *robotoff-base-env
mem_limit: 2g
depends_on:
- workers
ports:
- "${ROBOTOFF_EXPOSE:-5500}:5500"
networks:
- webnet

workers:
worker_high:
<<: *robotoff-base
command: poetry run robotoff-cli run workers
environment:
<<: *robotoff-base-env
REAL_TIME_IMAGE_PREDICTION: 1
deploy:
mode: replicated
replicas: 6
command: poetry run robotoff-cli run-worker robotoff-high
environment: *robotoff-base-env
depends_on:
- postgres
mem_limit: 8g
networks:
- webnet
extra_hosts:
- host.docker.internal:host-gateway

worker_low:
<<: *robotoff-base
deploy:
mode: replicated
replicas: 2
command: poetry run robotoff-cli run-worker robotoff-low robotoff-high
environment: *robotoff-base-env
depends_on:
- postgres
mem_limit: 8g
@@ -67,7 +79,7 @@ services:
scheduler:
<<: *robotoff-base
environment: *robotoff-base-env
command: poetry run robotoff-cli run scheduler
command: poetry run robotoff-cli run-scheduler
mem_limit: 4g
networks:
- webnet
@@ -89,6 +101,19 @@ services:
networks:
- webnet

redis:
restart: $RESTART_POLICY
image: redis:7.0.5-alpine
volumes:
- redis-data:/data
environment:
REDIS_ARGS: --save 60 1000 --appendonly yes
mem_limit: 4g
ports:
- "${REDIS_EXPOSE:-127.0.0.1:6379}:6379"
networks:
- webnet

elasticsearch:
restart: $RESTART_POLICY
image: raphael0202/elasticsearch
@@ -113,6 +138,8 @@ services:
volumes:
postgres-data:
es-data:
redis-data:
name: ${COMPOSE_PROJECT_NAME:-robotoff}_redis-data

networks:
webnet:
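
The two worker services in this file differ only in the queue names passed to `run-worker`: `worker_high` listens to `robotoff-high`, while `worker_low` listens to `robotoff-low robotoff-high`. rq workers read from their queues in the order given, so the sketch below models that polling order with plain deques. It is an illustration only: `WORKER_QUEUES`, `next_job` and the job labels are hypothetical, and no Redis or rq is involved.

```python
from collections import deque

# Queue names per service, copied from the `run-worker` commands in the diff.
WORKER_QUEUES = {
    "worker_high": ["robotoff-high"],
    "worker_low": ["robotoff-low", "robotoff-high"],
}


def next_job(worker, queues):
    """Return (queue_name, job) for the first non-empty queue this worker
    listens to, checking queues in the order they were listed."""
    for name in WORKER_QUEUES[worker]:
        pending = queues.get(name)
        if pending:
            return name, pending.popleft()
    return None  # every queue this worker listens to is empty


# Hypothetical pending jobs, for illustration only.
queues = {
    "robotoff-high": deque(["h1", "h2"]),
    "robotoff-low": deque(["l1"]),
}
print(next_job("worker_low", queues))   # ('robotoff-low', 'l1')
print(next_job("worker_low", queues))   # ('robotoff-high', 'h1')
print(next_job("worker_high", queues))  # ('robotoff-high', 'h2')
```

Under this model the six `worker_high` replicas are reserved for `robotoff-high` jobs, and the two `worker_low` replicas drain `robotoff-low` first and pick up `robotoff-high` jobs only when `robotoff-low` is empty.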
13 changes: 12 additions & 1 deletion docker/dev.yml
@@ -50,9 +50,20 @@ services:
- robotoff.openfoodfacts.localhost
- api
webnet:
workers:
worker_high:
<<: *robotoff-dev
<<: *networks-productopener-local
deploy:
mode: replicated
# Only 1 replica is easier to deal with for local dev
replicas: 1
worker_low:
<<: *robotoff-dev
<<: *networks-productopener-local
deploy:
mode: replicated
# Only 1 replica is easier to deal with for local dev
replicas: 1
scheduler:
<<: *networks-productopener-local
<<: *robotoff-dev
7 changes: 2 additions & 5 deletions docker/ml.yml
@@ -8,7 +8,7 @@ services:
- 8501:8501
- 8500:8500
volumes:
- ../tf_models:/models
- ./tf_models:/models
entrypoint: "tensorflow_model_server --port=8500 --rest_api_port=8501 --model_config_file=/models/models.config"
mem_limit: 10g
networks:
@@ -28,11 +28,8 @@ services:
- ${TRITON_EXPOSE_GRPC:-8001}:8001
- ${TRITON_EXPOSE_METRICS:-8002}:8002
volumes:
- ../models:/models
- ./models:/models
entrypoint: "tritonserver --model-repository=/models"
mem_limit: 10g
networks:
- webnet

networks:
webnet: