feat: Add additional env variables for Machine Learning #15326

Merged
merged 30 commits into from
Jan 14, 2025
Changes from 18 commits
Commits
b03f186
Add additional variables to preload part ML models
1-tempest Jan 14, 2025
05fda40
Add additional variables to preload part ML models
1-tempest Jan 14, 2025
41d74f0
Add additional variables to preload part ML models
1-tempest Jan 14, 2025
f67c051
Add additional variables to preload part ML models
1-tempest Jan 14, 2025
a5a9cc6
Add additional variables to preload part ML models
1-tempest Jan 14, 2025
cf27958
Add additional variables to preload part ML models
1-tempest Jan 14, 2025
38068e6
Add additional variables to preload part ML models
1-tempest Jan 14, 2025
c378afb
Add additional variables to preload part ML models
1-tempest Jan 14, 2025
3bd2ec6
Add additional variables to preload part ML models
1-tempest Jan 14, 2025
85e17e4
Update config.py
1-tempest Jan 14, 2025
4a22421
Add additional variables to preload part ML models
1-tempest Jan 14, 2025
5679c4f
Add additional variables to preload part ML models
1-tempest Jan 14, 2025
8556c32
Apply formatting
1-tempest Jan 14, 2025
3970aa7
minor update
1-tempest Jan 14, 2025
007db2b
formatting
1-tempest Jan 14, 2025
6c9d173
root validator
1-tempest Jan 14, 2025
6f79390
minor update
1-tempest Jan 14, 2025
47d53b1
minor update
1-tempest Jan 14, 2025
ef3b303
minor update
1-tempest Jan 14, 2025
b4ea6e9
change to support explicit models
1-tempest Jan 14, 2025
12eff16
minor update
1-tempest Jan 14, 2025
233427a
minor change
1-tempest Jan 14, 2025
648851e
minor change
1-tempest Jan 14, 2025
624445b
minor change
1-tempest Jan 14, 2025
e2b5f71
minor update
1-tempest Jan 14, 2025
71527ea
add logs, resolve errors
1-tempest Jan 14, 2025
8cfa055
minor change
1-tempest Jan 14, 2025
85ca87c
add new environment variables
1-tempest Jan 14, 2025
1a26a3f
minor revisions
1-tempest Jan 14, 2025
10446e1
remove comments
1-tempest Jan 14, 2025
40 changes: 22 additions & 18 deletions docs/docs/install/environment-variables.md
@@ -148,24 +148,28 @@ Redis (Sentinel) URL example JSON before encoding:

## Machine Learning

-| Variable | Description | Default | Containers |
-| :-------------------------------------------------------- | :-------------------------------------------------------------------------------------------------- | :-----------------------------: | :--------------- |
-| `MACHINE_LEARNING_MODEL_TTL` | Inactivity time (s) before a model is unloaded (disabled if \<= 0) | `300` | machine learning |
-| `MACHINE_LEARNING_MODEL_TTL_POLL_S` | Interval (s) between checks for the model TTL (disabled if \<= 0) | `10` | machine learning |
-| `MACHINE_LEARNING_CACHE_FOLDER` | Directory where models are downloaded | `/cache` | machine learning |
-| `MACHINE_LEARNING_REQUEST_THREADS`<sup>\*1</sup> | Thread count of the request thread pool (disabled if \<= 0) | number of CPU cores | machine learning |
-| `MACHINE_LEARNING_MODEL_INTER_OP_THREADS` | Number of parallel model operations | `1` | machine learning |
-| `MACHINE_LEARNING_MODEL_INTRA_OP_THREADS` | Number of threads for each model operation | `2` | machine learning |
-| `MACHINE_LEARNING_WORKERS`<sup>\*2</sup> | Number of worker processes to spawn | `1` | machine learning |
-| `MACHINE_LEARNING_HTTP_KEEPALIVE_TIMEOUT_S`<sup>\*3</sup> | HTTP Keep-alive time in seconds | `2` | machine learning |
-| `MACHINE_LEARNING_WORKER_TIMEOUT` | Maximum time (s) of unresponsiveness before a worker is killed | `120` (`300` if using OpenVINO) | machine learning |
-| `MACHINE_LEARNING_PRELOAD__CLIP` | Name of a CLIP model to be preloaded and kept in cache | | machine learning |
-| `MACHINE_LEARNING_PRELOAD__FACIAL_RECOGNITION` | Name of a facial recognition model to be preloaded and kept in cache | | machine learning |
-| `MACHINE_LEARNING_ANN` | Enable ARM-NN hardware acceleration if supported | `True` | machine learning |
-| `MACHINE_LEARNING_ANN_FP16_TURBO` | Execute operations in FP16 precision: increasing speed, reducing precision (applies only to ARM-NN) | `False` | machine learning |
-| `MACHINE_LEARNING_ANN_TUNING_LEVEL` | ARM-NN GPU tuning level (1: rapid, 2: normal, 3: exhaustive) | `2` | machine learning |
-| `MACHINE_LEARNING_DEVICE_IDS`<sup>\*4</sup> | Device IDs to use in multi-GPU environments | `0` | machine learning |
-| `MACHINE_LEARNING_MAX_BATCH_SIZE__FACIAL_RECOGNITION` | Set the maximum number of faces that will be processed at once by the facial recognition model | None (`1` if using OpenVINO) | machine learning |
+| Variable | Description | Default | Containers |
+| :---------------------------------------------------------- | :-------------------------------------------------------------------------------------------------- | :-----------------------------: | :--------------- |
+| `MACHINE_LEARNING_MODEL_TTL` | Inactivity time (s) before a model is unloaded (disabled if \<= 0) | `300` | machine learning |
+| `MACHINE_LEARNING_MODEL_TTL_POLL_S` | Interval (s) between checks for the model TTL (disabled if \<= 0) | `10` | machine learning |
+| `MACHINE_LEARNING_CACHE_FOLDER` | Directory where models are downloaded | `/cache` | machine learning |
+| `MACHINE_LEARNING_REQUEST_THREADS`<sup>\*1</sup> | Thread count of the request thread pool (disabled if \<= 0) | number of CPU cores | machine learning |
+| `MACHINE_LEARNING_MODEL_INTER_OP_THREADS` | Number of parallel model operations | `1` | machine learning |
+| `MACHINE_LEARNING_MODEL_INTRA_OP_THREADS` | Number of threads for each model operation | `2` | machine learning |
+| `MACHINE_LEARNING_WORKERS`<sup>\*2</sup> | Number of worker processes to spawn | `1` | machine learning |
+| `MACHINE_LEARNING_HTTP_KEEPALIVE_TIMEOUT_S`<sup>\*3</sup> | HTTP Keep-alive time in seconds | `2` | machine learning |
+| `MACHINE_LEARNING_WORKER_TIMEOUT` | Maximum time (s) of unresponsiveness before a worker is killed | `120` (`300` if using OpenVINO) | machine learning |
+| `MACHINE_LEARNING_PRELOAD__CLIP__MODEL` | Name of a CLIP model to be preloaded and kept in cache | | machine learning |
+| `MACHINE_LEARNING_PRELOAD__CLIP__TEXTUAL` | Preloads the textual model | `True` | machine learning |
+| `MACHINE_LEARNING_PRELOAD__CLIP__VISUAL` | Preloads the visual model | `True` | machine learning |
+| `MACHINE_LEARNING_PRELOAD__FACIAL_RECOGNITION__MODEL` | Name of a facial recognition model to be preloaded and kept in cache | | machine learning |
+| `MACHINE_LEARNING_PRELOAD__FACIAL_RECOGNITION__RECOGNITION` | Preloads the recognition model | `True` | machine learning |
+| `MACHINE_LEARNING_PRELOAD__FACIAL_RECOGNITION__DETECTION` | Preloads the detection model | `True` | machine learning |
+| `MACHINE_LEARNING_ANN` | Enable ARM-NN hardware acceleration if supported | `True` | machine learning |
+| `MACHINE_LEARNING_ANN_FP16_TURBO` | Execute operations in FP16 precision: increasing speed, reducing precision (applies only to ARM-NN) | `False` | machine learning |
+| `MACHINE_LEARNING_ANN_TUNING_LEVEL` | ARM-NN GPU tuning level (1: rapid, 2: normal, 3: exhaustive) | `2` | machine learning |
+| `MACHINE_LEARNING_DEVICE_IDS`<sup>\*4</sup> | Device IDs to use in multi-GPU environments | `0` | machine learning |
+| `MACHINE_LEARNING_MAX_BATCH_SIZE__FACIAL_RECOGNITION` | Set the maximum number of faces that will be processed at once by the facial recognition model | None (`1` if using OpenVINO) | machine learning |

\*1: It is recommended to begin with this parameter when changing the concurrency levels of the machine learning service and then tune the other ones.
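
The double underscore in names such as `MACHINE_LEARNING_PRELOAD__CLIP__MODEL` suggests a nested-settings convention in which each `__` descends one level. The sketch below is a stdlib-only illustration of that mapping, not the service's actual loader. The `nested_settings` helper, the way the `MACHINE_LEARNING_` prefix is stripped, the rule that a nested key wins over a flat one, and the example values are all assumptions made for illustration.

```python
from typing import Any, Mapping

PREFIX = "MACHINE_LEARNING_"  # assumed common prefix, taken from the table
DELIMITER = "__"              # assumed nesting delimiter


def nested_settings(
    env: Mapping[str, str], prefix: str = PREFIX, delimiter: str = DELIMITER
) -> dict[str, Any]:
    """Group prefixed variables into a nested dict, splitting names on the delimiter."""
    result: dict[str, Any] = {}
    for name, value in env.items():
        if not name.startswith(prefix):
            continue  # ignore variables that do not belong to this service
        *parents, leaf = name[len(prefix):].lower().split(delimiter)
        node = result
        for part in parents:
            child = node.get(part)
            if not isinstance(child, dict):
                # A flat value such as PRELOAD__CLIP=<name> is replaced by a
                # nested section (illustrative choice, not confirmed behaviour).
                child = {}
                node[part] = child
            node = child
        if not isinstance(node.get(leaf), dict):
            node[leaf] = value  # values stay as raw strings in this sketch
    return result


example_env = {
    "MACHINE_LEARNING_PRELOAD__CLIP__MODEL": "example-clip-model",  # hypothetical name
    "MACHINE_LEARNING_PRELOAD__CLIP__TEXTUAL": "false",
    "MACHINE_LEARNING_MODEL_TTL": "300",
    "UNRELATED": "ignored",
}
settings = nested_settings(example_env)
```

Passing `os.environ` instead of `example_env` would apply the same grouping to the real process environment. Type coercion (for example turning `"false"` into a boolean) is deliberately left out.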

35 changes: 32 additions & 3 deletions machine-learning/app/config.py
@@ -6,17 +6,46 @@
from socket import socket

from gunicorn.arbiter import Arbiter
-from pydantic import BaseModel
+from pydantic import BaseModel, Field, root_validator
from pydantic_settings import BaseSettings, SettingsConfigDict
from rich.console import Console
from rich.logging import RichHandler
from uvicorn import Server
from uvicorn.workers import UvicornWorker


+class ClipSettings(BaseModel):
+    model: str | None = None
+    textual: bool = True
+    visual: bool = True


+class FacialRecognitionSettings(BaseModel):
+    model: str | None = None
+    recognition: bool = True
+    detection: bool = True


 class PreloadModelData(BaseModel):
-    clip: str | None = None
-    facial_recognition: str | None = None
+    clip: ClipSettings = ClipSettings()
+    facial_recognition: FacialRecognitionSettings = FacialRecognitionSettings()

+    # Define fallback environment variables
+    clip_model_fallback: str | None = Field(default=None, env="MACHINE_LEARNING_PRELOAD__CLIP")
+    facial_recognition_model_fallback: str | None = Field(
+        default=None, env="MACHINE_LEARNING_PRELOAD__FACIAL_RECOGNITION"
+    )

+    # Root validator to use fallbacks
+    @root_validator(pre=True)
+    def set_models(cls, values: dict) -> dict:
+        values["clip"]["model"] = values.get("clip", {}).get("model") or values.get("clip_model_fallback")

+        values["facial_recognition"]["model"] = values.get("facial_recognition", {}).get("model") or values.get(
+            "facial_recognition_model_fallback"
+        )

+        return values


class MaxBatchSize(BaseModel):
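
One detail of `set_models` at this commit is worth spelling out: the right-hand side reads with `values.get("clip", {})`, but the assignment target `values["clip"]["model"]` indexes the dict directly, so an input without a `clip` key would raise `KeyError`. The sketch below shows the same fallback idea on plain dicts in a way that tolerates missing sections. It is an illustration under assumptions, not the code that was merged: `apply_model_fallbacks` is a hypothetical helper, and treating a bare string section as a legacy model name is a guess about how the old flat variables might arrive.

```python
from typing import Any


def apply_model_fallbacks(values: dict[str, Any]) -> dict[str, Any]:
    """Fill in nested model names from the flat fallback keys when they are missing."""
    resolved = dict(values)  # shallow copy so the caller's dict is not mutated
    for section in ("clip", "facial_recognition"):
        current = resolved.get(section)
        if isinstance(current, dict):
            nested = dict(current)
        elif isinstance(current, str):
            # Hypothetical handling: a bare string is read as a legacy model name.
            nested = {"model": current}
        else:
            nested = {}
        # Prefer the explicit nested model, then the flat fallback, else None.
        nested["model"] = nested.get("model") or resolved.get(f"{section}_model_fallback")
        resolved[section] = nested
    return resolved


with_fallback = apply_model_fallbacks({"clip_model_fallback": "legacy-clip"})
explicit_wins = apply_model_fallbacks(
    {"clip": {"model": "new-clip"}, "clip_model_fallback": "legacy-clip"}
)
```

Building a new dict per section avoids both the `KeyError` and in-place mutation of the validator's input, at the cost of a few extra lines.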
37 changes: 25 additions & 12 deletions machine-learning/app/main.py
@@ -76,19 +76,32 @@ async def lifespan(_: FastAPI) -> AsyncGenerator[None, None]:

 async def preload_models(preload: PreloadModelData) -> None:
     log.info(f"Preloading models: {preload}")
-    if preload.clip is not None:
-        model = await model_cache.get(preload.clip, ModelType.TEXTUAL, ModelTask.SEARCH)
-        await load(model)

-        model = await model_cache.get(preload.clip, ModelType.VISUAL, ModelTask.SEARCH)
-        await load(model)

-    if preload.facial_recognition is not None:
-        model = await model_cache.get(preload.facial_recognition, ModelType.DETECTION, ModelTask.FACIAL_RECOGNITION)
-        await load(model)

-        model = await model_cache.get(preload.facial_recognition, ModelType.RECOGNITION, ModelTask.FACIAL_RECOGNITION)
-        await load(model)
+    if preload.clip.model is not None:
+        if preload.clip.textual:
+            model = await model_cache.get(preload.clip.model, ModelType.TEXTUAL, ModelTask.SEARCH)
+            await load(model)

+        if preload.clip.visual:
+            model = await model_cache.get(preload.clip.model, ModelType.VISUAL, ModelTask.SEARCH)
+            await load(model)

+    if preload.facial_recognition.model is not None:
+        if preload.facial_recognition.detection:
+            model = await model_cache.get(
+                preload.facial_recognition.model,
+                ModelType.DETECTION,
+                ModelTask.FACIAL_RECOGNITION,
+            )
+            await load(model)

+        if preload.facial_recognition.recognition:
+            model = await model_cache.get(
+                preload.facial_recognition.model,
+                ModelType.RECOGNITION,
+                ModelTask.FACIAL_RECOGNITION,
+            )
+            await load(model)


def update_state() -> Iterator[None]:
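
The new `preload_models` body gates each load twice: first on whether a model name is set, then on the per-part flag. A minimal sketch of that decision logic, separated from the async cache calls, might look like the following. The dataclasses and enums are local stand-ins that mirror the names in the diff; their enum values, the `preload_plan` helper, and the model name are hypothetical, and the real service loads models through `model_cache.get` and `load` rather than returning a list.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class ModelType(str, Enum):  # stand-in for the service's enum; values are assumed
    TEXTUAL = "textual"
    VISUAL = "visual"
    DETECTION = "detection"
    RECOGNITION = "recognition"


class ModelTask(str, Enum):  # stand-in for the service's enum; values are assumed
    SEARCH = "search"
    FACIAL_RECOGNITION = "facial-recognition"


@dataclass
class ClipSettings:
    model: Optional[str] = None
    textual: bool = True
    visual: bool = True


@dataclass
class FacialRecognitionSettings:
    model: Optional[str] = None
    recognition: bool = True
    detection: bool = True


@dataclass
class PreloadModelData:
    clip: ClipSettings = field(default_factory=ClipSettings)
    facial_recognition: FacialRecognitionSettings = field(
        default_factory=FacialRecognitionSettings
    )


def preload_plan(preload: PreloadModelData) -> list[tuple[str, ModelType, ModelTask]]:
    """Return the (model, type, task) triples that would be preloaded, in order."""
    plan: list[tuple[str, ModelType, ModelTask]] = []
    clip = preload.clip
    if clip.model is not None:  # nothing is preloaded without a model name
        if clip.textual:
            plan.append((clip.model, ModelType.TEXTUAL, ModelTask.SEARCH))
        if clip.visual:
            plan.append((clip.model, ModelType.VISUAL, ModelTask.SEARCH))
    face = preload.facial_recognition
    if face.model is not None:
        if face.detection:
            plan.append((face.model, ModelType.DETECTION, ModelTask.FACIAL_RECOGNITION))
        if face.recognition:
            plan.append((face.model, ModelType.RECOGNITION, ModelTask.FACIAL_RECOGNITION))
    return plan


textual_only = preload_plan(
    PreloadModelData(clip=ClipSettings(model="example-clip-model", visual=False))
)
```

With the defaults (`model` unset), the plan is empty, which matches the diff: the `textual`, `visual`, `detection`, and `recognition` flags only matter once a model name has been supplied.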