System Info

```shell
cargo install --path router -F candle-cuda -F http --no-default-features
```
Reproduction

```
❯ lsof -i :12345   # shows that nothing is running on that port

❯ text-embeddings-router --model-id nomic-ai/nomic-embed-text-v1.5 --port 12345
2025-01-22T10:39:14.154099Z INFO text_embeddings_router: router/src/main.rs:175: Args { model_id: "nom**-**/*****-*****-****-v1.5", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: false, default_prompt_name: None, default_prompt: None, hf_api_token: None, hostname: "0.0.0.0", port: 12345, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: None, payload_limit: 2000000, api_key: None, json_output: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", cors_allow_origin: None }
2025-01-22T10:39:14.236075Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:20: Starting download
2025-01-22T10:39:14.236096Z INFO download_artifacts:download_pool_config: text_embeddings_core::download: core/src/download.rs:53: Downloading `1_Pooling/config.json`
2025-01-22T10:39:14.236141Z INFO download_artifacts:download_new_st_config: text_embeddings_core::download: core/src/download.rs:77: Downloading `config_sentence_transformers.json`
2025-01-22T10:39:14.236155Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:40: Downloading `config.json`
2025-01-22T10:39:14.236167Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:43: Downloading `tokenizer.json`
2025-01-22T10:39:14.236179Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:47: Model artifacts downloaded in 106.335µs
2025-01-22T10:39:14.247515Z INFO text_embeddings_router: router/src/lib.rs:188: Maximum number of tokens per request: 8192
2025-01-22T10:39:14.247554Z INFO text_embeddings_core::tokenization: core/src/tokenization.rs:28: Starting 24 tokenization workers
2025-01-22T10:39:14.343914Z INFO text_embeddings_router: router/src/lib.rs:230: Starting model backend
2025-01-22T10:39:14.344999Z INFO text_embeddings_backend: backends/src/lib.rs:486: Downloading `model.safetensors`
2025-01-22T10:39:14.345044Z INFO text_embeddings_backend: backends/src/lib.rs:370: Model weights downloaded in 46.73µs
2025-01-22T10:39:15.568001Z INFO text_embeddings_backend_candle: backends/candle/src/lib.rs:332: Starting FlashNomicBert model on Cuda(CudaDevice(DeviceId(1)))
2025-01-22T10:39:16.131682Z INFO text_embeddings_router: router/src/lib.rs:248: Warming up model
Error: failed to build prometheus recorder

Caused by:
    failed to create HTTP listener: Address already in use (os error 98)

❯ lsof -i :12345
```
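Note that the bind failure comes from the Prometheus recorder, which may listen on a separate metrics port rather than the one passed via `--port` (check `text-embeddings-router --help` for your build), so `lsof` against `:12345` alone can miss the conflict. As a minimal sketch of a bind-based check (the helper name `port_is_free` is mine, not part of TEI):

```python
import socket

def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            s.bind((host, port))
            return True
        except OSError:  # e.g. errno 98, "Address already in use"
            return False
```

Calling `port_is_free(12345)` reproduces the kernel's view at bind time, which is more reliable than parsing `lsof` output.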
Expected behavior

TEI starts up and serves inference without errors.
Switching to Python 3.12 seems to have solved my issue. I noticed in the Dockerfile that Python was set to 3.11, which prompted me to try Python 3.12 instead.
Should this be documented somewhere?
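Until it is documented, one option is to fail fast when the interpreter is too old. A hypothetical guard for a launcher script, assuming 3.12 is the minimum based on this thread's finding (adjust if upstream documents otherwise):

```python
import sys

# Assumed minimum based on this issue; not an official TEI requirement.
MIN_PYTHON = (3, 12)

def check_python(version=None) -> bool:
    """Return True if the given (major, minor) meets MIN_PYTHON.

    Defaults to the running interpreter's version.
    """
    v = version if version is not None else sys.version_info[:2]
    return tuple(v) >= MIN_PYTHON
```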