
Update ort crate version to 2.0.0-rc.4 to support onnx IR version 10 #361

Merged
1 commit merged into huggingface:main from kozistr:update/ort on Sep 17, 2024

Conversation

kozistr
Contributor

@kozistr kozistr commented Jul 26, 2024

What does this PR do?

Fixes #355

  • IR version 10 appears to be supported starting from onnxruntime 1.18.0, which the ort crate picks up as of 2.0.0-rc.3, so this PR upgrades ort to the latest version, 2.0.0-rc.4. A run with the model from #355 after the upgrade:
$ ./target/release/text-embeddings-router --model-id dunzhang/stella_en_400M_v5 --revision refs/pr/3 --port 8080
2024-07-26T16:50:55.128433Z  INFO text_embeddings_router: router/src/main.rs:175: Args { model_id: "dun*****/******_**_***M_v5", revision: Some("refs/pr/3"), tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: false, default_prompt_name: None, default_prompt: None, hf_api_token: None, hostname: "0.0.0.0", port: 8080, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: None, payload_limit: 2000000, api_key: None, json_output: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", cors_allow_origin: None }
2024-07-26T16:50:55.139343Z  INFO hf_hub: /home/zero/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hf-hub-0.3.2/src/lib.rs:55: Token file not found "/home/zero/.cache/huggingface/token"
2024-07-26T16:50:55.211483Z  INFO download_pool_config: text_embeddings_core::download: core/src/download.rs:38: Downloading `1_Pooling/config.json`
2024-07-26T16:50:55.213326Z  INFO download_new_st_config: text_embeddings_core::download: core/src/download.rs:62: Downloading `config_sentence_transformers.json`
2024-07-26T16:50:55.213366Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:21: Starting download
2024-07-26T16:50:55.213371Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:23: Downloading `config.json`
2024-07-26T16:50:55.213381Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:26: Downloading `tokenizer.json`
2024-07-26T16:50:55.217605Z  INFO download_artifacts: text_embeddings_backend: backends/src/lib.rs:368: Downloading `model.onnx`
2024-07-26T16:50:55.482762Z  WARN download_artifacts: text_embeddings_backend: backends/src/lib.rs:372: Could not download `model.onnx`: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/dunzhang/stella_en_400M_v5/resolve/refs%2Fpr%2F3/model.onnx)
2024-07-26T16:50:55.482789Z  INFO download_artifacts: text_embeddings_backend: backends/src/lib.rs:373: Downloading `onnx/model.onnx`
2024-07-26T16:50:55.483067Z  INFO download_artifacts: text_embeddings_backend: backends/src/lib.rs:379: Downloading `model.onnx_data`
2024-07-26T16:50:55.673022Z  WARN download_artifacts: text_embeddings_backend: backends/src/lib.rs:383: Could not download `model.onnx_data`: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/dunzhang/stella_en_400M_v5/resolve/refs%2Fpr%2F3/model.onnx_data)
2024-07-26T16:50:55.673127Z  INFO download_artifacts: text_embeddings_backend: backends/src/lib.rs:384: Downloading `onnx/model.onnx_data`
2024-07-26T16:50:55.866739Z  WARN download_artifacts: text_embeddings_backend: backends/src/lib.rs:388: Could not download `onnx/model.onnx_data`: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/dunzhang/stella_en_400M_v5/resolve/refs%2Fpr%2F3/onnx/model.onnx_data)
2024-07-26T16:50:55.866772Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:32: Model artifacts downloaded in 653.40509ms
2024-07-26T16:50:55.888893Z  INFO text_embeddings_router: router/src/lib.rs:199: Maximum number of tokens per request: 512
2024-07-26T16:50:55.894090Z  INFO text_embeddings_core::tokenization: core/src/tokenization.rs:28: Starting 4 tokenization workers
2024-07-26T16:50:55.913947Z  INFO text_embeddings_router: router/src/lib.rs:241: Starting model backend
Error: Model backend is not healthy

Caused by:
    Unknown output keys: [Output { name: "sentence_embedding", output_type: Tensor { ty: Float32, dimensions: [-1, 1024] } }]
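
The log above shows the IR-version error is gone; the remaining "Unknown output keys" error appears to be a separate issue with this particular model's output names. The change itself is essentially a one-line dependency bump. A minimal sketch of what it looks like, assuming ort is declared directly in a Cargo.toml manifest (the exact file, previous pin, and feature flags in this repository may differ):

# Cargo.toml (hypothetical excerpt; the real entry may also select
# features such as specific execution providers)
[dependencies]
# Bump ort to a release candidate whose bundled onnxruntime (>= 1.18)
# understands ONNX IR version 10.
ort = "2.0.0-rc.4"

After the bump, rebuilding the router pulls the newer crate and its matching onnxruntime binaries.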

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@OlivierDehaene OR @Narsil

Member

@OlivierDehaene OlivierDehaene left a comment


Thanks

@OlivierDehaene OlivierDehaene merged commit df03195 into huggingface:main Sep 17, 2024
@kozistr kozistr deleted the update/ort branch September 18, 2024 03:22
MasakiMu319 pushed a commit to MasakiMu319/text-embeddings-inference that referenced this pull request Nov 27, 2024
Successfully merging this pull request may close these issues.

Unsupported model IR version