/v1/inference/embeddings input and output shape mismatch #922

Closed
mattf opened this issue Feb 1, 2025 · 2 comments · Fixed by #1161

Comments

@mattf
Contributor

mattf commented Feb 1, 2025

/v1/inference/embeddings: model x List[InterleavedContent] -> List[List[float]]

the shape mismatch comes from InterleavedContent allowing for List[InterleavedContentItem].

example: [string, [text0, text1], image] -?-> [embedding of string, embedding of text0, embedding of text1, embedding of image]
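to make the mismatch concrete, here is a minimal sketch. it assumes each nested item would get its own embedding, which is one possible reading of the current contract and not confirmed behavior. `Item`, `Content`, and `count_embeddings_if_flattened` are hypothetical stand-ins that model items as plain strings; they are not the real llama-stack types.

```python
from typing import List, Union

# Hypothetical stand-ins: an item is modeled as a plain string here.
Item = str
# Mirrors InterleavedContent also allowing a list of items.
Content = Union[Item, List[Item]]


def count_embeddings_if_flattened(contents: List[Content]) -> int:
    """Count output vectors if every nested item gets its own embedding."""
    total = 0
    for content in contents:
        # A nested list contributes one vector per element, a bare item one.
        total += len(content) if isinstance(content, list) else 1
    return total


contents: List[Content] = ["string", ["text0", "text1"], "image"]
n_inputs = len(contents)
n_outputs = count_embeddings_if_flattened(contents)
print(n_inputs, n_outputs)
```

under that assumption the caller passes 3 contents and gets back 4 vectors, so output index `i` no longer corresponds to input index `i`.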

i suggest aligning the shapes.

my preference is to change the input shape, and use an input of array of string | array of InterleavedContentItem, which keeps string (untyped) and text / image (typed) inputs separate.

a further enhancement: embedding is often done in two modes, batch and query. in batch mode many items are embedded for storage. in query mode a single item is embedded for lookup. allowing input of string | array of string | array of InterleavedContentItem facilitates this use case.
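a rough sketch of what accepting `string | array of string | array of InterleavedContentItem` could look like. the `TextItem` and `ImageItem` classes and the `normalize_embedding_input` helper are hypothetical names for illustration only, not the actual llama-stack definitions. the point is that every accepted input normalizes to a flat list, so the output has exactly one embedding per element.

```python
from dataclasses import dataclass
from typing import List, Union


# Hypothetical typed items; the real InterleavedContentItem types may differ.
@dataclass
class TextItem:
    text: str


@dataclass
class ImageItem:
    url: str


Item = Union[TextItem, ImageItem]
EmbeddingInput = Union[str, List[str], List[Item]]


def normalize_embedding_input(contents: EmbeddingInput) -> list:
    """Return a flat list so that len(output embeddings) == len(result)."""
    # Query mode: a single string becomes a batch of one.
    if isinstance(contents, str):
        return [contents]
    if not isinstance(contents, list):
        raise TypeError("expected a string or a list")
    # Depth-2 input is rejected, since it is what breaks the 1:1 mapping.
    if any(isinstance(c, list) for c in contents):
        raise ValueError("nested lists are not allowed")
    all_strings = all(isinstance(c, str) for c in contents)
    all_items = all(isinstance(c, (TextItem, ImageItem)) for c in contents)
    # Keep untyped strings and typed items separate, as proposed.
    if not (all_strings or all_items):
        raise ValueError("do not mix strings and typed items")
    return list(contents)


query = normalize_embedding_input("what is a llama?")
batch = normalize_embedding_input(["doc one", "doc two"])
typed = normalize_embedding_input([TextItem("caption"), ImageItem("http://example.invalid/a.png")])
```

a single string is treated as a batch of one here; whether the response for query mode should still be a list of one embedding is a separate design choice this sketch does not settle.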

@mattf
Contributor Author

mattf commented Feb 1, 2025

cc @raghotham @ashwinb @yanxi0830

@ashwinb
Contributor

ashwinb commented Feb 3, 2025

Good spot. I definitely agree with at least changing it to List[InterleavedContentItem] immediately; otherwise the contract is broken.

hardikjshah added this to the v0.1.4 milestone Feb 12, 2025
ashwinb added a commit that referenced this issue Feb 21, 2025
#1161)

See Issue #922 

The change is slightly backwards incompatible, but no callsite (in our
client codebases or stack-apps) ever passes a depth-2
`List[List[InterleavedContentItem]]` (which is now disallowed).

## Test Plan

```bash
$ cd llama_stack/providers/tests/inference
$ pytest -s -v -k fireworks test_embeddings.py \
   --inference-model nomic-ai/nomic-embed-text-v1.5 --env EMBEDDING_DIMENSION=784
$ pytest -s -v -k together test_embeddings.py \
   --inference-model togethercomputer/m2-bert-80M-8k-retrieval --env EMBEDDING_DIMENSION=784
$ pytest -s -v -k ollama test_embeddings.py \
   --inference-model all-minilm:latest --env EMBEDDING_DIMENSION=784
```

Also ran `tests/client-sdk/inference/test_embeddings.py`