Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(api): Add options for supporting various embedding models #1192

Merged
merged 3 commits into from
Feb 21, 2025
Merged

Conversation

ashwinb
Copy link
Contributor

@ashwinb ashwinb commented Feb 21, 2025

We need to support:

Test Plan

$ cd llama_stack/providers/tests/inference
$ pytest -s -v -k fireworks test_embeddings.py \
   --inference-model nomic-ai/nomic-embed-text-v1.5 --env EMBEDDING_DIMENSION=784
$  pytest -s -v -k together test_embeddings.py \
   --inference-model togethercomputer/m2-bert-80M-8k-retrieval --env EMBEDDING_DIMENSION=784
$ pytest -s -v -k ollama test_embeddings.py \
   --inference-model all-minilm:latest --env EMBEDDING_DIMENSION=784

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Feb 21, 2025
@ashwinb ashwinb changed the title rfc: Add options for supporting various embedding models feat(api): Add options for supporting various embedding models Feb 21, 2025

class EmbeddingOptions(BaseModel):
dimensions: Optional[int] = None
text_truncation: Optional[TextTruncation] = None
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no image truncation is provided here, images are naturally pre-processed (which means resized / cropped) depending on the settings of each model typically. when a specific need arises to control this pre-processing, we may add an option here. cc @mattf

Copy link
Contributor Author

@ashwinb ashwinb Feb 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ehhuang default behavior should be to throw. is that communicated clearly by the value "None" or should it be an enum throw (or some other value)?

end = "end"


class EmbeddingContext(Enum):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is "Context" a good enough name? any better suggestion?

@@ -482,11 +506,17 @@ async def embeddings(
self,
model_id: str,
contents: List[InterleavedContent],
text_truncation: Optional[TextTruncation] = TextTruncation.none,
dimensions: Optional[int] = None,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpicking: should this be output_dimensions? dimensionality?

Copy link
Contributor

@ehhuang ehhuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG, minor comments on naming

Hoist options into the method args directly. Make TextTrunction.none be explicit.

make text_truncation optional

rename
@ashwinb ashwinb merged commit 81ce39a into main Feb 21, 2025
5 checks passed
@ashwinb ashwinb deleted the embed_api branch February 21, 2025 06:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot.
Projects
None yet
3 participants