refactor: Add Request and Response types for embeddings #78

Open
1 task done
cvauclair opened this issue Oct 24, 2024 · 0 comments

  • I have looked for existing issues (including closed) about this

Feature Request

Bring the low-level embedding API closer to the completion API in terms of completeness and features.

Motivation

Unlike the completion API, which is neatly divided into low-level types and traits (e.g. the CompletionRequest and CompletionResponse types and the CompletionModel trait) and high-level traits (e.g. Chat and Prompt), the embeddings API makes no such distinction.

Instead, the embeddings API is centered around the EmbeddingModel trait, whose high-level interface is closer to the Prompt trait than to the low-level CompletionModel trait, and the EmbeddingsBuilder, which is analogous to the CompletionRequestBuilder but exposes a higher-level interface than its completion API counterpart.

This leads to two major drawbacks:

  1. The types/traits become bloated as they need to implement both low and high level functionality
  2. Since the low-level request and response types are missing, Rig gives users no way to track things like embedding model token usage, unlike the low-level completion API
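
To make the second drawback concrete, here is a minimal sketch of what a low-level response type could expose. The `Usage` struct, its field names, and the `total_usage` helper are assumptions made for illustration; they are not part of Rig's current API or of the proposal below.

```rust
/// Hypothetical token-usage record; the field names are illustrative only.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct Usage {
    pub prompt_tokens: u64,
    pub total_tokens: u64,
}

/// Hypothetical low-level response: the embeddings plus provider metadata.
#[derive(Debug)]
pub struct EmbeddingResponse<T> {
    pub embeddings: Vec<Vec<f64>>,
    /// `None` when the provider does not report usage.
    pub usage: Option<Usage>,
    /// Provider-specific payload, kept for callers that need it.
    pub raw_response: T,
}

/// Sums the usage reported across several responses,
/// skipping responses that carry no usage data.
pub fn total_usage<T>(responses: &[EmbeddingResponse<T>]) -> Usage {
    responses
        .iter()
        .filter_map(|r| r.usage)
        .fold(Usage::default(), |acc, u| Usage {
            prompt_tokens: acc.prompt_tokens + u.prompt_tokens,
            total_tokens: acc.total_tokens + u.total_tokens,
        })
}
```

With a type along these lines, a caller that embeds documents in several batches could add up the cost of the whole run, which the current high-level interface has no place to report.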

Proposal

  • Add EmbeddingRequest type
  • Add EmbeddingResponse type
  • Rename EmbeddingsBuilder to EmbeddingRequestBuilder
    • Change the build() method to return an EmbeddingRequest
    • Add the send() method
  • Change the EmbeddingModel to the following:
    pub trait EmbeddingModel: Clone + Send + Sync {
        /// The raw response type returned by the underlying embedding model.
        type Response: Send + Sync;
    
        /// Generates an embedding response for the given embedding request.
        fn embedding(
            &self,
            request: EmbeddingRequest,
        ) -> impl std::future::Future<Output = Result<EmbeddingResponse<Self::Response>, EmbeddingError>>
               + Send;
    
        /// Generates an embedding request builder.
        fn embedding_request(&self, prompt: &str) -> EmbeddingRequestBuilder<Self> {
            EmbeddingRequestBuilder::new(self.clone())
        }
    }
    • Note: This is analogous to the low-level CompletionModel trait from the completion API
  • Add new Embedding trait (final name tbd)
    pub trait Embedding: Send + Sync {
        /// Generates embeddings for the given documents
        fn embedding<T: Embed + Send>(
            &self, 
            documents: impl IntoIterator<Item = T>,
        ) -> impl std::future::Future<Output = Result<Vec<(T, OneOrMany<Embedding>)>, EmbeddingError>>
               + Send;
    }
    • Note: This is analogous to the high level Prompt trait from the completion API
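
As a rough illustration of how the proposed pieces could fit together, the sketch below wires a request type, a response type, a builder with build() and send(), and a low-level model trait. It is deliberately synchronous so it runs without an async executor, and `embedding_request` takes no argument here. The `document` method, the field names, the `EmbeddingError` definition, and the toy `LengthModel` are all assumptions for the example, not Rig's actual API.

```rust
/// Hypothetical low-level request: the raw documents to embed.
#[derive(Debug, Clone, PartialEq)]
pub struct EmbeddingRequest {
    pub documents: Vec<String>,
}

/// Hypothetical low-level response wrapping the provider's raw payload.
#[derive(Debug)]
pub struct EmbeddingResponse<T> {
    pub embeddings: Vec<Vec<f64>>,
    pub raw_response: T,
}

/// Placeholder error type for the sketch.
#[derive(Debug)]
pub struct EmbeddingError(pub String);

/// Synchronous stand-in for the proposed low-level trait.
pub trait EmbeddingModel: Clone {
    type Response;

    fn embedding(
        &self,
        request: EmbeddingRequest,
    ) -> Result<EmbeddingResponse<Self::Response>, EmbeddingError>;

    /// Starts a request builder bound to a clone of this model.
    fn embedding_request(&self) -> EmbeddingRequestBuilder<Self> {
        EmbeddingRequestBuilder::new(self.clone())
    }
}

pub struct EmbeddingRequestBuilder<M: EmbeddingModel> {
    model: M,
    documents: Vec<String>,
}

impl<M: EmbeddingModel> EmbeddingRequestBuilder<M> {
    pub fn new(model: M) -> Self {
        Self { model, documents: Vec::new() }
    }

    /// Queues one document (hypothetical method name).
    pub fn document(mut self, text: &str) -> Self {
        self.documents.push(text.to_string());
        self
    }

    /// Returns the request without sending it.
    pub fn build(self) -> EmbeddingRequest {
        EmbeddingRequest { documents: self.documents }
    }

    /// Builds the request and passes it to the model.
    pub fn send(self) -> Result<EmbeddingResponse<M::Response>, EmbeddingError> {
        let Self { model, documents } = self;
        model.embedding(EmbeddingRequest { documents })
    }
}

/// Toy model: embeds each document as a one-element vector holding its byte length.
#[derive(Clone)]
pub struct LengthModel;

impl EmbeddingModel for LengthModel {
    /// Pretend raw payload: the number of documents received.
    type Response = usize;

    fn embedding(
        &self,
        request: EmbeddingRequest,
    ) -> Result<EmbeddingResponse<usize>, EmbeddingError> {
        if request.documents.is_empty() {
            return Err(EmbeddingError("no documents".to_string()));
        }
        Ok(EmbeddingResponse {
            embeddings: request.documents.iter().map(|d| vec![d.len() as f64]).collect(),
            raw_response: request.documents.len(),
        })
    }
}
```

Under these assumptions, `LengthModel.embedding_request().document("hello").send()` returns a response whose raw payload stays available to the caller, while `build()` yields a plain `EmbeddingRequest` that can be inspected or sent later. A high-level trait could then be layered on top of `send()` without adding anything to the low-level types.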

Alternatives

N/A

@mateobelanger mateobelanger added this to the v0.5 milestone Nov 12, 2024
@cvauclair cvauclair removed this from the v0.5 milestone Dec 9, 2024