refactor: Add generic type to VectorStoreIndex trait #58

Open
1 task done
cvauclair opened this issue Oct 15, 2024 · 0 comments
  • I have looked for existing issues (including closed) about this

Feature Request

Refactor the VectorStoreIndex trait to add a generic type parameter representing the type of the documents held in the store. This would remove the need for a generic type parameter on the top_n method.

Motivation

The goal of this change is to improve the developer experience when working with vector stores. Specifically, it solves the problem of developers having to specify the document type associated with a vector store twice. For instance, InMemoryVectorStore is itself already parametrized by some generic type D, yet the type T of the top_n implementation cannot be inferred, even though it should in fact be the same as the type D of the store. A similar situation occurs with MongoDbVectorStore, whose constructor takes a Collection<T>, which implies that the return type of the top_n method is also T; currently the type has to be specified twice.
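To make the duplication concrete, here is a minimal sketch of the problem, not the crate's actual code. It is synchronous and uses only the standard library. `MethodGenericIndex`, `ToyStore`, `Note` and `best_ids` are hypothetical names, and a `From<String>` bound stands in for the real `Deserialize` bound.

```rust
// Simplified, synchronous stand-in for the current trait shape: the document
// type `T` is a parameter of the method, not of the trait.
// Assumption: `From<String>` replaces the real `Deserialize` bound so that the
// sketch builds with the standard library only.
trait MethodGenericIndex {
    fn top_n<T: From<String>>(&self, query: &str, n: usize) -> Vec<(f64, String, T)>;
}

// Hypothetical store, already parametrized by its document type `D`.
struct ToyStore<D> {
    docs: Vec<(String, D)>,
}

impl<D: Clone + Into<String>> MethodGenericIndex for ToyStore<D> {
    fn top_n<T: From<String>>(&self, _query: &str, n: usize) -> Vec<(f64, String, T)> {
        // No real similarity search: take the first `n` documents with a dummy
        // score and round-trip each one through `String` to build a `T`.
        self.docs
            .iter()
            .take(n)
            .map(|(id, doc)| {
                let raw: String = doc.clone().into();
                (1.0, id.clone(), T::from(raw))
            })
            .collect()
    }
}

// Hypothetical document type.
#[derive(Clone, Debug, PartialEq)]
struct Note(String);

impl From<String> for Note {
    fn from(s: String) -> Self {
        Note(s)
    }
}

impl From<Note> for String {
    fn from(n: Note) -> Self {
        n.0
    }
}

fn best_ids(store: &ToyStore<Note>, n: usize) -> Vec<String> {
    // `D = Note` is already fixed by the store's type, yet `T` must be named
    // again: nothing else in this function constrains it, so dropping the
    // turbofish leaves the compiler asking for a type annotation.
    store
        .top_n::<Note>("query", n)
        .into_iter()
        .map(|(_, id, _)| id)
        .collect()
}
```

The round trip through `String` mirrors how a method-level `T` can only be produced by deserializing, since the impl cannot add a bound relating `T` to `D`.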

Proposal

Refactor the VectorStoreIndex trait like so:

pub trait VectorStoreIndex<T: for<'a> Deserialize<'a>>: Send + Sync {
    /// Get the top n documents based on the distance to the given query.
    /// The result is a list of tuples of the form (score, id, document)
    fn top_n(
        &self,
        query: &str,
        n: usize,
    ) -> impl std::future::Future<Output = Result<Vec<(f64, String, T)>, VectorStoreError>> + Send;

    /// Same as `top_n` but returns the document ids only.
    fn top_n_ids(
        &self,
        query: &str,
        n: usize,
    ) -> impl std::future::Future<Output = Result<Vec<(f64, String)>, VectorStoreError>> + Send;
}
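As a rough illustration of how an implementation and a call site might look under this proposal, here is a sketch that uses only the standard library. It assumes Rust 1.75 or later for `impl Future` in trait methods. It drops the `Deserialize` bound and `top_n_ids` for brevity, and uses a placeholder `VectorStoreError`. `ToyStore`, `NoopWake`, `block_on` and `best_ids` are hypothetical helpers, not part of the crate.

```rust
use std::future::{ready, Future};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Placeholder for the crate's error type; the real definition is not reproduced here.
#[derive(Debug)]
pub struct VectorStoreError;

// The proposed shape: `T` lives on the trait. The `Deserialize` bound from the
// proposal is dropped so that the sketch builds without serde.
pub trait VectorStoreIndex<T>: Send + Sync {
    fn top_n(
        &self,
        query: &str,
        n: usize,
    ) -> impl Future<Output = Result<Vec<(f64, String, T)>, VectorStoreError>> + Send;
}

// Hypothetical in-memory store, parametrized by its document type `D`.
pub struct ToyStore<D> {
    pub docs: Vec<(String, D)>,
}

// A single impl ties the trait's `T` to the store's `D`.
impl<D: Clone + Send + Sync> VectorStoreIndex<D> for ToyStore<D> {
    fn top_n(
        &self,
        _query: &str,
        n: usize,
    ) -> impl Future<Output = Result<Vec<(f64, String, D)>, VectorStoreError>> + Send {
        // No real similarity search: return the first `n` documents with a dummy score.
        let hits: Vec<(f64, String, D)> = self
            .docs
            .iter()
            .take(n)
            .map(|(id, doc)| (1.0, id.clone(), doc.clone()))
            .collect();
        ready(Ok(hits))
    }
}

// Toy executor so the sketch runs without an async runtime: it polls in a
// loop with a waker that does nothing.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

pub fn best_ids(store: &ToyStore<&'static str>, n: usize) -> Vec<String> {
    // No turbofish needed: the only impl for `ToyStore<D>` is
    // `VectorStoreIndex<D>`, so `T` is inferred from the store's type.
    let hits = block_on(store.top_n("query", n)).expect("toy store never fails");
    hits.into_iter().map(|(_, id, _)| id).collect()
}
```

With the type on the trait, a store that can only ever return one document type states that once, in its impl, and callers get it through inference.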