Generalize and simplify the vector store interface (and the integration of third party vector stores).
Motivation
Rig's current approach to vector stores and vector search is lacking in multiple ways:
The interface forces developers to use the DocumentEmbeddings type, which is somewhat opinionated and a little over-engineered.
The interface doesn't lend itself well to use cases where a developer already has a populated vector store since the interface expects the vector store to be modeled after DocumentEmbeddings.
The process of integrating new vector stores is convoluted for non-document databases (e.g., Postgres, LanceDB) since DocumentEmbeddings was designed for document vector stores.
The interface assumes that users would use Rig constructs (e.g., DocumentEmbeddings) to populate their vector store.
Proposal
Remove the VectorStore trait and simplify the VectorStoreIndex trait to the following methods only:
    pub trait VectorStoreIndex: Send + Sync {
        /// Get the top n documents based on the distance to the given query.
        /// The documents are deserialized into the given type.
        async fn top_n_from_query<T: for<'a> Deserialize<'a>>(
            &self,
            query: &str,
            n: usize,
        ) -> Result<Vec<(f64, T)>, VectorStoreError>;

        /// Same as `top_n_from_query` but returns the document ids only.
        async fn top_n_ids_from_query(
            &self,
            query: &str,
            n: usize,
        ) -> Result<Vec<(f64, String)>, VectorStoreError>;

        /// Get the top n documents based on the distance to the given embedding.
        /// The documents are deserialized into the given type.
        async fn top_n_from_embedding<T: for<'a> Deserialize<'a>>(
            &self,
            embedding: &Embedding,
            n: usize,
        ) -> Result<Vec<(f64, T)>, VectorStoreError>;

        /// Same as `top_n_from_embedding` but returns the document ids only.
        async fn top_n_ids_from_embedding(
            &self,
            embedding: &Embedding,
            n: usize,
        ) -> Result<Vec<(f64, String)>, VectorStoreError>;
    }
Remove the DocumentEmbeddings type entirely.
Update the Agent type accordingly (we could enforce that the type T stored in the vector store also implements ToString, so that we can easily insert the dynamic context into the agent's prompt).
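To make the shape of the proposal concrete, here is a minimal sketch of an in-memory index that follows the embedding-based half of the proposed trait. It is a simplified, synchronous, std-only stand-in and not Rig's actual API: `InMemoryIndex`, `Entry`, `cosine_distance` and `build_context` are hypothetical names, embeddings are plain `&[f64]` slices instead of `Embedding`, and documents are cloned rather than deserialized. It also shows how a `ToString` bound on T would let an agent turn search hits into prompt context.

```rust
/// Hypothetical stored row: an id, its embedding, and a user-defined document.
pub struct Entry<T> {
    id: String,
    embedding: Vec<f64>,
    doc: T,
}

/// Hypothetical in-memory stand-in for a vector store index.
pub struct InMemoryIndex<T> {
    entries: Vec<Entry<T>>,
}

/// Cosine distance (1 - cosine similarity); returns 1.0 if either vector is all zeros.
fn cosine_distance(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 1.0;
    }
    1.0 - dot / (norm_a * norm_b)
}

impl<T: Clone> InMemoryIndex<T> {
    pub fn new() -> Self {
        Self { entries: Vec::new() }
    }

    /// The store is populated however the user likes; no Rig-specific wrapper type is needed.
    pub fn insert(&mut self, id: &str, embedding: Vec<f64>, doc: T) {
        self.entries.push(Entry { id: id.to_string(), embedding, doc });
    }

    /// Score every entry against the embedding and keep the n closest.
    fn ranked(&self, embedding: &[f64], n: usize) -> Vec<(f64, &Entry<T>)> {
        let mut scored: Vec<_> = self
            .entries
            .iter()
            .map(|e| (cosine_distance(&e.embedding, embedding), e))
            .collect();
        scored.sort_by(|a, b| a.0.total_cmp(&b.0));
        scored.truncate(n);
        scored
    }

    /// Analogue of `top_n_from_embedding`: returns (distance, document) pairs.
    pub fn top_n_from_embedding(&self, embedding: &[f64], n: usize) -> Vec<(f64, T)> {
        self.ranked(embedding, n)
            .into_iter()
            .map(|(d, e)| (d, e.doc.clone()))
            .collect()
    }

    /// Analogue of `top_n_ids_from_embedding`: returns (distance, id) pairs.
    pub fn top_n_ids_from_embedding(&self, embedding: &[f64], n: usize) -> Vec<(f64, String)> {
        self.ranked(embedding, n)
            .into_iter()
            .map(|(d, e)| (d, e.id.clone()))
            .collect()
    }
}

/// With a `ToString` bound on T, an agent can join the hits into prompt context.
pub fn build_context<T: ToString>(hits: &[(f64, T)]) -> String {
    hits.iter()
        .map(|(_, doc)| doc.to_string())
        .collect::<Vec<_>>()
        .join("\n")
}
```

In this sketch the index never needs to know how documents were produced or stored; it only needs a way to rank them and hand back either the document or its id, which is the point of dropping DocumentEmbeddings from the interface.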
Update the EmbeddingsBuilder type accordingly.
Alternatives
Open to alternatives
Implementation Checklist
DocumentEmbeddings type #40