Cedar provides an easy-to-set-up, in-memory vector database that you can embed in your Rust application.
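// This snippet assumes cedar's own types (DuckDB, LocalClient, Document, and the
// embedding functions) are already in scope; see the crate docs for their paths.
use serde_json::json;
use uuid::Uuid;

// 1. initialize db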
let db = DuckDB::new(Default::default())?;
db.init()?;
// 2. initialize embedding function (could be OpenAI, Chrome, etc.)
let embedding_fn = SentenceTransformerEmbeddings::new();
// Or use OpenAI embeddings instead (uncomment to replace the line above):
// let embedding_fn = OpenAIEmbeddingFunction::new(
//     "<api_key>".to_string(),
// );
// 3. initialize client
let mut client = LocalClient::init(db, embedding_fn)?;
// 4. create a collection
let mut collection = client.create_collection("collection1")?;
// 5. push documents to the store
let docs = &[
Document {
text: "this is about macbooks".to_string(),
metadata: json!({ "source": "laptops" }),
id: Uuid::new_v4(),
},
Document {
text: "lychees are better than mangoes".to_string(),
metadata: json!({ "source": "facts" }),
id: Uuid::new_v4(),
},
];
collection.add_documents(docs)?;
// 6. query the vector store for matching documents
let k = 1;
let res = collection.query_documents(&["which one is the better fruit?"], k, json!({ "source": "facts" }))?;
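
The query in step 6 asks for the `k` closest documents whose metadata matches a filter. The following is a minimal, dependency-free sketch of that idea, not cedar's actual implementation: it assumes cosine similarity as the distance measure (the text above does not say which one cedar uses), and the `Entry` struct, the `top_k` function, and the two-dimensional toy embeddings are all hypothetical.

```rust
use std::collections::HashMap;

/// A stored entry: text, its embedding, and flat string metadata (hypothetical shape).
struct Entry {
    text: String,
    embedding: Vec<f32>,
    metadata: HashMap<String, String>,
}

/// Cosine similarity of two equal-length vectors; 0.0 if either has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Texts of up to `k` entries with `metadata[key] == value`, most similar first.
fn top_k<'a>(entries: &'a [Entry], query: &[f32], k: usize, key: &str, value: &str) -> Vec<&'a str> {
    let mut scored: Vec<(f32, &str)> = entries
        .iter()
        // Apply the metadata filter before ranking.
        .filter(|e| e.metadata.get(key).map(String::as_str) == Some(value))
        .map(|e| (cosine_similarity(&e.embedding, query), e.text.as_str()))
        .collect();
    // Sort by descending similarity.
    scored.sort_by(|a, b| b.0.total_cmp(&a.0));
    scored.into_iter().take(k).map(|(_, text)| text).collect()
}

/// Build an entry with a single "source" metadata field.
fn entry(text: &str, embedding: [f32; 2], source: &str) -> Entry {
    Entry {
        text: text.to_string(),
        embedding: embedding.to_vec(),
        metadata: HashMap::from([("source".to_string(), source.to_string())]),
    }
}

/// Toy data: the embeddings are made up for illustration, not model output.
fn sample_entries() -> Vec<Entry> {
    vec![
        entry("this is about macbooks", [0.9, 0.1], "laptops"),
        entry("lychees are better than mangoes", [0.2, 0.8], "facts"),
        entry("the sky is blue", [0.9, 0.2], "facts"),
    ]
}

fn main() {
    let entries = sample_entries();
    // Pretend [0.1, 0.9] is the embedding of "which one is the better fruit?".
    let hits = top_k(&entries, &[0.1, 0.9], 1, "source", "facts");
    println!("{:?}", hits); // prints ["lychees are better than mangoes"]
}
```

Filtering before ranking keeps documents from other sources out of the top `k`, which is the behavior the `json!({ "source": "facts" })` argument suggests.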
To use cedar in your project, start by adding it to your Cargo.toml. (A standalone cedar server is coming soon!)
[dependencies]
cedar-db = "0.1.0"
uuid = { version = "1.3.2", features = ["v4", "fast-rng"] }
serde_json = "1.0.96"
cedar uses the tch-rs bindings for PyTorch. To set up the bindings, follow these steps:
- Download libtorch from https://pytorch.org/get-started/locally/. This package requires v2.0.0: if this version is no longer available on the "get started" page, the file should be accessible by modifying the target link, for example https://download.pytorch.org/libtorch/cu118/libtorch-cxx11-abi-shared-with-deps-2.0.0%2Bcu118.zip for a Linux version with CUDA 11. NOTE: When using rust-bert as a dependency from crates.io, please check the required LIBTORCH on the published package readme, as it may differ from the version documented here (which applies to the current repository version).
- Extract the library to a location of your choice.
- Set the following environment variables:
# Linux
export LIBTORCH=/path/to/libtorch
export LD_LIBRARY_PATH=${LIBTORCH}/lib:$LD_LIBRARY_PATH

# macOS (Homebrew)
brew install pytorch jq
export LIBTORCH=$(brew --cellar pytorch)/$(brew info --json pytorch | jq -r '.[0].installed[0].version')
export LD_LIBRARY_PATH=${LIBTORCH}/lib:$LD_LIBRARY_PATH
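
A missing or mistyped `LIBTORCH` usually shows up only as a confusing build or link error, so it can help to check the variable before building. The following is a small sketch of such a check using only the standard library. The helper `libtorch_lib_dir` is hypothetical (it is not part of cedar or tch-rs), and the only layout it assumes is the `lib` subdirectory that the exports above add to `LD_LIBRARY_PATH`.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Given the value of LIBTORCH (if any), return the `lib` directory expected inside it.
/// Hypothetical helper for illustration; it does not touch the filesystem.
fn libtorch_lib_dir(libtorch: Option<&str>) -> Result<PathBuf, String> {
    match libtorch {
        None => Err("LIBTORCH is not set".to_string()),
        Some(p) if p.trim().is_empty() => Err("LIBTORCH is set but empty".to_string()),
        Some(p) => Ok(Path::new(p).join("lib")),
    }
}

fn main() {
    // Read the variable from the current environment, if present.
    let value = env::var("LIBTORCH").ok();
    match libtorch_lib_dir(value.as_deref()) {
        Ok(dir) if dir.is_dir() => println!("found libtorch libraries in {}", dir.display()),
        Ok(dir) => eprintln!("LIBTORCH is set, but {} is not a directory", dir.display()),
        Err(msg) => eprintln!("{msg}"),
    }
}
```

Running this in the same shell where the variables were exported shows whether the path points at an extracted libtorch directory.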