feat: Concurrent OpenAI Embeddings #8
Conversation
Hi @Butch78! Thank you for opening the PR and for diving into concurrency to improve the speed of generating embeddings! I'm glad the PDF example was helpful for your use case. I've previously looked into concurrent processing for the Bert embeddings using the Rayon crate. One challenge I faced was ensuring the order of the output embeddings. Did your implementation successfully maintain the correct order? I'd be interested to see how you tackled this challenge. Looking forward to your thoughts!
My Embeddings implementation lacks order preservation, primarily due to my initial oversight of its importance. I did some further research, and we could use a queue system, which would solve this issue (#9). One possible solution is creating something similar to the recently released Hugging Face text-embeddings-inference repository. This will hopefully ensure that order is consistently maintained while still allowing good performance! I'll change my pull request to use `tokio` async tasks.

I updated my code to use `tokio` async tasks.
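For context, here is a minimal sketch (not the PR's actual code, and assuming Tokio) of the index-tagging idea: each spawned task carries the index of its input, and the results are sorted by that index at the end. The function and names are illustrative only; plain `tokio::spawn` plus a channel works the same way as the `JoinSet` used here.

```rust
use tokio::task::JoinSet;

// A sketch, not the PR's code: tag each input with its index so the
// output order can be restored after the concurrent work completes.
async fn embed_all(inputs: Vec<String>) -> Vec<(usize, String)> {
    let mut set = JoinSet::new();
    for (i, input) in inputs.into_iter().enumerate() {
        set.spawn(async move {
            // Stand-in for the real OpenAI embedding request.
            let embedding = format!("embedding of {input}");
            (i, embedding)
        });
    }
    // Tasks finish in arbitrary order; collect, then sort by index.
    let mut results = Vec::new();
    while let Some(res) = set.join_next().await {
        results.push(res.expect("task panicked"));
    }
    results.sort_by_key(|(i, _)| *i);
    results
}
```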
Overall I really like your solution! It's easy to understand yet effective. I just left some suggestions to make it cleaner (I think) using Tokio. I like to avoid adding extra dependencies unless absolutely necessary (I am looking at ways to reduce the number of dependencies we have right now). Also, if you think any of my suggestions are wrong or there is a better way to do it, let me know.
orca/src/llm/openai.rs
```rust
tokio::spawn(async move {
    let res = client.execute(req).await.map_err(|e| e.to_string())?;
    let response = res.json::<OpenAIEmbeddingResponse>().await.map_err(|e| e.to_string())?;
    sender.send((i, response)).unwrap();
});
```
This could look something like `sender.send((i, response)).await;` using Tokio channels.
Edit: Actually, the current method of error handling within the spawned tasks can be problematic. It might be better to send back a Result type over the channel, so the receiver can decide how to handle errors.
```rust
tokio::spawn(async move {
    let result = async {
        let res = client.execute(req).await.map_err(|e| EmbeddingError::RequestError(e.to_string()))?;
        let response = res.json::<OpenAIEmbeddingResponse>().await.map_err(|e| EmbeddingError::ResponseError(e.to_string()))?;
        Ok(response)
    }
    .await;
    // Send back the result (success or error) via the channel.
    sender.send((i, result)).await.expect("Failed to send over channel");
});
```
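To complete the picture, a hedged sketch of what the receiving side of that design might look like: the receiver collects `(index, result)` pairs and places each response back into its original slot, restoring input order. The helper name and signature are hypothetical, not from the PR.

```rust
use tokio::sync::mpsc;

// Hypothetical receiving side: each (index, result) pair goes back into
// its original slot, so the output order matches the input order.
async fn collect_ordered<T, E>(
    mut rx: mpsc::Receiver<(usize, Result<T, E>)>,
    n_inputs: usize,
) -> Result<Vec<T>, E> {
    let mut slots: Vec<Option<T>> = (0..n_inputs).map(|_| None).collect();
    while let Some((i, result)) = rx.recv().await {
        slots[i] = Some(result?); // bail out on the first task error
    }
    // Every sender has been dropped by now, so each slot should be filled.
    Ok(slots.into_iter().map(|s| s.unwrap()).collect())
}
```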
I just added simple `format!` errors, but I was wondering: did you want to create a `thiserror` type for the `EmbeddingError`, like in your example?
`format!` should be fine for now! I just made up some generic type there for example purposes. Tbh Rust error handling still confuses me, so I just try to use `anyhow` for everything. In the future my plan is to clean up Orca's error handling.
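For reference, a sketch of what a `thiserror`-based error type along these lines could look like; the variant names mirror the suggestion above, but none of this is actual Orca code.

```rust
use thiserror::Error;

// Hypothetical error type matching the variants in the suggestion above;
// thiserror derives std::error::Error and Display from the attributes.
#[derive(Debug, Error)]
pub enum EmbeddingError {
    #[error("request failed: {0}")]
    RequestError(String),
    #[error("failed to parse response: {0}")]
    ResponseError(String),
}
```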
@Butch78 huh, strange. They must have removed the `Clone` implementation for that type.
It's also very strange to me how the CI workflow keeps failing on PRs but passes on pushes to the main branch...
Hi @Butch78, I made a small change to drop the original `sender` after spawning the tasks. Since there was still one sender "alive", the receiver loop did not exit. I just added a simple `drop(sender)`.
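The underlying behavior: a Tokio mpsc channel only closes once every `Sender` handle has been dropped, so the receiver loop keeps waiting while the original handle is alive. A minimal, self-contained sketch of the pattern (illustrative names, not the PR's code):

```rust
use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    let (sender, mut receiver) = mpsc::channel::<usize>(8);

    for i in 0..4 {
        let sender = sender.clone();
        tokio::spawn(async move {
            sender.send(i).await.expect("receiver dropped");
        });
    }

    // Without this line the loop below never exits: the original `sender`
    // keeps the channel open even after every spawned task has finished.
    drop(sender);

    // recv() yields None once all sender clones have been dropped.
    while let Some(i) = receiver.recv().await {
        println!("got {i}");
    }
}
```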
Amazing work! Once I ran …
Approving!
Hi @santiagomed,
Thank you for creating the PDF example; it helped so much with what I want to do!
I think my laptop is a little under-powered, because it is taking forever to make the Bert embeddings 😅
So I had a go at creating concurrent requests for the OpenAI embeddings, and it worked much faster when I tested the rev on my local PDFs.