Minor Difference in Embeddings Between Batch and Single Sentence Encoding #3109
Hello! There are a few others who found similar things (#2312, #2451). You can read through those as well, but in short: we believe this originates in how batched inputs are padded and processed together, which changes the order of the underlying floating-point operations compared to encoding a single sentence. I could imagine that lower precision would exaggerate the effect. See also the related issues linked above.
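The floating-point explanation can be illustrated without any model at all: floating-point addition is not associative, so summing the same numbers in a different order (roughly what a batched kernel does relative to a single-item kernel) can round differently in the last bits. A minimal sketch in plain Python/NumPy:

```python
import numpy as np

# Floating-point addition is not associative: the same numbers summed
# in a different order can round differently in the last bits.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False

# The same effect at scale: sum a float32 array forward vs. backward.
rng = np.random.default_rng(0)
x = rng.standard_normal(10_000).astype(np.float32)

forward = np.float32(0.0)
for v in x:
    forward = np.float32(forward + v)

backward = np.float32(0.0)
for v in x[::-1]:
    backward = np.float32(backward + v)

# The two sums agree to several decimal places but are generally not
# bit-identical -- the same order of effect as a cosine similarity of
# ~0.99997 instead of exactly 1.0.
print(forward, backward)
```

The batched and looped encoding paths reorder operations in an analogous way, so tiny last-bit differences in the embeddings are expected.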
Hi Tom Aarsen,

Thank you so much for the detailed explanation! As this is my first time working with models like SentenceTransformers, your insights really helped me understand the subtle differences and the reasons behind them. It's great to know that these minor variations are expected and generally won't impact downstream tasks.

I truly appreciate the fantastic work you and the team have done on this library. It's been a pleasure exploring its features, and I'm looking forward to learning more and contributing in the future!

Best regards
Hi,
I noticed a minor difference in the embeddings when using batch encoding versus encoding sentences one by one and then merging the results.
Here are the details:
1. Model: `SentenceTransformer('jina-embeddings-v3')`
2. Batch encoding: `embeddings = model.encode(sentence_list)`
3. Single encoding: in a loop, `embedding = model.encode(single_sentence)` for each sentence, then combining the results.

When comparing the embeddings of the first sentence from both methods, the cosine similarity was 0.99996984, showing a very slight difference.
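The comparison above can be sketched as a small script. The model name is taken from the report; the sentences and the `cosine` / `compare_batch_vs_single` helpers are illustrative names, not part of the library's API, and any SentenceTransformer model should show the same effect:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def compare_batch_vs_single(model, sentences):
    """Encode `sentences` in one batch and one at a time, then return
    the cosine similarity between the two embeddings of the first
    sentence. `model` is anything with an `encode` method."""
    batch_emb = model.encode(sentences)                     # one batched call
    single_emb = np.stack([model.encode(s) for s in sentences])  # loop, then stack
    return cosine(batch_emb[0], single_emb[0])

if __name__ == "__main__":
    from sentence_transformers import SentenceTransformer
    model = SentenceTransformer("jina-embeddings-v3")  # model name as in the report
    sim = compare_batch_vs_single(model, ["A first sentence.", "A second sentence."])
    print(sim)  # the report observed ~0.99996984 rather than exactly 1.0
```

A similarity just below 1.0 here reflects the floating-point ordering effects described in the reply, not a bug in either encoding path.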
Could you please clarify if this is expected behavior? If so, is it due to internal optimizations, precision differences, or something else?
Thanks for your great work on this library!