
Minor Difference in Embeddings Between Batch and Single Sentence Encoding #3109

Closed
XYZliang opened this issue Dec 3, 2024 · 2 comments

@XYZliang

XYZliang commented Dec 3, 2024

Hi,

I noticed a minor difference in the embeddings when using batch encoding versus encoding sentences one by one and then merging the results.

Here are the details:
1. Model: SentenceTransformer('jina-embeddings-v3')
2. Batch encoding: embeddings = model.encode(sentence_list)
3. Single encoding: Using a loop, embedding = model.encode(single_sentence) for each sentence and then combining them.

When comparing the embeddings of the first sentence from both methods, the cosine similarity was 0.99996984, showing a very slight difference.
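
For reference, a minimal sketch of the comparison described above. The Hub id "jinaai/jina-embeddings-v3", the trust_remote_code=True flag, and the example sentences are assumptions about how the model is loaded, not part of the original report:

```python
# Sketch: compare batch encoding vs. per-sentence encoding of the same inputs.
import numpy as np
from sentence_transformers import SentenceTransformer

# Assumed Hub id and loading flag; adjust to your own setup.
model = SentenceTransformer("jinaai/jina-embeddings-v3", trust_remote_code=True)

sentence_list = [
    "The first sentence.",
    "The second sentence.",
    "The third sentence.",
]

# Batch encoding: all sentences in a single call.
batch_embeddings = model.encode(sentence_list)

# Single encoding: one call per sentence, then stacked into one array.
single_embeddings = np.stack([model.encode(s) for s in sentence_list])

# Cosine similarity between the two embeddings of the first sentence.
a, b = batch_embeddings[0], single_embeddings[0]
cos_sim = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"cosine similarity: {cos_sim:.8f}")  # very close to, but often not exactly, 1.0
```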

Could you please clarify if this is expected behavior? If so, is it due to internal optimizations, precision differences, or something else?

Thanks for your great work on this library!

@tomaarsen
Collaborator

Hello!

There are a few others who have found similar things (#2312, #2451). You can read through those as well, but in short: we believe this originates in torch, or even lower-level than that, rather than in sentence-transformers or transformers. I would not be surprised if it comes from hardware optimizations or approximations applied automatically for efficiency.

I could imagine that lower precision (e.g. model.bfloat16() or model.half()) could make this even a bit worse, but I don't think it will ever notably impact downstream performance such as retrieval, classification, clustering, etc.
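
As a rough torch-level illustration of the kind of effect described here (a sketch, not from the original comments): multiplying the same row alone versus as part of a batch can trigger different kernels or accumulation orders, so the results may differ by a few ULPs. The shapes and weight matrix below are arbitrary assumptions:

```python
import torch

torch.manual_seed(0)
x = torch.randn(8, 512)    # a "batch" of 8 inputs
w = torch.randn(512, 256)  # arbitrary weight matrix, stand-in for a linear layer

batch_out = x @ w           # all rows multiplied at once
single_out = x[0:1] @ w     # only the first row multiplied on its own

max_diff = (batch_out[0] - single_out[0]).abs().max().item()
# Often nonzero, especially on GPU; may be exactly 0 on some backends.
print(f"max abs difference: {max_diff:.2e}")
```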

See also the related issues reported against torch itself:

  • Tom Aarsen

@XYZliang
Author

XYZliang commented Dec 4, 2024

Hi Tom Aarsen,

Thank you so much for the detailed explanation! As this is my first time working with models like SentenceTransformers, your insights really helped me understand the subtle differences and the reasons behind them. It’s great to know that these minor variations are expected and generally won’t impact downstream tasks.

I truly appreciate the fantastic work you and the team have done on this library. It’s been a pleasure exploring its features, and I’m looking forward to learning more and contributing in the future!

Best regards
