This repository has been archived by the owner on Jun 24, 2024. It is now read-only.
As pointed out in #291, the quality of embeddings produced by the models at present appears to be suboptimal.
Our current approach uses the embedding of the final token as the representation for the entire input sequence, which can discard semantic information from earlier tokens. The approach employed by SGPT: GPT Sentence Embeddings for Semantic Search offers an alternative: a position-weighted mean pooling over the embeddings of all tokens in the input sequence. On the MTEB benchmark, this method produces noticeably better embeddings.
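For reference, SGPT's position-weighted mean pooling is simple to state: token `i` (1-based, out of `n` tokens) gets weight `i / (1 + 2 + ... + n)`, so later tokens count more. A minimal Rust sketch, assuming we already have one hidden-state vector per token; the function name and input layout here are illustrative, not part of the current API:

```rust
/// Position-weighted mean pooling over per-token hidden states, as in SGPT:
/// later tokens receive proportionally larger weights.
///
/// `token_embeddings` holds one `Vec<f32>` of length `dim` per input token.
fn weighted_mean_pool(token_embeddings: &[Vec<f32>]) -> Vec<f32> {
    let n = token_embeddings.len();
    assert!(n > 0, "need at least one token embedding");
    let dim = token_embeddings[0].len();

    // Normalizing constant: 1 + 2 + ... + n.
    let total = (n * (n + 1) / 2) as f32;

    let mut pooled = vec![0.0f32; dim];
    for (i, emb) in token_embeddings.iter().enumerate() {
        // 1-based position weight: w_i = (i + 1) / total.
        let w = (i + 1) as f32 / total;
        for (acc, x) in pooled.iter_mut().zip(emb.iter()) {
            *acc += w * x;
        }
    }
    pooled
}

fn main() {
    // Toy example: three token embeddings of dimension 2.
    let toks = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 1.0]];
    println!("{:?}", weighted_mean_pool(&toks));
}
```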
This raises the question: should we integrate this method into our implementation, or should we leave it to users to extract the per-token embeddings themselves and do the pooling on their side?
Good catch! I think we should integrate this, but keep it separate from the existing embeddings. I'm also not sure how best to expose it. Any ideas for API changes that are understandable and restricted to the places where pooling actually makes sense? This would only make sense with feed_prompt, right?
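For the sake of discussion, one option would be to let the caller pick a pooling strategy instead of adding a parallel embeddings path. A rough sketch, with all names hypothetical and not existing crate types:

```rust
/// Hypothetical pooling selector; none of these names exist in the crate today.
pub enum EmbeddingPooling {
    /// Current behaviour: hidden state of the final token only.
    LastToken,
    /// SGPT-style position-weighted mean over all token hidden states.
    WeightedMean,
}
```

The output request that feed_prompt fills could then carry an `EmbeddingPooling`, so callers only pay for pooling when they actually request embeddings, and the existing last-token behaviour stays the default.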