Add embedding models configurable, from both transformers.js and TEI #646
Conversation
Hi, thanks for the contribution! Looks like a great PR, I like the way you handled the embeddings coming from different sources, should be quite easy to extend in the future.
I was wondering why you chose to specify the embedding at the model level?
For most use cases, I imagine that specifying a single embedding config for the entire app should be enough, no? Not 100% sure on this; maybe there are use cases where you need to support multiple embeddings, let me know!
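(For readers following along: a rough sketch of the model-level wiring being discussed, in the repo's env-based config style. The `embeddingModelName` field is the one mentioned in the commit log below; the model names and endpoint values here are illustrative, and a default embedding model is used when the field is omitted.)

```env
MODELS=`[
  {
    "name": "mistralai/Mistral-7B-Instruct-v0.1",
    "embeddingModelName": "Xenova/gte-small",
    "endpoints": [{ "type": "tgi", "url": "http://127.0.0.1:3000" }]
  }
]`
```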
Hi, thanks for the quick response.
Of course, then it would make a lot of sense to have different embeddings, I didn't think about those use cases 😁
Updated with automatic batching for TEI endpoints, using the /info route to calculate the max batch size. Also fixed a bug: when using Web Search, once the LLM finished outputting, the web search box and the sources disappeared and only became visible again after refreshing the page.
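(A minimal sketch of the /info-based batching idea, not the PR's exact code. It assumes TEI's documented `/info` response, which exposes `max_client_batch_size` and `max_batch_tokens`; a fuller version would also cap each batch by token count.)

```ts
// Query TEI's /info route for the server's limits, then embed in batches
// that respect max_client_batch_size.
interface TeiInfo {
  max_client_batch_size: number;
  max_batch_tokens: number;
}

async function embedWithTei(url: string, inputs: string[]): Promise<number[][]> {
  const info = (await (await fetch(`${url}/info`)).json()) as TeiInfo;
  const batchSize = Math.max(1, info.max_client_batch_size);

  const embeddings: number[][] = [];
  for (let i = 0; i < inputs.length; i += batchSize) {
    const res = await fetch(`${url}/embed`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ inputs: inputs.slice(i, i + batchSize) }),
    });
    embeddings.push(...((await res.json()) as number[][]));
  }
  return embeddings;
}
```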
Now supports models that require prefixes, such as https://huggingface.co/intfloat/multilingual-e5-large#faq.
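(The comment is truncated here, presumably ending by naming the new config options. Purely as an illustration, prefix support could be configured per embedding model roughly as below; the `preQuery`/`prePassage` key names are assumptions, chosen to match the `query: `/`passage: ` prefixes the e5 FAQ requires.)

```env
TEXT_EMBEDDING_MODELS=`[
  {
    "name": "intfloat/multilingual-e5-large",
    "chunkCharLength": 512,
    "preQuery": "query: ",
    "prePassage": "passage: ",
    "endpoints": [{ "type": "tei", "url": "http://127.0.0.1:8080" }]
  }
]`
```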
Another small comment is: instead of calling
Indeed. In #641, I've refactored the embeddings functionality. We plan to merge your PR first and then merge #641 afterwards. Therefore, please feel free to update the embeddings functionality in this PR to match that of #641.
Overall, great work! Looking forward to merging it 🚀
Hi @mishig25, thanks for the review, fixed it. |
LGTM (great job)! Asking a review from @nsarrazin
@nsarrazin maybe you can pay more attention to the embedding endpoint files: types/EmbeddingEndpoints.ts, transformersjs/embeddingEndpoints.ts, tei/embeddingEndpoints.ts, as they closely resemble the structure of #541.
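(For orientation, a simplified approximation of the shape those files suggest, not the repo's exact definitions: every backend exposes the same callable signature, and per-backend configs form a discriminated union.)

```ts
type EmbeddingEndpoint = (params: { inputs: string[] }) => Promise<number[][]>;

type EmbeddingEndpointConfig =
  | { type: "transformersjs" }
  | { type: "tei"; url: string; authorization?: string };

// Hypothetical constructors standing in for the real per-backend modules.
declare function transformersJsEmbeddingEndpoint(): EmbeddingEndpoint;
declare function teiEmbeddingEndpoint(config: {
  url: string;
  authorization?: string;
}): EmbeddingEndpoint;

// Adding a new backend means adding one union member and one case here.
function getEmbeddingEndpoint(config: EmbeddingEndpointConfig): EmbeddingEndpoint {
  switch (config.type) {
    case "transformersjs":
      return transformersJsEmbeddingEndpoint();
    case "tei":
      return teiEmbeddingEndpoint(config);
  }
}
```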
@mikelfried @nsarrazin actually I think we need one more config/setting:
Therefore, we would need a setting to specify by task as well. Wdyt?
I can see the use case, but I'm not sure what would be the best way to structure it in the config file. Maybe we could have three vars inside the model:
This option sounds good to me. I can do it in a subsequent PR.
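(The three vars were never spelled out in this thread. Purely as an illustration of the per-task idea, every key name below being hypothetical, the split could look like: a default embedding model plus overrides for web search and document/PDF chunking.)

```env
MODELS=`[
  {
    "name": "mistralai/Mistral-7B-Instruct-v0.1",
    "embeddingModelName": "Xenova/gte-small",
    "websearchEmbeddingModelName": "intfloat/multilingual-e5-large",
    "documentEmbeddingModelName": "intfloat/multilingual-e5-large"
  }
]`
```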
From my side, this PR LGTM!
This is working great, quite happy with it. I tested locally with the huggingchat config and it works out of the box. Thanks for the great contribution @mikelfried and I think we can merge this now!
We can add the extra vars in the PDF PR @mishig25, thanks for the detailed review as well!
one sec, don't merge yet
Ready to merge! @nsarrazin @mikelfried great job again!
Testing one last time that it doesn't break huggingchat and will merge 😄 |
Works well, merging it! thanks again for an awesome contrib @mikelfried |
…uggingface#646)

* Add embedding models configurable, from both Xenova and TEI
* fix lint and format
* Fix bug in sentenceSimilarity
* Batches for TEI using /info route
* Fix web search disapear when finish searching
* Fix lint and format
* Add more options for better embedding model usage
* Fixing CR issues
* Fix websearch disapear in later PR
* Fix lint
* Fix more minor code CR
* Valiadate embeddingModelName field in model config
* Add embeddingModel into shared conversation
* Fix lint and format
* Add default embedding model, and more readme explanation
* Fix minor embedding model readme detailed
* Update settings.json
* Update README.md

Co-authored-by: Mishig <[email protected]>

* Update README.md

Co-authored-by: Mishig <[email protected]>

* Apply suggestions from code review

Co-authored-by: Mishig <[email protected]>

* Resolved more issues
* lint
* Fix more issues
* Fix format
* fix small typo
* lint
* fix default model
* Rn `maxSequenceLength` -> `chunkCharLength`
* format
* add "authorization" example
* format

---------

Co-authored-by: Mishig <[email protected]>
Co-authored-by: Nathan Sarrazin <[email protected]>
Co-authored-by: Mishig Davaadorj <[email protected]>
Embedding models configurable, from both Xenova and TEI
Enables configuring the embedding model for the web search. It allows running the model both locally and on hosted machines utilizing GPUs.
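Concretely, the idea is roughly the following (a sketch in the repo's env-config style; names and values are illustrative): declare embedding models once, each backed either by a local transformers.js endpoint or by a remote TEI endpoint.

```env
TEXT_EMBEDDING_MODELS=`[
  {
    "name": "Xenova/gte-small",
    "description": "Runs locally on the server via transformers.js.",
    "chunkCharLength": 512,
    "endpoints": [{ "type": "transformersjs" }]
  },
  {
    "name": "intfloat/multilingual-e5-large",
    "description": "Served by a GPU-backed TEI instance.",
    "chunkCharLength": 512,
    "endpoints": [{ "type": "tei", "url": "http://127.0.0.1:8080" }]
  }
]`
```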
Work in progress
@mishig25's PR #641 changes the current embedding model code, so I will need to tweak the code according to his changes. This feature will be helpful for getting faster and larger embedding models for the PDF chunks (when TEI is hosted on a GPU), and it can be language specific.
Current limitations
When using TEI, there are env variables, MAX_CLIENT_BATCH_SIZE and MAX_BATCH_TOKENS, that limit the batch size, which I need to take into account. (I should make an /info request to the endpoint first and then send batches accordingly.)
Update: now taking MAX_CLIENT_BATCH_SIZE and MAX_BATCH_TOKENS of TEI into account.
Status
Currently works, will fix the limitations soon.
Please share your opinion 🙂