[Feature Request] Proposal to Allow Dynamic Use of Additional Embedding Models #1060

Closed
e7217 opened this issue Dec 18, 2024 · 2 comments · Fixed by #1063
Labels
enhancement New feature or request

Comments

@e7217
Contributor

e7217 commented Dec 18, 2024

Is your feature request related to a problem? Please describe.
Hello!

Could the system be improved so that embedding models other than the predefined ones can be used dynamically? At the moment the model is looked up by key in a fixed dictionary:

self.embedding = embedding_models[embedding_model]()

embedding_models = {
    # llama index
    "openai": LazyInit(
        OpenAIEmbedding
    ),  # default model is OpenAIEmbeddingModelType.TEXT_EMBED_ADA_002
    "openai_embed_3_large": LazyInit(
        OpenAIEmbedding, model_name=OpenAIEmbeddingModelType.TEXT_EMBED_3_LARGE
    ),
    "openai_embed_3_small": LazyInit(
        OpenAIEmbedding, model_name=OpenAIEmbeddingModelType.TEXT_EMBED_3_SMALL
    ),
    "mock": LazyInit(MockEmbeddingRandom, embed_dim=768),
    # langchain
    "openai_langchain": LazyInit(OpenAIEmbeddings),
}
try:
    # you can use your own model in this way.
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding

    embedding_models["huggingface_baai_bge_small"] = LazyInit(
        HuggingFaceEmbedding, model_name="BAAI/bge-small-en-v1.5"
    )
    embedding_models["huggingface_cointegrated_rubert_tiny2"] = LazyInit(
        HuggingFaceEmbedding, model_name="cointegrated/rubert-tiny2"
    )
    embedding_models["huggingface_all_mpnet_base_v2"] = LazyInit(
        HuggingFaceEmbedding,
        model_name="sentence-transformers/all-mpnet-base-v2",
        max_length=512,
    )
    embedding_models["huggingface_bge_m3"] = LazyInit(
        HuggingFaceEmbedding, model_name="BAAI/bge-m3"
    )
except ImportError:
    logger.info(
        "You are using API version of AutoRAG."
        "To use local version, run pip install 'AutoRAG[gpu]'"
    )
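
One possible direction, sketched here as a self-contained illustration and not as AutoRAG's actual code: keep the predefined dictionary, but expose a small helper that registers extra entries at runtime. `LazyInit` below is only a minimal stand-in for the class of the same name used above, and `register_embedding`, `get_embedding`, and `FakeEmbedding` are hypothetical names invented for this example.

```python
from typing import Any, Callable, Dict


class LazyInit:
    """Minimal stand-in: stores a factory and its kwargs, builds on call."""

    def __init__(self, factory: Callable[..., Any], **kwargs: Any) -> None:
        self._factory = factory
        self._kwargs = kwargs

    def __call__(self) -> Any:
        return self._factory(**self._kwargs)


class FakeEmbedding:
    """Hypothetical embedding class so the sketch runs without extra packages."""

    def __init__(self, model_name: str = "default", embed_dim: int = 8) -> None:
        self.model_name = model_name
        self.embed_dim = embed_dim


# Predefined entries, mirroring the shape of the dictionary above.
embedding_models: Dict[str, LazyInit] = {
    "mock": LazyInit(FakeEmbedding, embed_dim=768),
}


def register_embedding(name: str, factory: Callable[..., Any], **kwargs: Any) -> None:
    """Add a new entry at runtime; refuse to overwrite an existing key."""
    if name in embedding_models:
        raise KeyError(f"embedding model {name!r} is already registered")
    embedding_models[name] = LazyInit(factory, **kwargs)


def get_embedding(name: str) -> Any:
    """Instantiate the embedding registered under `name`."""
    return embedding_models[name]()


# Usage: register a custom model without editing the predefined dictionary.
register_embedding("my_custom_model", FakeEmbedding, model_name="my-org/my-model")
custom = get_embedding("my_custom_model")
print(custom.model_name)
```

With something along these lines, the existing lookup `embedding_models[embedding_model]()` would keep working unchanged, since registered entries live in the same dictionary as the predefined ones.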

@vkehfdl1
Contributor

@e7217 Do you have any ideas?
If you can suggest a way to configure the embedding model, I think we can discuss how to implement it.

@e7217
Contributor Author

e7217 commented Dec 19, 2024

@vkehfdl1 Thank you for your reply.
I will also consider a good solution for this.

@vkehfdl1 vkehfdl1 linked a pull request Jan 4, 2025 that will close this issue
vkehfdl1 added a commit that referenced this issue Jan 4, 2025
* feat: dynamic embed init

* feat: apply embeddingbase class to vectordb

* feat: add tests and optimize codes

* fix: update import path of embedding_models from autorag to autorag.embedding.base

* fix: import err

* fix: autorag.embedding_models -> autorag.embedding.base.embedding_models

* simplify and delete _check_one_item function and use assertion

* return LazyInit at load_from_str and use initialized embedding model at vectordb class

* add docs for newer embedding model configuration

---------

Co-authored-by: Um Changyong
Co-authored-by: Jeffrey (Dongkyu) Kim
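
The commit list above mentions dynamic embedding initialization and a loader that returns a `LazyInit`. The real signatures are not shown in this thread, so the following is only a rough sketch of how a config-driven loader could look. Everything in it is an assumption made for illustration: the `load_embedding_spec` function, the `"type"` key, the `embedding_types` table, and the fake embedding classes are hypothetical and are not taken from AutoRAG.

```python
from typing import Any, Callable, Dict, Union


class LazyInit:
    """Minimal stand-in: stores a factory and its kwargs, builds on call."""

    def __init__(self, factory: Callable[..., Any], **kwargs: Any) -> None:
        self._factory = factory
        self._kwargs = kwargs

    def __call__(self) -> Any:
        return self._factory(**self._kwargs)


class FakeOpenAIEmbedding:
    """Hypothetical placeholder for an API-based embedding class."""

    def __init__(self, model_name: str = "text-embedding-ada-002") -> None:
        self.model_name = model_name


class FakeHuggingFaceEmbedding:
    """Hypothetical placeholder for a local embedding class."""

    def __init__(self, model_name: str, max_length: int = 512) -> None:
        self.model_name = model_name
        self.max_length = max_length


# Hypothetical table mapping a config "type" to an embedding class.
embedding_types: Dict[str, Callable[..., Any]] = {
    "openai": FakeOpenAIEmbedding,
    "huggingface": FakeHuggingFaceEmbedding,
}

# Predefined names stay available as plain string keys.
embedding_models: Dict[str, LazyInit] = {
    "openai": LazyInit(FakeOpenAIEmbedding),
}


def load_embedding_spec(spec: Union[str, Dict[str, Any]]) -> LazyInit:
    """Resolve a predefined name or a config dict into a LazyInit."""
    if isinstance(spec, str):
        if spec not in embedding_models:
            raise ValueError(f"unknown embedding model name: {spec!r}")
        return embedding_models[spec]

    options = dict(spec)  # copy so the caller's config is not mutated
    type_name = options.pop("type", None)
    if type_name not in embedding_types:
        raise ValueError(f"unknown embedding type: {type_name!r}")
    # Remaining keys are passed through as constructor keyword arguments.
    return LazyInit(embedding_types[type_name], **options)


# Usage: a predefined name and an ad hoc config resolve the same way.
predefined = load_embedding_spec("openai")()
dynamic = load_embedding_spec({"type": "huggingface", "model_name": "BAAI/bge-m3"})()
print(predefined.model_name, dynamic.model_name)
```

Returning a `LazyInit` instead of an instance keeps model construction deferred until the caller needs it, which matters for local models that are expensive to load.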