The TextEncoder converts text inputs to embeddings. To work with the LangChain agent and vector store, this module must be a LangChain `Embeddings` implementation. By default, it uses a modified version of LangChain's `HuggingFaceEmbeddings`. The modified HuggingFace encoder can return normalized embeddings when the `norm` parameter is set.

To configure the text encoder, refer to [Configuration](https://github.com/zilliztech/akcio/wiki/Configuration#embedding).

## Usage Example

```python
from text_encoder import TextEncoder

encoder = TextEncoder()

# Generate embeddings for a list of documents
doc_embeddings = encoder.embed_documents(['test'])

# Generate an embedding for a single text input
query_embedding = encoder.embed_query('test')
```

## Customization

To change the embedding method used in operations, modify [`__init__.py`](https://github.com/zilliztech/akcio/blob/main/src_langchain/embedding/__init__.py) to import a different TextEncoder. For example, changing `.langchain_huggingface` to `.openai_embedding` in the init file switches to OpenAI embeddings. Remember to also update [config.py](https://github.com/zilliztech/akcio/blob/main/src_langchain/embedding/config.py) to match the new embedding method.

The text encoder should inherit from LangChain `Embeddings`. You can override the methods `embed_documents` and `embed_query` to customize the module:

```python
from typing import List

from langchain.embeddings.base import Embeddings


def test_encoder(text: str) -> List[float]:
    # Placeholder encoder: returns a fixed 2-dimensional vector for any input.
    return [0., 0.]


class TextEncoder(Embeddings):
    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Embed search docs."""
        outputs = []
        for t in texts:
            embed = test_encoder(t)
            outputs.append(embed)
        return outputs

    def embed_query(self, text: str) -> List[float]:
        """Embed query text."""
        return test_encoder(text)
```
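
The `norm` parameter described at the top of this page returns normalized embeddings. As a minimal sketch of what that usually means, assuming L2 (unit-length) normalization, the helper below rescales a vector so that its Euclidean norm is 1. The names `l2_normalize`, `NormalizedToyEncoder`, and `_raw_embed` are hypothetical and not part of the repository. The class only mirrors the `embed_documents` / `embed_query` interface and does not import LangChain, so it runs standalone:

```python
import math
from typing import List


def l2_normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit Euclidean length; a zero vector is returned unchanged."""
    length = math.sqrt(sum(x * x for x in vector))
    if length == 0.0:
        return list(vector)
    return [x / length for x in vector]


class NormalizedToyEncoder:
    """Hypothetical stand-in with the same two methods as a LangChain Embeddings class."""

    def __init__(self, norm: bool = True) -> None:
        # Mirrors the idea of a `norm` switch; the real encoder's signature may differ.
        self.norm = norm

    def _raw_embed(self, text: str) -> List[float]:
        # Toy embedding (illustration only): character count and word count.
        return [float(len(text)), float(len(text.split()))]

    def embed_query(self, text: str) -> List[float]:
        vector = self._raw_embed(text)
        return l2_normalize(vector) if self.norm else vector

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self.embed_query(t) for t in texts]
```

With unit-length vectors, the inner product of two embeddings equals their cosine similarity, which is why normalization is often enabled before storing embeddings in a vector store.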