LlamaCppEmbeddings does not work #3626
🤖 Hello @lanluu! 🖐️ I'm Dosu, a friendly bot here to help you out while we wait for a human maintainer. I specialize in solving bugs, answering questions, and even guiding you on your journey to becoming a contributor. Remember, I'm here to assist you. Let's work together to sort this out! 😊

The vector length appears to be randomized because the `encode` function tokenizes the input text, and the number of tokens depends on the text itself. In the `LlamaCppEmbeddings` class, each text is converted into token IDs using the model context's `encode` function. Here is the relevant code:

```ts
const encodings = await this.caller.call(
  () =>
    new Promise((resolve) => {
      resolve(this._context.encode(text));
    })
);

for (let i = 0; i < encodings.length; i += 1) {
  const token: number = +encodings[i];
  tokens.push(token);
}
```

In this code, `this._context.encode(text)` returns one token ID per token in the input, so the length of the resulting array varies with the input text. If you need more control over the length of the resulting vector, you might consider implementing a padding mechanism to ensure a consistent vector length, or truncating the vector to a fixed length. However, please note that both of these approaches may impact the quality of the embeddings.

For more details, you can refer to the source code of the `LlamaCppEmbeddings` class (langchain/src/embeddings/llama_cpp.ts).

I hope this helps! If you have any more questions, feel free to ask. Best regards, Dosu |
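As a rough illustration of the padding/truncation idea the bot mentions, here is a minimal sketch; the `toFixedLength` helper and the target size are hypothetical, not part of LangChain:

```ts
// Hypothetical helper, not LangChain API: force a token array to a fixed
// length by truncating long inputs and zero-padding short ones. Both
// operations can degrade embedding quality, as noted above.
function toFixedLength(tokens: number[], size: number): number[] {
  if (tokens.length >= size) {
    return tokens.slice(0, size); // truncate
  }
  return [...tokens, ...new Array(size - tokens.length).fill(0)]; // pad
}

console.log(toFixedLength([1824, 349, 418], 5)); // [1824, 349, 418, 0, 0]
```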
CC @nigel-daniels any ideas here? |
@lanluu do you have some more context on what you are trying to do and how the failure manifests? |
I too can confirm this issue. LlamaCppEmbeddings is generating random-length encodings, which in turn produces the error below when inserting into any DB:
|
@AnirudhaGohokar the length of the token array will be roughly proportional to the length of the input, so this will always result in a variable-length array. The error you shared looks like it comes from a breaking table constraint in ClickHouse. Can you share the code you are using? I don't see why the resulting embedding array would map to a database constraint. |
Hi, I also can confirm this issue. It happens with Faiss and HNSWLib. The code to reproduce is basically the example from the docs:

```ts
import { FaissStore } from "langchain/vectorstores/faiss";
import { LlamaCppEmbeddings } from "langchain/embeddings/llama_cpp";

const llamaPath = "models/mistral-7b-v0.1.Q5_K_M.gguf";

const embedding = new LlamaCppEmbeddings({
  modelPath: llamaPath,
  embedding: true,
});

const vectorStore = await FaissStore.fromTexts(
  ["Hello world", "Bye bye", "hello nice world"],
  [{ id: 2 }, { id: 1 }, { id: 3 }],
  embedding
);

const resultOne = await vectorStore.similaritySearch("hello world", 1);
```

which gives an error for Faiss, and a similar error when changing to HNSWLib.
|
Ok, got it, so the error is about the dimension size, not the specific array length. |
I think #3509 is related to this. When changing the order of the texts (moving "Bye bye" to the first position), HNSWLib returns a different error.

HNSWLib internally sets `this.args.numDimensions` to the length of the embedding of the first element passed:

```ts
async initIndex(vectors) {
  if (!this._index) {
    if (this.args.numDimensions === undefined) {
      this.args.numDimensions = vectors[0].length;
    }
    // ...
  }
}
```

Therefore this never fails for the first element, but in this case it fails for the second element, with an array length of 2. I think other vector stores work very similarly, for example Faiss:

```ts
const dv = vectors[0].length;
if (!this._index) {
  const { IndexFlatL2 } = await FaissStore.importFaiss();
  this._index = new IndexFlatL2(dv);
}
const d = this.index.getDimension();
if (dv !== d) {
  throw new Error(
    `Vectors must have the same length as the number of dimensions (${d})`
  );
}
```

Therefore I assume that the vectors/embeddings have to be of equal length for this to work? But that is just my wild guess. This article supports my theory that every vector/embedding has to have the same length (MongoDB article).

If that is indeed the case, the issue seems to be in node-llama-cpp, because in the following code example (using node-llama-cpp directly), val1, val2, and val3 are arrays of unequal length, whereas they should have been arrays of equal length, i.e. the dimension of the model:

```ts
import { LlamaModel, LlamaContext } from "node-llama-cpp";

const llamaPath = "models/mistral-7b-v0.1.Q5_K_M.gguf"; // english

const model = new LlamaModel({
  modelPath: llamaPath,
});
const context = new LlamaContext({ model });

const val1 = context.encode("hello nice world");
const val2 = context.encode("Hello world");
const val3 = context.encode("Bye bye");
```
|
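A short sketch of why those lengths differ, assuming (as the thread concludes) that `encode` is a tokenizer rather than an embedder; the logging loop is illustrative only:

```ts
import { LlamaModel, LlamaContext } from "node-llama-cpp";

// Same setup as in the snippet above (node-llama-cpp v2 API).
const model = new LlamaModel({ modelPath: "models/mistral-7b-v0.1.Q5_K_M.gguf" });
const context = new LlamaContext({ model });

// encode() yields one token ID per (sub)word piece, so the array length
// follows the token count of the input, not the model's embedding dimension.
for (const text of ["hello nice world", "Hello world", "Bye bye"]) {
  console.log(text, "->", context.encode(text).length, "token IDs");
}
```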
@awijshoff correct, I am just passing the results from `encode`. |
I haven't added support for embedding in node-llama-cpp yet. The current implementation for this in Langchain is wrong and won't work. I've opened an issue for the embedding feature on node-llama-cpp. |
@giladgd thanks for the update, I'll await the new release. |
@awijshoff Thanks for detailing this. The issue I created is exactly what you've described, and I'm very excited about the update coming down the pipe for it. :) |
this is not a bug. the context's `encode()` call tokenizes text; it does not produce fixed-size embeddings. a proper embedding model always returns vectors of the same dimension: in the case of the Xenova model above, you will always get a vector of dimension 384. then for each of your document chunks being stored in a vector db, you use the same embed call, so the vectors are comparable and the vector db can take it from there. |
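To illustrate that fixed-dimension behaviour, here is a sketch using a 384-dimension Transformers.js model; the specific model name is an assumption, since the thread only says "the Xenova model above":

```ts
import { pipeline } from "@xenova/transformers";

// Assumed model: Xenova/all-MiniLM-L6-v2, a 384-dimension sentence embedder.
const embed = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");

for (const text of ["Hello world", "Bye bye", "hello nice world"]) {
  const output = await embed(text, { pooling: "mean", normalize: true });
  console.log(text, "->", output.data.length); // always 384, regardless of input
}
```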
@giladgd Hi, we are so excited to be able to work with node-llama-cpp. May I ask when the new version with embedding support will be released? |
@lanluu I predict it'll take about a month to finish the major features that contain breaking changes for the v3 beta, so the stable release would come a bit after that. |
I've added support for embedding in the beta version of node-llama-cpp. Note that since this is a beta version, the API may change before the stable release. |
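For reference, a minimal sketch of what calling the v3 beta embedding support can look like; this follows the node-llama-cpp v3 API as I understand it, and beta method names may change before the stable release:

```ts
import { getLlama } from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({
  modelPath: "models/mistral-7b-v0.1.Q5_K_M.gguf",
});

// A dedicated embedding context, separate from chat/completion contexts.
const embeddingContext = await model.createEmbeddingContext();

const embedding = await embeddingContext.getEmbeddingFor("Hello world");
console.log(embedding.vector.length); // fixed: the model's embedding dimension
```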
@giladgd that's great, thanks. Looking forward to the 3.0.0 release :) |
I used node-llama-cpp (3.0.0-beta.3) with langchain (2.8.8), but still got `TypeError: this._context.encode is not a function`. |
I also used node-llama-cpp (3.0.0-beta.14) earlier, but found a new error in file:///Users/panjunlin/work/github/chat-mind/node_modules/node-llama-cpp/dist/evaluator/LlamaModel.js:22: `TypeError: Cannot destructure property '_llama' of 'undefined' as it is undefined.` |
Ran into this bug as well; the solution was to use |
Currently, we are employing llama.cpp via langchain as documented at https://js.langchain.com/docs/integrations/llms/llama_cpp (version: 0.0.204).

It looks like the embeddings via the LlamaCppEmbeddings class (located at langchain/src/embeddings/llama_cpp.ts) still do not work for us, as the resulting vector's length is quite randomized, for instance:

```ts
[
  1824,  349,   418,
   274, 1402, 28709,
   334, 8236, 28828,
 28804
]
```

It is worth noting that different gguf models, such as llama-2 or zephyr-β, produce the same behaviour.

Your tips would help us a lot here. Thank you in advance!