…pp rerank support (infiniflow#2906)
### What problem does this PR solve?
Resolve infiniflow#2905
Because token counts are inconsistent, I cap the input at a safe limit
of 500 in code, since there is no config parameter to control it.
My llama.cpp server runs with -ub set to 1024:
${llama_path}/bin/llama-server --host 0.0.0.0 --port 9901 -ub 1024 -ngl 99 -m $gguf_file --reranking "$@"
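As a rough illustration of the 500-token cap described above, here is a minimal sketch. It is not the code from this PR: `truncate_tokens` and `prepare_documents` are hypothetical names, and whitespace splitting is only a stand-in for the real tokenizer, so the counts will differ from what llama.cpp reports.

```python
def truncate_tokens(text: str, max_tokens: int = 500) -> str:
    """Keep at most max_tokens whitespace-separated tokens.

    Assumption: splitting on whitespace approximates tokenization.
    A real model tokenizer will usually produce more tokens per word.
    """
    tokens = text.split()
    if len(tokens) <= max_tokens:
        return text
    return " ".join(tokens[:max_tokens])


def prepare_documents(texts, max_tokens: int = 500):
    """Truncate every candidate passage before it is sent to the reranker."""
    return [truncate_tokens(t, max_tokens) for t in texts]
```

Keeping the cap well below the server's `-ub 1024` leaves headroom for the query and for the mismatch between the two token counts.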
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Here is the llama.cpp server log from my RAGFlow test:
```
slot update_slots: id 0 | task 458 | prompt done, n_past = 416, n_tokens = 416
slot release: id 0 | task 458 | stop processing: n_past = 416, truncated = 0
slot launch_slot_: id 0 | task 459 | processing task
slot update_slots: id 0 | task 459 | tokenizing prompt, len = 2
slot update_slots: id 0 | task 459 | prompt tokenized, n_ctx_slot = 8192, n_keep = 0, n_prompt_tokens = 111
slot update_slots: id 0 | task 459 | kv cache rm [0, end)
slot update_slots: id 0 | task 459 | prompt processing progress, n_past = 111, n_tokens = 111, progress = 1.000000
slot update_slots: id 0 | task 459 | prompt done, n_past = 111, n_tokens = 111
slot release: id 0 | task 459 | stop processing: n_past = 111, truncated = 0
srv update_slots: all slots are idle
request: POST /rerank 172.23.0.4 200
```
Is there an existing issue for the same feature request?
Is your feature request related to a problem?
llama.cpp recently added rerank support (https://github.com/ggerganov/llama.cpp/pull/9510). I want to support it through the OpenAI-compatible provider.
Describe the feature you'd like
Make the RAGFlow application use the reranker from llama.cpp.
Describe implementation you've considered
llama.cpp supports the Jina API standard, though there seems to be no standard for the scores.
There is no OpenAI rerank standard yet, but Jina's is popular and was chosen by llama.cpp;
see llama.cpp pull request 9510.
I will provide my implementation in a PR.
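To make the Jina-style exchange concrete, here is a minimal sketch of a client for the `/rerank` endpoint. It is not the implementation from the PR. The request fields (`model`, `query`, `documents`, `top_n`) and response fields (`results`, `index`, `relevance_score`) are assumed from the Jina rerank API, and the function names and the `"reranker"` model string are hypothetical.

```python
import json
import urllib.request


def build_rerank_payload(query, documents, model="reranker"):
    """Build a Jina-style request body (field names are an assumption)."""
    docs = list(documents)
    return {"model": model, "query": query, "documents": docs, "top_n": len(docs)}


def parse_rerank_scores(response, n_documents):
    """Return one score per document, in the original document order.

    Results may arrive sorted by relevance, so each score is placed
    by its reported index rather than by its position in the list.
    """
    scores = [0.0] * n_documents
    for item in response.get("results", []):
        scores[item["index"]] = float(item["relevance_score"])
    return scores


def rerank(base_url, query, documents, timeout=30.0):
    """POST to <base_url>/rerank and return the per-document scores."""
    docs = list(documents)
    body = json.dumps(build_rerank_payload(query, docs)).encode("utf-8")
    req = urllib.request.Request(
        base_url.rstrip("/") + "/rerank",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_rerank_scores(json.load(resp), len(docs))
```

With a server started on port 9901 as above, a call might look like `rerank("http://localhost:9901", "what is a panda?", ["hi", "The giant panda is a bear."])`. Since the score scale is not standardized, the caller may still need to normalize the values before mixing them with other similarity scores.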
Documentation, adoption, use case
No response
Additional information
I have tested this, and it works successfully in RAGFlow.