[Feature Request]: add rerank support to llama.cpp rerank #2905

Closed
1 task done
ziyu4huang opened this issue Oct 20, 2024 · 0 comments · Fixed by #2906
@ziyu4huang (Contributor)

Is there an existing issue for the same feature request?

  • I have checked the existing issues.

Is your feature request related to a problem?

llama.cpp recently added rerank support; I want to support it through an OpenAI-compatible provider.
https://github.com/ggerganov/llama.cpp/pull/9510

Describe the feature you'd like

Make the RAGflow application able to use a reranker served by llama.cpp.

Describe implementation you've considered

llama.cpp follows the Jina rerank API format, though there does not seem to be a standard for the score values.

There is no OpenAI rerank standard yet, but the Jina format is popular and is the one llama.cpp chose.

See llama.cpp PR 9510.

I will provide my implementation in a PR.
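
For illustration, here is a minimal sketch of a Jina-style rerank call against a llama.cpp server started with --reranking. The /rerank path and the host/port come from the server command and log later in this thread; the payload and response field names follow the Jina rerank API and are assumptions, not the final RAGflow code.

```
# Minimal sketch (not the PR code): Jina-style rerank request to a llama.cpp
# server started with --reranking. Host/port match the example command below;
# payload/response field names follow the Jina rerank API and are assumptions.
import requests

LLAMA_SERVER = "http://localhost:9901"

def rerank(query, documents, top_n=None):
    payload = {
        "model": "reranker",  # placeholder; a single-model server may ignore it
        "query": query,
        "documents": documents,
    }
    if top_n is not None:
        payload["top_n"] = top_n
    resp = requests.post(f"{LLAMA_SERVER}/rerank", json=payload, timeout=60)
    resp.raise_for_status()
    # Expected Jina-style shape: {"results": [{"index": 0, "relevance_score": ...}, ...]}
    return resp.json().get("results", [])

if __name__ == "__main__":
    for hit in rerank("what does llama.cpp do?",
                      ["llama.cpp runs GGUF models on CPU/GPU.", "Bananas are yellow."]):
        print(hit)
```

As noted above, the score values are not standardized, so RAGflow would have to treat relevance_score as model- and server-specific.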

Documentation, adoption, use case

No response

Additional information

I have tested this and it works successfully in RAGflow.

ziyu4huang added a commit to ziyu4huang/ragflow that referenced this issue Oct 20, 2024
Halfknow pushed a commit to Halfknow/ragflow that referenced this issue Nov 11, 2024
…pp rerank support (infiniflow#2906)

### What problem does this PR solve?
Resolves infiniflow#2905

Due to inconsistent token sizes, I limit the input to 500 tokens in the code to be safe, since there is no config parameter to control this.
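
Something along these lines is what the 500-token cap could look like; this is only a hypothetical sketch (the helper name and the crude whitespace tokenization are mine, not the PR's actual code):

```
# Hypothetical sketch of the 500-token safety limit mentioned above; the PR
# may count and cut tokens differently.
MAX_RERANK_TOKENS = 500

def truncate_for_rerank(text, limit=MAX_RERANK_TOKENS):
    # Crude whitespace "tokens"; real tokenizer counts will differ.
    tokens = text.split()
    return text if len(tokens) <= limit else " ".join(tokens[:limit])
```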

My llama.cpp server is run with -ub set to 1024:

```
${llama_path}/bin/llama-server --host 0.0.0.0 --port 9901 -ub 1024 -ngl 99 -m $gguf_file --reranking "$@"
```





### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Here is my test of RAGflow using llama.cpp:

```
slot update_slots: id  0 | task 458 | prompt done, n_past = 416, n_tokens = 416
slot      release: id  0 | task 458 | stop processing: n_past = 416, truncated = 0
slot launch_slot_: id  0 | task 459 | processing task
slot update_slots: id  0 | task 459 | tokenizing prompt, len = 2
slot update_slots: id  0 | task 459 | prompt tokenized, n_ctx_slot = 8192, n_keep = 0, n_prompt_tokens = 111
slot update_slots: id  0 | task 459 | kv cache rm [0, end)
slot update_slots: id  0 | task 459 | prompt processing progress, n_past = 111, n_tokens = 111, progress = 1.000000
slot update_slots: id  0 | task 459 | prompt done, n_past = 111, n_tokens = 111
slot      release: id  0 | task 459 | stop processing: n_past = 111, truncated = 0
srv  update_slots: all slots are idle
request: POST /rerank 172.23.0.4 200

```