[Feature Request]: add rerank support to llama.cpp rerank #2905

Closed
1 task done
ziyu4huang opened this issue Oct 20, 2024 · 0 comments · Fixed by #2906
@ziyu4huang (Contributor)

Is there an existing issue for the same feature request?

  • I have checked the existing issues.

Is your feature request related to a problem?

llama.cpp recently added rerank support; I want to support it through an OpenAI-compatible provider.
https://github.com/ggerganov/llama.cpp/pull/9510

Describe the feature you'd like

Make the RAGflow application able to use a reranker served by llama.cpp.

Describe implementation you've considered

llama.cpp follows the Jina rerank API format, though there does not seem to be a standard for the score values.

There is no OpenAI rerank standard yet, but the Jina format is popular and is the one llama.cpp chose.

See llama.cpp PR 9510.

I will provide my implementation in a PR.
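
For illustration, here is a minimal sketch of a Jina-style rerank call against a llama.cpp server started with --reranking. The /rerank path and the host/port come from the server command and log later in this thread; the payload and response field names follow the Jina rerank API and are assumptions, not the final RAGflow code.

```
# Minimal sketch (not the PR code): Jina-style rerank request to a llama.cpp
# server started with --reranking. Host/port match the example command below;
# payload/response field names follow the Jina rerank API and are assumptions.
import requests

LLAMA_SERVER = "http://localhost:9901"

def rerank(query, documents, top_n=None):
    payload = {
        "model": "reranker",  # placeholder; a single-model server may ignore it
        "query": query,
        "documents": documents,
    }
    if top_n is not None:
        payload["top_n"] = top_n
    resp = requests.post(f"{LLAMA_SERVER}/rerank", json=payload, timeout=60)
    resp.raise_for_status()
    # Expected Jina-style shape: {"results": [{"index": 0, "relevance_score": ...}, ...]}
    return resp.json().get("results", [])

if __name__ == "__main__":
    for hit in rerank("what does llama.cpp do?",
                      ["llama.cpp runs GGUF models on CPU/GPU.", "Bananas are yellow."]):
        print(hit)
```

As noted above, the score values are not standardized, so RAGflow would have to treat relevance_score as model- and server-specific.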

Documentation, adoption, use case

No response

Additional information

I have tested this and it works successfully in RAGflow.

ziyu4huang added a commit to ziyu4huang/ragflow that referenced this issue Oct 20, 2024
Halfknow pushed a commit to Halfknow/ragflow that referenced this issue Nov 11, 2024
…pp rerank support (infiniflow#2906)

### What problem does this PR solve?
Resolves infiniflow#2905

Due to inconsistent token sizes, I limit the input to 500 tokens in the code to be safe, since there is no config parameter to control this.
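
Something along these lines is what the 500-token cap could look like; this is only a hypothetical sketch (the helper name and the crude whitespace tokenization are mine, not the PR's actual code):

```
# Hypothetical sketch of the 500-token safety limit mentioned above; the PR
# may count and cut tokens differently.
MAX_RERANK_TOKENS = 500

def truncate_for_rerank(text, limit=MAX_RERANK_TOKENS):
    # Crude whitespace "tokens"; real tokenizer counts will differ.
    tokens = text.split()
    return text if len(tokens) <= limit else " ".join(tokens[:limit])
```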

My llama.cpp server is run with -ub set to 1024:

```
${llama_path}/bin/llama-server --host 0.0.0.0 --port 9901 -ub 1024 -ngl 99 -m $gguf_file --reranking "$@"
```





### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Here is my test of RAGflow using llama.cpp:

```
slot update_slots: id  0 | task 458 | prompt done, n_past = 416, n_tokens = 416
slot      release: id  0 | task 458 | stop processing: n_past = 416, truncated = 0
slot launch_slot_: id  0 | task 459 | processing task
slot update_slots: id  0 | task 459 | tokenizing prompt, len = 2
slot update_slots: id  0 | task 459 | prompt tokenized, n_ctx_slot = 8192, n_keep = 0, n_prompt_tokens = 111
slot update_slots: id  0 | task 459 | kv cache rm [0, end)
slot update_slots: id  0 | task 459 | prompt processing progress, n_past = 111, n_tokens = 111, progress = 1.000000
slot update_slots: id  0 | task 459 | prompt done, n_past = 111, n_tokens = 111
slot      release: id  0 | task 459 | stop processing: n_past = 111, truncated = 0
srv  update_slots: all slots are idle
request: POST /rerank 172.23.0.4 200

```