[Feature Request]: Support xinference rerank model #1455

Closed
1 task done
hwzhuhao opened this issue Jul 10, 2024 · 10 comments

@hwzhuhao
Contributor

Is there an existing issue for the same feature request?

  • I have checked the existing issues.

Is your feature request related to a problem?

No response

Describe the feature you'd like

Support the xinference rerank model.

Describe implementation you've considered

1. Create a class XInferenceRerank that inherits from Base and overrides the similarity method.
2. Add a tag to XInference during data initialization with the value TEXT RE-RANK.
3. Modify the frontend code in ollama-modal/index.tsx to add an option for rerank.
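
Step 1 could look roughly like the sketch below. It is only an illustration under stated assumptions: `Base` here is a stand-in rather than the project's real base class, the `post` parameter and the `post_json` helper are hypothetical names added so the class can be exercised without a running server, and the `/rerank` path and payload fields are assumed rather than taken from the project code.

```python
import json
from typing import Callable, Dict, List
from urllib import request


class Base:
    """Hypothetical stand-in for the project's rerank base class."""

    def similarity(self, query: str, texts: List[str]) -> List[float]:
        raise NotImplementedError("subclasses must implement similarity()")


def post_json(url: str, payload: Dict) -> Dict:
    """POST a JSON payload and decode the JSON reply (standard library only)."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "accept": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


class XInferenceRerank(Base):
    def __init__(self, model_name: str, base_url: str,
                 post: Callable[[str, Dict], Dict] = post_json):
        self.model_name = model_name
        # Assumption: base_url ends in /v1 and the rerank endpoint lives under it.
        self.url = base_url.rstrip("/") + "/rerank"
        self.post = post  # injectable transport, handy for offline tests

    def similarity(self, query: str, texts: List[str]) -> List[float]:
        payload = {"model": self.model_name, "query": query, "documents": texts}
        res = self.post(self.url, payload)
        # Results may arrive sorted by relevance, so place each score
        # back at the position of the document it belongs to.
        scores = [0.0] * len(texts)
        for item in res["results"]:
            scores[item["index"]] = item["relevance_score"]
        return scores
```

Passing a fake `post` function makes it possible to check the URL and payload without a live xinference instance; the default transport is only used when none is supplied.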

Documentation, adoption, use case

No response

Additional information

No response

KevinHuSh mentioned this issue Jul 10, 2024
27 tasks
KevinHuSh pushed a commit that referenced this issue Jul 11, 2024
### What problem does this PR solve?

support xinference rerank model
#1455 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
@jay-jjwu

What is supposed to be the URL for the xinference rerank model? I got an error when I used the base URL "http://:9997/v1". Any ideas?

@hwzhuhao
Contributor Author

root@pc-gpu-86-41:~# curl -X 'POST' 'http://127.0.0.1:9997/v1/rerank' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{
  "model": "bge-reranker-v2-m3",
  "query": "A man is eating pasta.",
  "return_documents": "true",
  "return_len": "true",
  "documents": [
    "A man is eating food.",
    "A man is eating a piece of bread.",
    "The girl is carrying a baby.",
    "A man is riding a horse.",
    "A woman is playing violin."
  ]
}'
{
  "id": "610a8724-3e96-11ef-81ce-08bfb886c012",
  "results": [
    {"index": 0, "relevance_score": 0.999574601650238, "document": {"text": "A man is eating food."}},
    {"index": 1, "relevance_score": 0.07814773917198181, "document": {"text": "A man is eating a piece of bread."}},
    {"index": 3, "relevance_score": 0.000017700713215162978, "document": {"text": "A man is riding a horse."}},
    {"index": 2, "relevance_score": 0.0000163753629749408, "document": {"text": "The girl is carrying a baby."}},
    {"index": 4, "relevance_score": 0.00001631895975151565, "document": {"text": "A woman is playing violin."}}
  ],
  "meta": {
    "api_version": null,
    "billed_units": null,
    "tokens": {"input_tokens": 38, "output_tokens": 38},
    "warnings": null
  }
}
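
For anyone reading the reply above from Python: the results come back ordered by relevance, and each `index` points at a position in the `documents` list that was sent. A small sketch of pairing the two, using the values from the reply above (the function name `ranked_documents` is just illustrative, not something from xinference or ragflow):

```python
from typing import Dict, List, Tuple

# The documents sent in the request above, in their original order.
documents = [
    "A man is eating food.",
    "A man is eating a piece of bread.",
    "The girl is carrying a baby.",
    "A man is riding a horse.",
    "A woman is playing violin.",
]

# The "results" part of the reply above, without the echoed document text.
reply = {
    "results": [
        {"index": 0, "relevance_score": 0.999574601650238},
        {"index": 1, "relevance_score": 0.07814773917198181},
        {"index": 3, "relevance_score": 0.000017700713215162978},
        {"index": 2, "relevance_score": 0.0000163753629749408},
        {"index": 4, "relevance_score": 0.00001631895975151565},
    ]
}


def ranked_documents(res: Dict, docs: List[str]) -> List[Tuple[str, float]]:
    """Pair each input document with its score, best match first."""
    pairs = [(docs[r["index"]], r["relevance_score"]) for r in res["results"]]
    # Sort explicitly instead of relying on the server's ordering.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


for text, score in ranked_documents(reply, documents):
    print(f"{score:.6f}  {text}")
```

Looking documents up by `index` rather than by list position matters here, because the third and fourth results (indices 3 and 2) are swapped relative to the input order.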

@jay-jjwu

Xinference works well (as your script above shows), and I also tested it via curl and Postman; all good. But when I try to add the model on the official RAGFlow web page, I get errors. See the screenshot below.
By the way, I mapped port 9997 to public port 4419 via my router and exposed it to the internet to test on the official RAGFlow site.
The URL works and was tested in Postman, but it always fails from the RAGFlow website.
Is the config below correct?
image

@hwzhuhao
Contributor Author

@jay-jjwu

Tried again; see the error below:
image

@imaben

imaben commented Jul 15, 2024

+1

@hwzhuhao
Contributor Author

I think I know where the problem is. It is caused by an error in the token calculation. It should be res["meta"]["tokens"]["input_tokens"] + res["meta"]["tokens"]["output_tokens"] instead of res["tokens"]["input_tokens"] + res["tokens"]["output_tokens"].
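
To make the difference concrete, here is a minimal sketch of the corrected lookup, run against the `meta` block from the curl reply earlier in this thread. The helper name `total_tokens` is only for illustration and is not the name used in the project code.

```python
from typing import Dict

# The "meta" part of the reply shown earlier in this thread.
reply = {
    "meta": {
        "api_version": None,
        "billed_units": None,
        "tokens": {"input_tokens": 38, "output_tokens": 38},
        "warnings": None,
    }
}


def total_tokens(res: Dict) -> int:
    """Sum input and output tokens; the counts are nested under "meta"."""
    tokens = res["meta"]["tokens"]
    return tokens["input_tokens"] + tokens["output_tokens"]


print(total_tokens(reply))  # 76

# The old lookup fails because there is no top-level "tokens" key.
try:
    reply["tokens"]["input_tokens"]
except KeyError as missing:
    print("KeyError:", missing)
```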

@jay-jjwu

Do you mean the code here?
I don't have a local deployment, so I can't test it for now.
Would you test it and open a pull request?
Many thanks!

image

@hwzhuhao
Contributor Author

#1527

@jay-jjwu

thanks!!

yingfeng mentioned this issue Aug 6, 2024
59 tasks