[Feature Request]: Support xinference rerank model #1455
Comments
### What problem does this PR solve?

support xinference rerank model #1455

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
What is supposed to be the URL for the xinference rerank model? I got an error when I used the base URL "http://:9997/v1". Any ideas?
root@pc-gpu-86-41:~# curl -X 'POST' 'http://127.0.0.1:9997/v1/rerank' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{
The base URL should be http://domain-name:4419/v1/rerank, not http://domain-name:4419/v1
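For anyone hitting the same error, here is a minimal sketch of how a client could accept either form of the URL. `rerank_endpoint` is a hypothetical helper written for this comment, not something from the project's code; it only appends `/rerank` when the path is missing.

```python
def rerank_endpoint(base_url: str) -> str:
    """Return a URL ending in /rerank, appending the path only if it is missing.

    Hypothetical helper for illustration; not part of the project.
    """
    url = base_url.rstrip("/")  # tolerate a trailing slash
    if not url.endswith("/rerank"):
        url += "/rerank"
    return url


if __name__ == "__main__":
    print(rerank_endpoint("http://domain-name:4419/v1"))
    print(rerank_endpoint("http://domain-name:4419/v1/rerank"))
```

Both calls resolve to the same `/v1/rerank` address, so either form of the setting would work.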
+1
I think I know where the problem is. It is caused by an error in the token calculation. It should be `res["meta"]["tokens"]["input_tokens"] + res["meta"]["tokens"]["output_tokens"]` instead of `res["tokens"]["input_tokens"] + res["tokens"]["output_tokens"]`.
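To make the difference concrete, here is a small sketch of the corrected lookup. The `total_tokens` function name and the sample numbers are made up for illustration; only the `meta` → `tokens` nesting comes from the comment above.

```python
def total_tokens(res: dict) -> int:
    """Sum input and output tokens from a rerank response.

    Assumes the counts are nested under res["meta"]["tokens"],
    as described in the comment above.
    """
    tokens = res["meta"]["tokens"]
    return tokens["input_tokens"] + tokens["output_tokens"]


# Illustrative response fragment; the numbers are invented.
sample = {"meta": {"tokens": {"input_tokens": 12, "output_tokens": 3}}}

if __name__ == "__main__":
    print(total_tokens(sample))
```

With this shape, the old lookup `res["tokens"]` raises `KeyError` because `tokens` is not a top-level key.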
Thanks!!
Is there an existing issue for the same feature request?
Is your feature request related to a problem?
No response
Describe the feature you'd like
support xinference rerank model
Describe implementation you've considered
1. Create a class `XInferenceRerank` that inherits from `Base` and overrides the `similarity` method.
2. Add a tag to XInference during data initialization with the value `TEXT RE-RANK`.
3. Modify the frontend code in `ollama-modal/index.tsx` to add an option for rerank.
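Step 1 could look roughly like the sketch below. This is not the merged implementation: the `Base` class here is a stand-in for the project's own base class, the constructor arguments and the injectable `post` callable are assumptions made so the example runs without a server, and the request and response field names (`model`, `query`, `documents`, `results`, `index`, `relevance_score`) are assumed from the rerank endpoint's JSON format and should be checked against the Xinference documentation.

```python
import json
from typing import Callable, List
from urllib import request


def _http_post(url: str, payload: dict) -> dict:
    """POST a JSON payload with the standard library and decode the JSON reply."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "accept": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


class Base:
    """Stand-in for the project's rerank base class (assumed interface)."""

    def similarity(self, query: str, texts: List[str]) -> List[float]:
        raise NotImplementedError


class XInferenceRerank(Base):
    def __init__(self, model_name: str, base_url: str,
                 post: Callable[[str, dict], dict] = _http_post):
        self.model_name = model_name
        self.base_url = base_url  # expected to already end in /v1/rerank
        self._post = post         # injectable so the class can be tested offline

    def similarity(self, query: str, texts: List[str]) -> List[float]:
        # Field names below are assumptions about the rerank endpoint.
        payload = {"model": self.model_name, "query": query, "documents": texts}
        res = self._post(self.base_url, payload)
        # Results may arrive sorted by score, so place each score at its
        # original document index.
        scores = [0.0] * len(texts)
        for item in res["results"]:
            scores[item["index"]] = item["relevance_score"]
        return scores
```

Token counting is left out of this sketch to keep it short; a real `similarity` would probably also report usage.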
Documentation, adoption, use case
No response
Additional information
No response