longchat-13b-16k chat not work #14
Comments
@ahkimkoo it has not been trained in Chinese data, please use only English for now. |
Thank you for your reply, but even when I use English, it can't reply normally. |
Can you share a screenshot of how you are loading the model, and what inputs you give it? |
Because it's not patched. |
@DachengLi1 I would like to follow up on this. I'm having the same issue, running the same model with the fastchat openai-server implementation. I get the same outputs (sometimes a screaming "A A A A A A A A A A A A") while running the latest version with the monkey patch applied. Here are the requests I send to the endpoint and the corresponding output:
curl http://localhost:8100/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer fattiicazzituoi" \
-d '{
"model": "longchat-13b-16k",
"messages": [{"role": "user", "content": "Say this is a test."}],
"temperature": 0.3, "max_tokens": 200
}'
But if I use the "completions" (non-chat) endpoint, the model works "correctly" (or at least it does not scream at me):
curl http://localhost:8100/v1/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer aaaaaaaaaaaa" \
-d '{
"model": "lmsys/longchat-13b-16k",
"prompt": "Say this is a test.",
"max_tokens": 20,
"temperature": 0.5
}'
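For anyone trying to reproduce the comparison, the two curl requests above can be sketched in Python so that the same prompt goes to both endpoints. This is only a minimal sketch under stated assumptions: the base URL, model names, and sampling parameters are copied from the curl commands, the response fields (`choices[0].message.content` and `choices[0].text`) assume the server follows the usual OpenAI-compatible response shape, and `looks_degenerate` is a hypothetical helper, not part of FastChat, that flags outputs dominated by one repeated token such as "A A A A".

```python
import json
import urllib.request
from collections import Counter

# Base URL taken from the curl commands in this thread; adjust for your server.
BASE_URL = "http://localhost:8100/v1"


def build_chat_payload(prompt, model="longchat-13b-16k",
                       temperature=0.3, max_tokens=200):
    """Request body for /chat/completions, mirroring the first curl command."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def build_completion_payload(prompt, model="lmsys/longchat-13b-16k",
                             temperature=0.5, max_tokens=20):
    """Request body for /completions, mirroring the second curl command."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def looks_degenerate(text, threshold=0.8, min_tokens=5):
    """Hypothetical heuristic: True if a single whitespace-separated token
    makes up at least `threshold` of the output (e.g. "A A A A A")."""
    tokens = text.split()
    if len(tokens) < min_tokens:
        return False
    top_count = Counter(tokens).most_common(1)[0][1]
    return top_count / len(tokens) >= threshold


def post_json(url, payload, api_key="EMPTY"):
    """POST a JSON body and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.load(response)


def compare_endpoints(prompt, api_key="EMPTY"):
    """Send the same prompt to both endpoints and flag repeated-token output.
    Assumes OpenAI-style response fields; requires a running server."""
    chat = post_json(f"{BASE_URL}/chat/completions",
                     build_chat_payload(prompt), api_key)
    completion = post_json(f"{BASE_URL}/completions",
                           build_completion_payload(prompt), api_key)
    chat_text = chat["choices"][0]["message"]["content"]
    completion_text = completion["choices"][0]["text"]
    return {
        "chat": (chat_text, looks_degenerate(chat_text)),
        "completion": (completion_text, looks_degenerate(completion_text)),
    }
```

With the server running, `compare_endpoints("Say this is a test.")` would return both outputs side by side. Note that the two curl commands use different model names and sampling settings, so passing the same `model`, `temperature`, and `max_tokens` to both builders makes the comparison fairer.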
|
@scuty2000 Fun image lol. |
@DachengLi1 I don't know if this helps, but I suspect it is related to the int8 quantization. The non-quantized 7B version works pretty well. |
@scuty2000 Yes, I also heard it elsewhere. |
Any update on this? |
reply like this: