longchat-13b-16k chat not working #14

Open · ahkimkoo opened this issue Jul 5, 2023 · 9 comments
ahkimkoo commented Jul 5, 2023

The model replies like this:

xxxxxxxxxx

DachengLi1 (Owner) commented

@ahkimkoo It has not been trained on Chinese data; please use only English for now.

ahkimkoo commented Jul 6, 2023

> @ahkimkoo it has not been trained in Chinese data, please use only English for now.

Thank you for your reply, but even when using English, it cannot reply normally.

DachengLi1 (Owner) commented

Can you give a screenshot of how you are loading the model and what inputs you give?

scuty2000 commented Jul 13, 2023

@DachengLi1 I would like to follow up on this. I'm having the same issue, running the same model through the FastChat openai-server implementation. I get the same outputs (sometimes the model "screams" "A A A A A A A A A A A A") while running the latest version with the monkey patch applied.

Here are the requests I send to the endpoint and the corresponding output:

curl http://localhost:8100/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer fattiicazzituoi" \
  -d '{
     "model": "longchat-13b-16k",
     "messages": [{"role": "user", "content": "Say this is a test."}],
     "temperature": 0.3, "max_tokens": 200
   }'

{"id":"chatcmpl-3tF6uZ7GXm54dmLwfGLQ3y","object":"chat.completion","created":1689243774,"model":"lmsys/longchat-13b-16k","choices":[{"index":0,"message":{"role":"assistant","content":"A A A A A A A A A A A A A A A A A A A A"},"finish_reason":"stop"}],"usage":{"prompt_tokens":45,"total_tokens":64,"completion_tokens":19}}

But if I use the "completions" (non-chat) endpoint, the model works "correctly" (or at least it does not scream at me):

curl http://localhost:8100/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer aaaaaaaaaaaa" \
  -d '{
    "model": "lmsys/longchat-13b-16k",
    "prompt": "Say this is a test.",
    "max_tokens": 20,
    "temperature": 0.5
  }'

{"id":"cmpl-sBiuu78WYegWDnU3WDmFmF","object":"text_completion","created":1689245357,"model":"lmsys/longchat-13b-16k","choices":[{"index":0,"text":"\nYou are a test.\n\n\n\n\n\n\n\n\n\n\n\n\n\n","logprobs":null,"finish_reason":"length"}],"usage":{"prompt_tokens":7,"total_tokens":26,"completion_tokens":19}}
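The two curl calls above differ only in endpoint and payload shape: `/v1/chat/completions` takes a `messages` list (which the server wraps in the model's conversation template), while `/v1/completions` takes a raw `prompt` string. A minimal stdlib sketch (not from the thread; the URL, model names, and parameters are taken from the curl commands above) that builds both request bodies, so the client-side difference is explicit:

```python
import json

# Base URL as used in the curl commands above (assumes a local FastChat
# openai-server on port 8100).
BASE_URL = "http://localhost:8100/v1"

def chat_request(prompt: str, model: str = "longchat-13b-16k"):
    """Return (url, json_body) for the chat endpoint: the prompt is sent
    as a user message, and the server applies the conversation template."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.3,
        "max_tokens": 200,
    }
    return f"{BASE_URL}/chat/completions", json.dumps(body)

def completion_request(prompt: str, model: str = "lmsys/longchat-13b-16k"):
    """Return (url, json_body) for the plain completions endpoint: the
    prompt string is passed through as-is, with no chat template."""
    body = {
        "model": model,
        "prompt": prompt,
        "max_tokens": 20,
        "temperature": 0.5,
    }
    return f"{BASE_URL}/completions", json.dumps(body)
```

Since the raw-prompt path behaves while the templated path degenerates, the fault likely lies in what happens after this point on the server (template application plus quantized inference), not in the request itself.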

TL;DR: LongChat-13B-16K goes like this: [attached meme image]

DachengLi1 (Owner) commented

@scuty2000 Fun image lol.
@merrymercy do you have an idea on this? Is there a difference in loading for completions versus chat completions (e.g. load_8bit, patching)?

scuty2000 commented Jul 13, 2023

@DachengLi1 I don't know if this helps, but I suspect it is related to the int8 quantization. Using the non-quantized 7B version works pretty well.
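If int8 quantization is the suspect, one way to isolate it is to serve the same model unquantized and re-run the chat request. A hedged sketch using the standard FastChat serving commands (assumptions: FastChat is installed and the GPU has room for the 13B model in fp16; adding `--load-8bit` back to the worker reproduces the quantized setup):

```shell
# Start the controller, a model worker WITHOUT --load-8bit, and the
# OpenAI-compatible API server on the port used in the curl commands above.
python3 -m fastchat.serve.controller &
python3 -m fastchat.serve.model_worker --model-path lmsys/longchat-13b-16k &
python3 -m fastchat.serve.openai_api_server --host localhost --port 8100
```

If the "A A A A" output disappears in this configuration but returns with `--load-8bit`, that would confirm the quantization hypothesis rather than a problem with the chat template.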

DachengLi1 (Owner) commented

@scuty2000 Yes, I have also heard this elsewhere.

scuty2000 commented

Any update on this?
