
Bug: Errors caused by adding logprobs #1328

Closed
4 tasks done
devcxl opened this issue Apr 4, 2024 · 3 comments
Labels
bug Something isn't working

Comments


devcxl commented Apr 4, 2024

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

Function calls should return a valid chat completion response, as they did in v0.2.57. I hope this can be fixed as soon as possible.

Current Behavior

In #1311, the logprobs field was not handled correctly.

Environment and Context

  • Python 3.11.8
  • llama-cpp-python[all]==v0.2.59
  • GNU Make 4.4.1
  • g++ (GCC) 13.2.1 20230801

Failure Information (for bugs)

#1311 introduced the logprobs field, but no logprobs value is returned when using function calls, so function calls now fail response validation and the server returns an Internal Server Error.
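
For clarity, here is a minimal sketch of the mismatch. The Choice type below is a simplified stand-in for the server's response types, not the library's actual definition:

    # Simplified sketch: after #1311 the choice type carries a required,
    # nullable "logprobs" key, but the chatml-function-calling handler builds
    # choices without it, so response validation fails.
    from typing import Optional

    from pydantic import TypeAdapter, ValidationError
    from typing_extensions import TypedDict

    class Choice(TypedDict):
        index: int
        finish_reason: str
        logprobs: Optional[dict]  # required key; its value may still be None
        message: dict

    choice_from_handler = {  # shape produced for a tool_calls completion
        "index": 0,
        "finish_reason": "tool_calls",
        "message": {"role": "assistant", "content": None},
    }

    try:
        TypeAdapter(Choice).validate_python(choice_from_handler)
    except ValidationError as exc:
        print(exc)  # "Field required" at ('logprobs',), matching the error below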

Steps to Reproduce

  1. Install llama-cpp-python[all]==v0.2.57
  2. Edit config.json and start the server: python3 -m llama_cpp.server --config_file config.json
        {
            "model": "models/Qwen1.5/qwen1_5-4b-chat-q4_k_m.gguf",
            "model_alias": "qwen1_5-4b-chat-q4_k_m",
            "chat_format": "chatml-function-calling",
            "n_gpu_layers": -1,
            "offload_kqv": true,
            "n_threads": 12,
            "n_batch": 512,
            "n_ctx": 2048
        },
  3. Run the first function call example in this notebook (a hedged sketch of that request is shown after the traceback below).
  4. Execution succeeds.
  5. Install llama-cpp-python[all]==v0.2.59
  6. Run the first function call example in this notebook.
  7. Internal Server Error
Exception: 2 validation errors:
{'type': 'missing', 'loc': ('response', 'typed-dict', 'choices', 0, 'logprobs'), 'msg': 'Field required', 'input': {'finish_reason': 'tool_calls', 'index': 0, 'message': {'role': 'assistant', 'content': None, 'tool_calls': [{'id': 'call__0_get_current_weather_cmpl-58520529-a626-4a1e-8b4b-1fca9dd2d68a', 'type': 'function', 'function': {'name': 'get_current_weather', 'arguments': '{ "location": "San Francisco, Tokyo, Paris" , "unit": "fahrenheit"}'}}], 'function_call': {'name': 'get_current_weather:', 'arguments': '{ "location": "San Francisco, Tokyo, Paris" , "unit": "fahrenheit"}'}}}, 'url': 'https://errors.pydantic.dev/2.6/v/missing'}
{'type': 'string_type', 'loc': ('response', 'str'), 'msg': 'Input should be a valid string', 'input': {'id': 'chatcmpl-5ce5ae67-c028-427f-a8bb-fe3ff94eb934', 'object': 'chat.completion', 'created': 1712263433, 'model': 'qwen1_5-4b-chat-q4_k_m', 'choices': [{'finish_reason': 'tool_calls', 'index': 0, 'message': {'role': 'assistant', 'content': None, 'tool_calls': [{'id': 'call__0_get_current_weather_cmpl-58520529-a626-4a1e-8b4b-1fca9dd2d68a', 'type': 'function', 'function': {'name': 'get_current_weather', 'arguments': '{ "location": "San Francisco, Tokyo, Paris" , "unit": "fahrenheit"}'}}], 'function_call': {'name': 'get_current_weather:', 'arguments': '{ "location": "San Francisco, Tokyo, Paris" , "unit": "fahrenheit"}'}}}], 'usage': {'completion_tokens': 22, 'prompt_tokens': 31, 'total_tokens': 53}}, 'url': 'https://errors.pydantic.dev/2.6/v/string_type'}

Traceback (most recent call last):
  File "/home/devcxl/download/llama-server/.evm/lib/python3.11/site-packages/llama_cpp/server/errors.py", line 171, in custom_route_handler
    response = await original_route_handler(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/devcxl/download/llama-server/.evm/lib/python3.11/site-packages/fastapi/routing.py", line 296, in app
    content = await serialize_response(
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/devcxl/download/llama-server/.evm/lib/python3.11/site-packages/fastapi/routing.py", line 155, in serialize_response
    raise ResponseValidationError(
fastapi.exceptions.ResponseValidationError: 2 validation errors:
  {'type': 'missing', 'loc': ('response', 'typed-dict', 'choices', 0, 'logprobs'), 'msg': 'Field required', 'input': {'finish_reason': 'tool_calls', 'index': 0, 'message': {'role': 'assistant', 'content': None, 'tool_calls': [{'id': 'call__0_get_current_weather_cmpl-58520529-a626-4a1e-8b4b-1fca9dd2d68a', 'type': 'function', 'function': {'name': 'get_current_weather', 'arguments': '{ "location": "San Francisco, Tokyo, Paris" , "unit": "fahrenheit"}'}}], 'function_call': {'name': 'get_current_weather:', 'arguments': '{ "location": "San Francisco, Tokyo, Paris" , "unit": "fahrenheit"}'}}}, 'url': 'https://errors.pydantic.dev/2.6/v/missing'}
  {'type': 'string_type', 'loc': ('response', 'str'), 'msg': 'Input should be a valid string', 'input': {'id': 'chatcmpl-5ce5ae67-c028-427f-a8bb-fe3ff94eb934', 'object': 'chat.completion', 'created': 1712263433, 'model': 'qwen1_5-4b-chat-q4_k_m', 'choices': [{'finish_reason': 'tool_calls', 'index': 0, 'message': {'role': 'assistant', 'content': None, 'tool_calls': [{'id': 'call__0_get_current_weather_cmpl-58520529-a626-4a1e-8b4b-1fca9dd2d68a', 'type': 'function', 'function': {'name': 'get_current_weather', 'arguments': '{ "location": "San Francisco, Tokyo, Paris" , "unit": "fahrenheit"}'}}], 'function_call': {'name': 'get_current_weather:', 'arguments': '{ "location": "San Francisco, Tokyo, Paris" , "unit": "fahrenheit"}'}}}], 'usage': {'completion_tokens': 22, 'prompt_tokens': 31, 'total_tokens': 53}}, 'url': 'https://errors.pydantic.dev/2.6/v/string_type'}

INFO:     127.0.0.1:36362 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
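
For reference, here is a hedged sketch of the kind of request the notebook example sends; the prompt and tool schema are assumptions, not the notebook verbatim. On v0.2.57 it returns a tool call; on v0.2.59 it triggers the 500 above:

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-no-key-required")

    response = client.chat.completions.create(
        model="qwen1_5-4b-chat-q4_k_m",
        messages=[{"role": "user", "content": "What's the weather like in San Francisco, Tokyo and Paris?"}],
        tools=[{
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {"type": "string"},
                        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                    },
                    "required": ["location"],
                },
            },
        }],
        tool_choice="auto",
    )
    print(response.choices[0].message.tool_calls)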


devcxl commented Apr 4, 2024

Additionally, I think it would be advisable to run automated tests with GitHub Actions before merging branches, to ensure the stability of the project.
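
As a rough sketch, a regression test along these lines could run in CI against a server started with the config above; the endpoint URL and model alias here are assumptions based on this setup:

    # Hedged sketch of a CI regression test; assumes a server started with the
    # config above is listening on localhost:8000.
    import requests

    def test_function_call_response_includes_logprobs():
        payload = {
            "model": "qwen1_5-4b-chat-q4_k_m",
            "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
            "tools": [{
                "type": "function",
                "function": {
                    "name": "get_current_weather",
                    "parameters": {
                        "type": "object",
                        "properties": {"location": {"type": "string"}},
                        "required": ["location"],
                    },
                },
            }],
            "tool_choice": "auto",
        }
        resp = requests.post("http://localhost:8000/v1/chat/completions", json=payload)
        assert resp.status_code == 200, resp.text  # v0.2.59 currently returns 500 here
        choice = resp.json()["choices"][0]
        assert "logprobs" in choice  # required by the OpenAI response schema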

@abetlen abetlen added the bug Something isn't working label Apr 5, 2024

abetlen commented Apr 5, 2024

@devcxl thanks for reporting. You're right, and this really could've been caught by static analysis as well, through type checking. For what it's worth, the PR is actually more correct with respect to the current OpenAI API spec; I'll resolve the type errors in the chat format and fix this!
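
To illustrate the static-analysis point, a simplified sketch (the TypedDict below is a stand-in, not the library's actual type): once logprobs is a required key, mypy or pyright flags any choice literal that omits it, and the straightforward fix is to set it to None whenever logprobs were not requested.

    from typing import Optional

    from typing_extensions import TypedDict

    class Choice(TypedDict):
        index: int
        finish_reason: str
        logprobs: Optional[dict]  # required per the current OpenAI spec; value may be None

    def broken_choice() -> Choice:
        # mypy/pyright: Missing key "logprobs" for TypedDict "Choice"
        return {"index": 0, "finish_reason": "tool_calls"}

    def fixed_choice() -> Choice:
        # always include the key, defaulting to None when logprobs are not requested
        return {"index": 0, "finish_reason": "tool_calls", "logprobs": None}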


devcxl commented Apr 5, 2024

OK, thanks.

@abetlen abetlen closed this as completed in 49bc66b Apr 5, 2024
abetlen added a commit that referenced this issue Apr 5, 2024
xhedit pushed a commit to xhedit/llama-cpp-conv that referenced this issue Apr 6, 2024
xhedit added a commit to xhedit/llama-cpp-conv that referenced this issue Apr 6, 2024
* feat: add support for KV cache quantization options (abetlen#1307)

* add KV cache quantization options

abetlen#1220
abetlen#1305

* Add ggml_type

* Use ggml_type instead of string for quantization

* Add server support

---------

Co-authored-by: Andrei Betlen <[email protected]>

* fix: Changed local API doc references to hosted (abetlen#1317)

* chore: Bump version

* fix: last tokens passing to sample_repetition_penalties function (abetlen#1295)

Co-authored-by: ymikhaylov <[email protected]>
Co-authored-by: Andrei <[email protected]>

* feat: Update llama.cpp

* fix: segfault when logits_all=False. Closes abetlen#1319

* feat: Binary wheels for CPU, CUDA (12.1 - 12.3), Metal (abetlen#1247)

* Generate binary wheel index on release

* Add total release downloads badge

* Update download label

* Use official cibuildwheel action

* Add workflows to build CUDA and Metal wheels

* Update generate index workflow

* Update workflow name

* feat: Update llama.cpp

* chore: Bump version

* fix(ci): use correct script name

* docs: LLAMA_CUBLAS -> LLAMA_CUDA

* docs: Add docs explaining how to install pre-built wheels.

* docs: Rename cuBLAS section to CUDA

* fix(docs): incorrect tool_choice example (abetlen#1330)

* feat: Update llama.cpp

* fix: missing logprobs in response, incorrect response type for functionary, minor type issues. Closes abetlen#1328 abetlen#1314

* fix: missing logprobs in response, incorrect response type for functionary, minor type issues. Closes abetlen#1328 Closes abetlen#1314

* feat: Update llama.cpp

* fix: Always embed metal library. Closes abetlen#1332

* feat: Update llama.cpp

* chore: Bump version

---------

Co-authored-by: Limour <[email protected]>
Co-authored-by: Andrei Betlen <[email protected]>
Co-authored-by: lawfordp2017 <[email protected]>
Co-authored-by: Yuri Mikhailov <[email protected]>
Co-authored-by: ymikhaylov <[email protected]>
Co-authored-by: Sigbjørn Skjæret <[email protected]>