Bug: Errors caused by adding logprobs #1328
Comments
Additionally, I believe it's advisable to use GitHub Actions for automated testing before merging branches, to ensure the stability of the project.
@devcxl thanks for reporting. You're right, and this really could have been caught by static analysis through type checking as well. For what it's worth, the PR is actually more correct with respect to the current OpenAI API spec; I'll resolve the type errors in the chat format and fix this!
OK, thanks.
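For reference, the eventual fix made logprobs an explicit, always-present (possibly None) field in the response types. Below is a minimal sketch of the idea, assuming TypedDict-based response types similar to those in llama_cpp.llama_types; the type and function names here are simplified and illustrative, not the library's actual definitions.

```python
from typing import List, Optional, TypedDict


class ChatCompletionTokenLogprob(TypedDict):
    token: str
    logprob: float


class ChatCompletionLogprobs(TypedDict):
    content: Optional[List[ChatCompletionTokenLogprob]]


class ChatCompletionResponseChoice(TypedDict):
    index: int
    message: dict  # simplified; the real type is a chat message TypedDict
    # The fix: "logprobs" is a required key (its value may be None),
    # matching the current OpenAI chat completion spec.
    logprobs: Optional[ChatCompletionLogprobs]
    finish_reason: Optional[str]


def make_choice(message: dict, finish_reason: Optional[str]) -> ChatCompletionResponseChoice:
    # Every code path that builds a choice -- including the function-calling
    # chat handlers -- must populate "logprobs", even if only with None.
    return {
        "index": 0,
        "message": message,
        "logprobs": None,
        "finish_reason": finish_reason,
    }
```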
Fixed by commit: fix: missing logprobs in response, incorrect response type for functionary, minor type issues. Closes abetlen#1328, closes abetlen#1314

Merged along with the following changes:

* feat: add support for KV cache quantization options (abetlen#1307)
  * add KV cache quantization options abetlen#1220 abetlen#1305
  * Add ggml_type
  * Use ggml_type instead of string for quantization
  * Add server support
* fix: Changed local API doc references to hosted (abetlen#1317)
* chore: Bump version
* fix: last tokens passing to sample_repetition_penalties function (abetlen#1295)
* feat: Update llama.cpp
* fix: segfault when logits_all=False. Closes abetlen#1319
* feat: Binary wheels for CPU, CUDA (12.1 - 12.3), Metal (abetlen#1247)
  * Generate binary wheel index on release
  * Add total release downloads badge
  * Update download label
  * Use official cibuildwheel action
  * Add workflows to build CUDA and Metal wheels
  * Update generate index workflow
  * Update workflow name
* feat: Update llama.cpp
* chore: Bump version
* fix(ci): use correct script name
* docs: LLAMA_CUBLAS -> LLAMA_CUDA
* docs: Add docs explaining how to install pre-built wheels.
* docs: Rename cuBLAS section to CUDA
* fix(docs): incorrect tool_choice example (abetlen#1330)
* feat: Update llama.cpp
* fix: missing logprobs in response, incorrect response type for functionary, minor type issues. Closes abetlen#1328 abetlen#1314
* fix: missing logprobs in response, incorrect response type for functionary, minor type issues. Closes abetlen#1328 Closes abetlen#1314
* feat: Update llama.cpp
* fix: Always embed metal library. Closes abetlen#1332
* feat: Update llama.cpp
* chore: Bump version

Co-authored-by: Limour <[email protected]>, Andrei Betlen <[email protected]>, lawfordp2017 <[email protected]>, Yuri Mikhailov <[email protected]>, ymikhaylov <[email protected]>, Sigbjørn Skjæret <[email protected]>
Prerequisites
Please answer the following questions for yourself before submitting an issue.
Expected Behavior
I hope this bug can be fixed as soon as possible.
Current Behavior
In #1311, the logprobs field was not handled correctly.

Environment and Context
Failure Information (for bugs)
In #1311, the logprobs field was introduced. Because no logprobs value is returned when using function calls, function calls malfunctioned.
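A minimal reproduction sketch, assuming a local llama_cpp.server instance on port 8000 serving a function-calling chat format such as functionary; the model alias, tool schema, and port below are assumptions, not values from the original report.

```python
# Sketch only: the model alias, tool schema, and port are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-no-key-required")

response = client.chat.completions.create(
    model="functionary",
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
)

# Before the fix, the server's function-calling path built a response without
# the logprobs key, so this request errored out instead of returning a tool call.
print(response.choices[0].message.tool_calls)
```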
Steps to Reproduce
python3 -m llama_cpp.server --config_file config.json
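An illustrative config.json for the command above; the field names follow llama-cpp-python's server settings, but the model path, alias, and chat format are placeholders:

```json
{
  "host": "0.0.0.0",
  "port": 8000,
  "models": [
    {
      "model": "models/functionary-small-v2.2.q4_0.gguf",
      "model_alias": "functionary",
      "chat_format": "functionary-v2",
      "n_ctx": 4096
    }
  ]
}
```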