Segmentation fault (core dumped) - 0.2.58 #1319
Comments
We have seen this issue in the
Thanks!
Here is a rough bug trace. It seems that any context longer than a certain boundary causes the crash.
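To locate that boundary precisely, one option is to bisect the prompt length. Below is a minimal sketch, where `crashes(n)` is a hypothetical stand-in for running the model on an n-token prompt in a subprocess and checking whether it segfaulted:

```python
def find_crash_boundary(crashes, lo=1, hi=4096):
    """Binary-search the smallest context length n in [lo, hi]
    for which crashes(n) is True, assuming crashes is monotonic
    (once a length crashes, every longer prompt crashes too)."""
    while lo < hi:
        mid = (lo + hi) // 2
        if crashes(mid):
            hi = mid        # crash at mid: boundary is mid or below
        else:
            lo = mid + 1    # no crash at mid: boundary is above mid
    return lo

# Example with a stubbed predicate: pretend prompts of >= 513 tokens crash.
print(find_crash_boundary(lambda n: n >= 513))  # prints 513
```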
Probably relevant to ggml-org/llama.cpp#6017 (comment)
I'll work on a fix once I get a chance to repro; in the meantime, let me share the recipe for debugging this (on Linux at least):

# install the package in debug mode to retain debug symbols
# (you may need additional cmake flags for specific backends)
python3 -m pip install \
    --verbose \
    --config-settings cmake.args='-DCMAKE_BUILD_TYPE=Debug;-DCMAKE_CXX_FLAGS=-g3;-DCMAKE_C_FLAGS=-g3' \
    --config-settings cmake.verbose=true \
    --config-settings logging.level=INFO \
    --config-settings install.strip=false \
    --editable .

# run the test script under gdb
gdb --args python3 test_script.py
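For unattended reproduction runs, the same recipe can be driven non-interactively; a sketch using gdb's batch mode (same `test_script.py` as above):

```shell
# Run the script under gdb with no interactive prompt and dump a
# full backtrace automatically when the process crashes.
gdb -batch \
    -ex run \
    -ex 'bt full' \
    --args python3 test_script.py
```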
@anakin87 ah okay I think I see what it is, can you set
Long-term I'll set up some tests that use Qwen1.5 0.5B or some other small model to smoke test for issues like this.
@abetlen setting
@anakin87 should be fixed now in
@abetlen I still seem to be seeing this (or a related error) when I try upgrading. On Windows (and Python 3.12), our tests which try to use a Llama model are failing. Sample output:
A similar error when fetching the logits appears in our macOS builds (and I'm just waiting to see if the Ubuntu build also fails). This continues to work with v0.2.57.
I think I have the same issue (see #1326) and setting
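Several comments above reference a constructor setting as the trigger/workaround; judging by the changelog entry "fix: segfault when logits_all=False. Closes abetlen#1319", that setting appears to be `logits_all`. A guarded sketch of the workaround — the `./model.gguf` path is a placeholder, and having llama-cpp-python installed is an assumption:

```python
import os

# Workaround sketch, not a definitive fix: pass logits_all=True when
# constructing the model. "./model.gguf" is a placeholder path; the
# guards keep this runnable even without the library or a model file.
try:
    from llama_cpp import Llama
except ImportError:
    Llama = None  # llama-cpp-python not installed; sketch only

if Llama is not None and os.path.exists("./model.gguf"):
    llm = Llama(model_path="./model.gguf", logits_all=True)
    out = llm.create_completion("Hello", max_tokens=8)
    print(out["choices"][0]["text"])
```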
* feat: add support for KV cache quantization options (abetlen#1307)
    * add KV cache quantization options abetlen#1220 abetlen#1305
    * Add ggml_type
    * Use ggml_type instead of string for quantization
    * Add server support
    * Co-authored-by: Andrei Betlen <[email protected]>
* fix: Changed local API doc references to hosted (abetlen#1317)
* chore: Bump version
* fix: last tokens passing to sample_repetition_penalties function (abetlen#1295)
    * Co-authored-by: ymikhaylov <[email protected]>
    * Co-authored-by: Andrei <[email protected]>
* feat: Update llama.cpp
* fix: segfault when logits_all=False. Closes abetlen#1319
* feat: Binary wheels for CPU, CUDA (12.1 - 12.3), Metal (abetlen#1247)
    * Generate binary wheel index on release
    * Add total release downloads badge
    * Update download label
    * Use official cibuildwheel action
    * Add workflows to build CUDA and Metal wheels
    * Update generate index workflow
    * Update workflow name
* feat: Update llama.cpp
* chore: Bump version
* fix(ci): use correct script name
* docs: LLAMA_CUBLAS -> LLAMA_CUDA
* docs: Add docs explaining how to install pre-built wheels.
* docs: Rename cuBLAS section to CUDA
* fix(docs): incorrect tool_choice example (abetlen#1330)
* feat: Update llama.cpp
* fix: missing logprobs in response, incorrect response type for functionary, minor type issues. Closes abetlen#1328 Closes abetlen#1314
* feat: Update llama.cpp
* fix: Always embed metal library. Closes abetlen#1332
* feat: Update llama.cpp
* chore: Bump version
* Co-authored-by: Limour <[email protected]>
* Co-authored-by: Andrei Betlen <[email protected]>
* Co-authored-by: lawfordp2017 <[email protected]>
* Co-authored-by: Yuri Mikhailov <[email protected]>
* Co-authored-by: ymikhaylov <[email protected]>
* Co-authored-by: Sigbjørn Skjæret <[email protected]>
Prerequisites
Please answer the following questions for yourself before submitting an issue.
Expected Behavior
I'm trying to create a completion using a GGUF model
Current Behavior
Segmentation fault (core dumped) on 0.2.58
(works well on 0.2.57)
Environment and Context
Failure Information (for bugs)
Segmentation fault (core dumped)