Reading GGUF metadata with gguf-dump.py does not work for i-quants #5809
countzero changed the title from "Reading GGUF metadata with .\gguf-py\scripts\gguf-dump.py does not work for i-quants" to "Reading GGUF metadata with gguf-dump.py does not work for i-quants" on Mar 1, 2024
Yes, this could be added as extra functionality to the
ggerganov added the labels enhancement (New feature or request) and good first issue (Good for newcomers), and removed the label bug-unconfirmed on Mar 1, 2024
I'm working on this, hope to have a PR ready Sunday evening EU time.
@Nindaleth & @ggerganov Thank you for the quick fix! It now works as expected for i-quants: python .\gguf-py\scripts\gguf-dump.py --no-tensors .\models\miqu-1-70b-sf.Q5_K_M.gguf
The gguf-dump.py script in the llama.cpp release b2297 is missing support for i-quants.
Steps to reproduce
IQ* format (e.g., miqu-1-70b-Requant-b2131-iMat-c32_ch400-IQ1_S_v3.gguf)
.\models\miqu-1-70b-sf.IQ1_S.gguf
Expected behaviour
I expect the Python gguf-py library to support all possible GGUF formats.
Working example for k-quants:
Use-Case
I am extracting the metadata from any given GGUF model to automatically calculate the optimal runtime arguments for the server in the following PowerShell script: https://github.com/countzero/windows_llama.cpp/blob/v1.12.0/examples/server.ps1#L104
Question
@ggerganov Is there another way to only dump the metadata from a given GGUF model? Perhaps this could be an --inspect option of the gguf binary?
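For the metadata-only use case, one possible workaround is to read the key/value section of the file header directly. This is a minimal sketch and not part of gguf-py or llama.cpp. It assumes the little-endian GGUF v2/v3 header layout (magic, version, tensor count, key/value count, then the key/value pairs). The name read_gguf_metadata and the helper functions are hypothetical. The parser stops before the tensor info section, so it never has to interpret a tensor's quantization type.

```python
import struct
from typing import Any, BinaryIO, Dict

# GGUF metadata value type ids mapped to little-endian struct formats.
# Assumption: ids follow the GGUF v2/v3 layout (8 = string, 9 = array).
_SCALARS = {
    0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
    6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d",
}
_STRING, _ARRAY = 8, 9


def _unpack(f: BinaryIO, fmt: str) -> Any:
    """Read exactly one value described by a struct format string."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _read_string(f: BinaryIO) -> str:
    """Read a length-prefixed (uint64) UTF-8 string."""
    length = _unpack(f, "<Q")
    data = f.read(length)
    if len(data) != length:
        raise ValueError("unexpected end of file")
    return data.decode("utf-8", errors="replace")


def _read_value(f: BinaryIO, vtype: int) -> Any:
    """Read one metadata value of the given type id."""
    if vtype in _SCALARS:
        return _unpack(f, _SCALARS[vtype])
    if vtype == _STRING:
        return _read_string(f)
    if vtype == _ARRAY:
        item_type = _unpack(f, "<I")
        count = _unpack(f, "<Q")
        return [_read_value(f, item_type) for _ in range(count)]
    raise ValueError(f"unknown metadata value type: {vtype}")


def read_gguf_metadata(f: BinaryIO) -> Dict[str, Any]:
    """Return the metadata key/value pairs; tensor infos are never read."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _unpack(f, "<I")
    if version < 2:
        raise ValueError(f"unsupported GGUF version: {version}")
    _unpack(f, "<Q")  # tensor count, not needed for metadata
    kv_count = _unpack(f, "<Q")
    metadata: Dict[str, Any] = {}
    for _ in range(kv_count):
        key = _read_string(f)
        vtype = _unpack(f, "<I")
        metadata[key] = _read_value(f, vtype)
    return metadata
```

To use it, open the model in binary mode and pass the file object, for example open(path, "rb"). Large array values such as tokenizer vocabularies are read in full, so a real script may want to skip or truncate them.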