gguf : use Qn_K for k-quants instead of KQn #837
Merged
#822 (by @mofosyne) has introduced a naming convention for GGUF model files, but the way it names k-quants doesn't follow the established practice: all other places where k-quants are named use `Qn_K`, where `n` is the number of bits per weight excluding the scales.

`rg -i 'KQ\d'` doesn't return anything related to quants except for this recently-added section, while `rg -i 'Q\d_K'` returns a lot of things related to k-quants when run in the `ggml` and `llama.cpp` repos.

So this renames `KQ2` to `Q2_K`, for consistency. This should avoid unnecessary confusion.

(Note that the recently-added wiki page about "tensor encoding schemes" will need to be updated too, since it is the only other place I found to also use this `KQ<X>` naming scheme.)
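
For anyone who already has files or notes using the old spelling, the rename described above can be sketched as a small text rewrite. This is only an illustration and not part of this PR: the helper name `rename_kquant` and the example file name are hypothetical, and the pattern assumes the old form is exactly `KQ` followed by digits.

```python
import re

# Hypothetical helper, not part of the PR: rewrites the old KQ<n> spelling
# to the established Q<n>_K spelling wherever it appears as a separate token.
# Assumption: the old form is exactly "KQ" followed by one or more digits.
_OLD_KQUANT = re.compile(r"(?<![A-Za-z0-9])KQ(\d+)(?![A-Za-z0-9])")


def rename_kquant(text: str) -> str:
    """Return `text` with every KQ<n> token replaced by Q<n>_K."""
    return _OLD_KQUANT.sub(r"Q\1_K", text)


if __name__ == "__main__":
    # Illustrative file name only; it does not refer to a real model.
    print(rename_kquant("Example-7B-KQ2.gguf"))
```

Names that already follow the `Qn_K` convention, or that are not k-quants at all (such as `Q8_0`), are left unchanged by this pattern.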