gguf : add support for I64 and F64 arrays #6062

certik · 2024-03-14T18:58:06Z

GGML currently does not support I64 or F64 arrays, and they are not often used in machine learning. However, should the need arise in the future, it would be nice to add them now so that the types sit next to the other types I8, I16, I32 in the enums; doing so also reserves their type numbers.

Furthermore, with this addition the GGUF format becomes usable for most computational applications of NumPy (it is compatible with the most common NumPy dtypes: i8, i16, i32, i64, f32, f64), providing a faster and more versatile alternative to the npz format and a simpler alternative to the hdf5 format.

The change in this PR seems small and should not significantly increase the maintenance burden. I tested this from Python using GGUFWriter/Reader and gguf-dump, as well as from C; everything seems to work.
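To illustrate the dtype coverage described above, here is a minimal standard-library sketch of a lookup from NumPy dtype names to tensor types. `TensorType`, `DTYPE_TABLE`, `tensor_type_for` and `element_size` are hypothetical names for illustration, not part of gguf-py, and the numeric enum values are my assumption of the upstream numbering; verify them against the enums in the repository before relying on them.

```python
import struct
from enum import IntEnum


class TensorType(IntEnum):
    """Hypothetical mirror of the relevant enum entries (values assumed)."""
    F32 = 0
    I8 = 24
    I16 = 25
    I32 = 26
    I64 = 27  # added by this PR
    F64 = 28  # added by this PR


# NumPy dtype name -> (tensor type, struct format code for one element)
DTYPE_TABLE = {
    "int8": (TensorType.I8, "b"),
    "int16": (TensorType.I16, "h"),
    "int32": (TensorType.I32, "i"),
    "int64": (TensorType.I64, "q"),
    "float32": (TensorType.F32, "f"),
    "float64": (TensorType.F64, "d"),
}


def tensor_type_for(dtype_name: str) -> TensorType:
    """Return the tensor type for a NumPy dtype name, or raise ValueError."""
    try:
        return DTYPE_TABLE[dtype_name][0]
    except KeyError:
        raise ValueError(f"unsupported dtype: {dtype_name}") from None


def element_size(dtype_name: str) -> int:
    """Size in bytes of one element; "<" selects standard, unpadded sizes."""
    return struct.calcsize("<" + DTYPE_TABLE[dtype_name][1])


if __name__ == "__main__":
    for name in DTYPE_TABLE:
        print(name, tensor_type_for(name).name, element_size(name))
```

With NumPy installed, `str(array.dtype)` yields the same names used as keys here, so a writer could reject unsupported arrays before touching the file.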

certik · 2024-03-15T14:00:31Z

@ggerganov thanks for the review and merging this.

Bring `GGMLQuantizationType` up to date; adds `I8`, `I16`, `I32`, `I64`, `F64`, `IQ1_M` and `BF16`. Added in:
* ggerganov/llama.cpp#6045
* ggerganov/llama.cpp#6062
* ggerganov/llama.cpp#6302
* ggerganov/llama.cpp#6412

certik mentioned this pull request Mar 14, 2024

Binary format choice certik/mlc#34

Open

Fix compiler warnings

52837f0

ggerganov approved these changes Mar 15, 2024

ggerganov merged commit 7ce2c77 into ggerganov:master Mar 15, 2024
57 of 61 checks passed

certik deleted the gguf_py_i64_f64 branch March 15, 2024 13:48

mofosyne added the "Tensor Encoding Scheme" (https://github.com/ggerganov/llama.cpp/wiki/Tensor-Encoding-Schemes) and "Review Complexity : High" (generally requires in-depth knowledge of LLMs or GPUs) labels May 25, 2024

CISC mentioned this pull request Jun 3, 2024

Update GGUF quantization types huggingface/huggingface.js#729

Merged
