Does recurrentgemma support quantization? #2450

daiwk · 2024-11-15T10:56:25Z

https://developer.nvidia.com/zh-cn/blog/nvidia-tensorrt-llm-revs-up-inference-for-google-gemma/

This post says Gemma supports quantization. Does RecurrentGemma support quantization as well?

byshiue · 2024-11-18T02:18:51Z

If its model architecture is the same as Gemma's, then it should be supported.

daiwk · 2024-11-30T03:01:59Z

> If its model architecture is the same as Gemma's, then it should be supported.

@byshiue Its model architecture differs from Gemma's: it uses RG-LRU, a Mamba-like RNN architecture.

hello-11 added the "question" (Further information is requested) and "triaged" (Issue has been triaged by maintainers) labels · Nov 18, 2024

hello-11 assigned byshiue Nov 18, 2024
