
Llama-bench: allow benchmarking lora impact #11410

Open
wants to merge 1 commit into master
Conversation

IMbackK (Collaborator) commented on Jan 25, 2025

Allows benchmarking a model with a lora loaded.

As I don't know exactly how loras are applied in llama.cpp (especially in the case of quantized weights, where it seems that simply merging them in RAM would not be trivial), I'm not sure whether this makes sense to do, so this is a draft.

slaren (Collaborator) commented on Jan 25, 2025

Yes, absolutely, having a way to measure performance with one or more loras applied would be very useful. Loras are not merged into the weights; they are applied during inference. See the function llm_build_lora_mm for an example of how this is done.
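For readers unfamiliar with that code path, here is a minimal conceptual sketch (not the actual llama.cpp implementation) of what applying a lora at inference time looks like, in the spirit of llm_build_lora_mm: the result of the base mat-mul is augmented with a scaled low-rank delta, so the (possibly quantized) base weights never have to be modified. The function and tensor names below are illustrative; ggml_mul_mat, ggml_add and ggml_scale are real ggml calls, assuming the current API where ggml_scale takes a float.

```cpp
// Sketch of y = W*x + scale * B*(A*x) built as a ggml graph.
static struct ggml_tensor * build_mm_with_lora(
        struct ggml_context * ctx,
        struct ggml_tensor  * w,       // base weight (possibly quantized)
        struct ggml_tensor  * cur,     // current activations
        struct ggml_tensor  * lora_a,  // adapter down-projection
        struct ggml_tensor  * lora_b,  // adapter up-projection
        float                 scale) { // adapter scaling factor
    // base projection with the unmodified weight
    struct ggml_tensor * res = ggml_mul_mat(ctx, w, cur);

    // low-rank delta B*(A*x), computed on the fly in float
    struct ggml_tensor * ax  = ggml_mul_mat(ctx, lora_a, cur);
    struct ggml_tensor * bax = ggml_mul_mat(ctx, lora_b, ax);

    // add the scaled delta to the base result
    return ggml_add(ctx, res, ggml_scale(ctx, bax, scale));
}
```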

IMbackK (Collaborator, Author) commented on Jan 25, 2025

Great, I will undraft this then.

This PR only allows one lora at a time, because the llama-bench permutation interface (i.e. you can specify --lora none --lora some_file and it will bench first with no lora and then with the lora, or --lora a,b as with any other bench parameter) makes it hard to come up with a sane way to do a 2D permutation over multiple loras.
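For illustration, assuming the flag behaves as described above, an invocation could look like this (model and lora file names are placeholders):

```
# benchmark the same model once without and once with a lora applied
./llama-bench -m model.gguf --lora none --lora my-adapter.gguf

# comma-separated form, as with other llama-bench parameters:
# benchmark with two different loras, one at a time
./llama-bench -m model.gguf --lora lora1.gguf,lora2.gguf
```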

IMbackK marked this pull request as ready for review on January 25, 2025 at 14:04
slaren (Collaborator) commented on Jan 25, 2025

Yes, I don't think it is very useful to have the loras as part of the test grid. Loras are strongly tied to a model; I don't think they can be separated. I would suggest adding some syntax to -m, such as -m model.gguf+lora.gguf+lora2.gguf. Then it would be possible to test in the same run, e.g., a model and the same model with a lora applied.
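A hypothetical helper (not part of this PR) sketching how that suggested syntax could be parsed: the first '+'-separated component is the model path, and any remaining components are lora adapters to apply on top of it. The struct and function names are illustrative.

```cpp
#include <string>
#include <vector>

struct model_spec {
    std::string              model; // path to the base model gguf
    std::vector<std::string> loras; // zero or more lora adapter paths
};

// split "model.gguf+lora.gguf+lora2.gguf" into the model path and its loras
static model_spec parse_model_spec(const std::string & arg) {
    model_spec spec;
    size_t start = 0;
    bool   first = true;
    while (true) {
        size_t pos = arg.find('+', start);
        size_t len = (pos == std::string::npos) ? std::string::npos : pos - start;
        std::string part = arg.substr(start, len);
        if (first) {
            spec.model = part;
            first = false;
        } else {
            spec.loras.push_back(part);
        }
        if (pos == std::string::npos) {
            break;
        }
        start = pos + 1;
    }
    return spec;
}
```

With this kind of parsing, something like -m model.gguf,model.gguf+lora.gguf could then benchmark the plain model and the model with the lora applied in a single llama-bench run.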

IMbackK (Collaborator, Author) commented on Jan 28, 2025

Please don't merge as is; I will make improvements as suggested by @slaren.
