# lightbench

A lightweight benchmarking framework for LLMs.

## Table of Contents
- Overview
- Key Features
- Installation
- Usage
- Code Structure
- Contributing
- Citation
- License
- Acknowledgments
## Overview

lightbench is designed to offer both interactive and automated benchmarking for large language models, enabling comprehensive evaluation of code-generation and question-answering capabilities.
## Key Features

- **Human Evaluation:** Interactive chat interface.
- **Automatic Evaluations:** Automated tests for code and text outputs.
- **Extensible Architecture:** Easy integration of new evaluators and metrics (see the sketch after this list).
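As a rough illustration of how a new evaluator could be shaped, here is a minimal, self-contained sketch. The class, method names, and scoring rule below are hypothetical and do not reflect lightbench's actual interfaces; see the modules under `evaluators` for the real ones.

```python
# Hypothetical sketch of a custom evaluator. The class name, method signature,
# and scoring rule are illustrative only and do not match lightbench's actual
# interfaces; see the `evaluators` modules for the real ones.
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class EvalResult:
    prompt: str
    response: str
    score: float


class KeywordEvaluator:
    """Scores a model response by the fraction of expected keywords it contains."""

    def __init__(self, expected_keywords: list[str]):
        self.expected_keywords = [kw.lower() for kw in expected_keywords]

    def evaluate(self, prompt: str, response: str) -> EvalResult:
        text = response.lower()
        hits = sum(kw in text for kw in self.expected_keywords)
        score = hits / len(self.expected_keywords) if self.expected_keywords else 0.0
        return EvalResult(prompt=prompt, response=response, score=score)


if __name__ == "__main__":
    evaluator = KeywordEvaluator(["binary search", "o(log n)"])
    result = evaluator.evaluate(
        prompt="Explain the time complexity of binary search.",
        response="Binary search runs in O(log n) time on a sorted array.",
    )
    print(f"score={result.score:.2f}")  # -> score=1.00
```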
## Installation

- **Dependencies:**
  Ensure you have Python 3.8+ installed.
- **Setup Environment:**
  Run the installation script: `bash install_dependencies.sh`
- **Configure Environment:**
  Create a `.env` file with your `OPENAI_API_KEY`, `HUGGINGFACE_TOKEN`, and `MODEL_NAME`.
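For reference, a `.env` file might look like the following. The values are placeholders, and the `MODEL_NAME` shown is only an assumed example of a Hugging Face model identifier:

```env
OPENAI_API_KEY=sk-...
HUGGINGFACE_TOKEN=hf_...
MODEL_NAME=meta-llama/Llama-3.2-3B-Instruct
```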
## Usage

- **Interactive Chat:**
  Run `chat.py` to start the chat interface. This will use the model specified by `MODEL_NAME` in the `.env` file. Below is an example of a chat using Llama-3.2-3B-Instruct, running on a GTX 1080 Ti.
- **Automated Evaluations:**
  See examples in `examples.ipynb` (a rough sketch of what an automated check involves follows below).
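For orientation, the snippet below sketches a minimal automated text check. It deliberately calls the OpenAI client directly rather than lightbench's own APIs, and the model name, test cases, and pass criterion are assumptions made up for the example; `examples.ipynb` shows the framework's actual workflow.

```python
# Standalone illustration of an automated text-evaluation loop; this uses the
# OpenAI client directly and is NOT lightbench's API -- see examples.ipynb for
# the framework's own entry points.
from dotenv import load_dotenv  # pip install python-dotenv
from openai import OpenAI       # pip install openai

load_dotenv()      # reads OPENAI_API_KEY (and the other keys) from the .env file
client = OpenAI()  # picks up OPENAI_API_KEY from the environment

# (prompt, substring expected in a correct answer) -- toy test cases
test_cases = [
    ("What is the capital of France?", "paris"),
    ("What does HTTP stand for?", "hypertext transfer protocol"),
]

passed = 0
for prompt, expected in test_cases:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat-capable model works here
        messages=[{"role": "user", "content": prompt}],
    )
    answer = response.choices[0].message.content.lower()
    passed += expected in answer

print(f"{passed}/{len(test_cases)} checks passed")
```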
## Code Structure

- `api`: API definitions and endpoints.
- `evaluators`: Modules for both code and text evaluation.
- `loaders`: Tools to load and manage models.
- `metric`: Available metrics for local and API-based models.
## Citation

**Paper.** If you refer to the research paper related to this project, please cite:

```bibtex
@inproceedings{naudot2025performance,
  author    = {Filip Naudot},
  title     = {Performance and Computational Demands of LLMs: Impact of Model Size and Quantization},
  booktitle = {Proceedings of Umeå's 28th Student Conference in Computing Science (USCCS 2025)},
  editor    = {Thomas Hellström},
  year      = {2025},
  publisher = {Umeå University, Sweden},
  note      = {Branch \texttt{conf-paper} used for paper results},
}
```

**Repository.** If you use lightbench in your research, please cite the repository:

```bibtex
@misc{lightbench2025,
  author       = {Filip Naudot},
  title        = {lightbench},
  year         = {2025},
  howpublished = {\url{https://github.com/filipnaudot/lightbench}},
}
```
## License

Distributed under the MIT License. See `LICENSE` for more information.