birr

Batch InfeRence Runtime

Overview

birr is a simplified, local-only release of the toolchain Ai2 uses to run large-scale inference with LLMs and VLMs.

This image orchestrates inference jobs based on a user-provided YAML config file.

It leverages:

  • ray for concurrency management
  • vllm as the inference backend

It consumes JSONL work files and outputs result files to a chosen destination.

Usage

Project Setup

cd <project_root>
python3 -m venv venv
source venv/bin/activate
# quote the extras so shells such as zsh do not treat the brackets as a glob
pip install ".[batch_inference,vllm]"

Prepare Your Data

Records you want to run inference over must be partitioned into one or more JSONL files in a flat directory. Each row should have the following structure:

{"chat_messages": [{"role": "user", "content": "asdf"}]}

Additional fields may be provided in each row (e.g. ids, metadata) and are preserved in the output.
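As a rough illustration of the data layout described above, the following sketch partitions a list of records into JSONL files in one flat directory and checks that each row carries a `chat_messages` list. The helper names (`write_work_files`, `check_row`), the `part-00000.jsonl` naming scheme, and the `rows_per_file` parameter are hypothetical and not part of birr; the non-empty check on `chat_messages` is also an assumption, since only the row structure is documented here.

```python
import json
from pathlib import Path
from typing import Iterable


def check_row(row: dict) -> None:
    """Raise ValueError if a row lacks the chat_messages structure.

    Assumption: chat_messages must be a non-empty list of dicts that each
    carry "role" and "content" keys. Extra top-level fields are allowed.
    """
    messages = row.get("chat_messages")
    if not isinstance(messages, list) or not messages:
        raise ValueError("row needs a non-empty 'chat_messages' list")
    for message in messages:
        if not isinstance(message, dict):
            raise ValueError("each message must be a JSON object")
        if "role" not in message or "content" not in message:
            raise ValueError("each message needs 'role' and 'content'")


def write_work_files(
    records: Iterable[dict], out_dir: str, rows_per_file: int = 1000
) -> list[Path]:
    """Write records as JSONL, one JSON object per line, split across files.

    All files land directly in out_dir (no subdirectories). The file names
    are a hypothetical convention chosen for this sketch.
    """
    if rows_per_file < 1:
        raise ValueError("rows_per_file must be at least 1")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths: list[Path] = []
    handle = None
    try:
        for index, record in enumerate(records):
            check_row(record)
            # Start a new file every rows_per_file records.
            if index % rows_per_file == 0:
                if handle is not None:
                    handle.close()
                path = out / f"part-{len(paths):05d}.jsonl"
                paths.append(path)
                handle = path.open("w", encoding="utf-8")
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    finally:
        if handle is not None:
            handle.close()
    return paths
```

For example, calling `write_work_files(records, "work_files", rows_per_file=500)` with rows such as `{"id": 7, "chat_messages": [{"role": "user", "content": "asdf"}]}` would produce a directory of JSONL files in which the extra `id` field travels alongside each prompt.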

Define A Job

Author a configuration file for your job. See the example file here:

https://github.com/allenai/birr/blob/main/configs/inference/example.yaml

Run Your Job

# in activated venv
python src/birr/batch_inference/runner.py --config-file <path_to_config_file>