- Add a new crate to the programs directory.
- Add the benchmark to CI.
This is called a "guest program" because it is intended to be run on the OpenVM architecture and not on the machine doing the compilation (the "host machine"), although we will discuss shortly how you can still test it locally on the host machine.
The guest program should be a `no_std` Rust crate. As long as it is `no_std`, you can import any other `no_std` crates and write Rust as you normally would. Import the `openvm` library crate to use `openvm` intrinsic functions (for example `openvm::io::*`).
The guest program also needs `#![no_main]` because `no_std` does not have certain default handlers. These are provided by the `openvm::entry!` macro. You should still create a `main` function, and then add `openvm::entry!(main)` for the macro to set up the function to run as a normal `main` function. While the function can be named anything when `target_os = "zkvm"`, for compatibility with testing when the `std` feature is enabled (see below), you should still name it `main`.
To support host machine execution, the top of your guest program should have:
```rust
#![cfg_attr(not(feature = "std"), no_main)]
#![cfg_attr(not(feature = "std"), no_std)]
```
You can copy from fibonacci to get started.
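For orientation, here is a minimal sketch of what such a guest program can look like. Only `openvm::entry!` and `openvm::io::read_vec` are taken from the text above; the computation and file layout are illustrative, not the actual fibonacci source.

```rust
#![cfg_attr(not(feature = "std"), no_main)]
#![cfg_attr(not(feature = "std"), no_std)]

use openvm::io::read_vec;

// Sets up `main` as the program entry point on the zkvm target.
openvm::entry!(main);

fn main() {
    // Reads one input (a Vec<u8>) from the input stream (see below).
    let input = read_vec();
    // ... do some computation on `input` (illustrative) ...
    let _checksum: u32 = input.iter().map(|&b| b as u32).sum();
}
```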
The guest program crate should not be included in the main repository workspace. Instead, the guest `Cargo.toml` should have `[workspace]` at the top to keep it standalone. Your IDE will likely not lint or run rust-analyzer on the crate while you have the main repository workspace open, so the recommended setup is to open a separate IDE workspace from the directory of the guest program.
Our proving benchmarks are written as standalone Rust binaries. Add one by making a new file in `bin`, following the fibonacci example. We currently only run aggregation proofs when the `aggregation` feature is on (off by default). Any general benchmarking utility functions can be added to the library in `src`. There is a utility function `build_bench_program` which compiles the guest program crate with the target set to openvm and reads the output RISC-V ELF file. The ELF can then be fed into `bench_from_exe`, which will generate a proof of the execution of the ELF (or any other `VmExe`) from a given `VmConfig`.
Inputs must be provided directly to the `bench_from_exe` function: the `input_stream: Vec<Vec<F>>` is a vector of vectors, where `input_stream[i]` is what is provided to the guest program on the `i`-th call of `openvm::io::read_vec()`. Currently you must manually convert from `u8` to `F` using `AbstractField::from_canonical_u8`.

You can find an example of passing in a single `Vec<u8>` input in base64_json.
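As a hedged sketch of that conversion (the import path for the field trait and the helper name `bytes_to_input_stream` are assumptions, not an existing utility):

```rust
use p3_field::AbstractField; // assumed import path for the field trait

/// Hypothetical helper: wraps a single byte input as the one-entry
/// `input_stream` expected by `bench_from_exe`. `F` is whatever field
/// your `VmConfig` uses.
fn bytes_to_input_stream<F: AbstractField>(bytes: &[u8]) -> Vec<Vec<F>> {
    vec![bytes.iter().map(|&b| F::from_canonical_u8(b)).collect()]
}
```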
You can test by directly running `cargo run --bin <bench_name>`, which will run the program in the OpenVM runtime. For a more convenient dev experience, we created the `openvm` crate such that it will still build and run normally on the host machine. From the guest program root directory, you can run

```bash
cargo run --features std
```

to run the program on the host (in the normal Rust runtime). This requires the std library, which is enabled by the `std` feature. To ensure that your guest program is still `no_std`, you should not make `std` a default feature.
The behavior of `openvm::io::read_vec` and `openvm::io::read` differs when run on OpenVM versus the host machine. As mentioned above, when running on OpenVM, the inputs must be provided to the `bench_from_exe` function. On the host machine, when you run `cargo run --features std`, each `read_vec` call will read bytes to the end of stdin. For example, here is how you would run the fibonacci guest program:
```bash
# from programs/fibonacci
printf '\xA0\x86\x01\x00\x00\x00\x00\x00' | cargo run --features std
```
(Alternatively, you can temporarily comment out the `read_vec` call and use `include_bytes!` or `include_str!` to directly include your input. Use `core::hint::black_box` to prevent the compiler from optimizing away the input.)
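A hedged sketch of that workaround (the file path and function name are placeholders):

```rust
use core::hint::black_box;

/// Hypothetical stand-in for `read_vec` while debugging on the host.
fn debug_input() -> &'static [u8] {
    // let input = openvm::io::read_vec(); // temporarily commented out
    black_box(include_bytes!("../input.bin")) // placeholder path to your input file
}
```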
By default, if you run `cargo build` or `cargo run` from the guest program root directory, it will build with the target set to your host machine, while running `bench_from_exe` in the bench script will build with the target set to openvm. If you want to directly build for openvm (more specifically, a special RISC-V target), copy the `.cargo` folder from here to the guest program root directory and uncomment the `.cargo/config.toml` file. (This config is equivalent to what the `build_bench_program` function does behind the scenes.) You can then `cargo build` or `cargo build --release`, and it will output a RISC-V ELF file to `target/riscv32im-risc0-zkvm-elf/release/*`. You can install cargo-binutils to be able to disassemble the ELF file:

```bash
rust-objdump -d target/riscv32im-risc0-zkvm-elf/release/openvm-fibonacci-program
```
Running a benchmark locally is simple. Just run the following command:

```bash
OUTPUT_PATH="metrics.json" cargo run --release --bin <benchmark_name>
```

where `<benchmark_name>.rs` is one of the files in `src/bin`.

The `OUTPUT_PATH` environment variable should be set to the file path where you want the collected metrics to be written. If unset, metrics are not written to a file.

To run a benchmark with leaf aggregation, add `--features aggregation` to the above command.
To add the benchmark to CI, update the `ci/benchmark-config.json` file and set its configuration parameters. To make the benchmark run on every PR, follow the existing format with `e2e_bench = false`. To make the benchmark run only when the label `run_benchmark_e2e` is present, set `e2e_bench = true` and specify values for `root_log_blowup` and `internal_log_blowup`.
The `benchmarks.yml` file reads this JSON and generates a matrix of inputs for the `.github/workflows/benchmark-call.yml` file, a reusable workflow for running the benchmark, collecting metrics, and storing and displaying results.
We use the `metrics` crate to collect metrics. Use `gauge!` for timers and `counter!` for numerical counters (e.g., cell count or opcode count). We distinguish different metrics using labels. The most convenient way to add labels is to couple them with `tracing` spans. On a line like

```rust
info_span!("Fibonacci Program", group = "fibonacci_program").in_scope(|| {
```

the `"Fibonacci Program"` is the label for tracing logs, used only for display purposes. The `group = "fibonacci_program"` adds a label `group -> "fibonacci_program"` to any metrics within the span.

Different labels can be added to provide more granularity on the metrics, but the `group` label should always be the top-level label used to distinguish different proof workloads.
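As a rough sketch of how this fits together (the metric name and value are illustrative, the exact `metrics` macro syntax depends on the crate version in use, and it is assumed the benchmark harness installs a recorder that propagates tracing span fields as metric labels):

```rust
use metrics::counter;
use tracing::info_span;

fn prove_workload() {
    // Everything recorded inside this span gets the label
    // group -> "fibonacci_program" attached to it.
    info_span!("Fibonacci Program", group = "fibonacci_program").in_scope(|| {
        // ... tracing and proving happen here ...
        counter!("total_cells").increment(1_000_000); // illustrative metric
    });
}
```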
Most benchmarks are binaries that run once, since proving benchmarks take longer. For smaller benchmarks, such as benchmarking VM runtime, we use Criterion. These are in the `benches` directory.

```bash
cargo bench --bench fibonacci_execute
cargo bench --bench regex_execute
```

will run the normal Criterion benchmarks.
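For reference, a Criterion benchmark in `benches` follows the standard Criterion skeleton; the sketch below is illustrative and elides the actual VM executor setup rather than reproducing the real `fibonacci_execute` source.

```rust
use criterion::{criterion_group, criterion_main, Criterion};

fn bench_execute(c: &mut Criterion) {
    // Load the pre-built ELF and set up the VM executor here (elided).
    c.bench_function("fibonacci_execute", |b| {
        b.iter(|| {
            // Execute the program in the VM runtime (elided).
        });
    });
}

criterion_group!(benches, bench_execute);
criterion_main!(benches);
```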
We profile using executables without Criterion in `examples`. To prevent the ELF build time from being included in the benchmark, we pre-build the ELF using the CLI. Check that the included ELF file in `examples` is up to date before proceeding.

To generate flamegraphs, install `cargo-flamegraph` and run:

```bash
cargo flamegraph --example regex_execute --profile=profiling
```

This will generate a flamegraph at `flamegraph.svg` without running any Criterion analysis.

On macOS, you will need to run the above command with `sudo`.
To use samply, install it and first build the executable:

```bash
cargo build --example regex_execute --profile=profiling
```

Then, run:

```bash
samply record ../target/profiling/examples/regex_execute
```

It will open an interactive UI in your browser (currently only Firefox and Chrome are supported). See the samply GitHub page for more information.