Hyper Threading Crate #1679

Merged
merged 8 commits into from
Mar 20, 2024
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,9 @@
## Cairo-VM Changelog

#### Upcoming Changes

* feat: Create hyper_threading crate to benchmark the `cairo-vm` in a hyper-threaded environment [#1679](https://github.com/lambdaclass/cairo-vm/pull/1679)

* feat: add a `--tracer` option which hosts a web server that shows the line by line execution of cairo code along with memory registers [#1265](https://github.com/lambdaclass/cairo-vm/pull/1265)

* feat: Fix error handling in `initialize_state`[#1657](https://github.com/lambdaclass/cairo-vm/pull/1657)
13 changes: 11 additions & 2 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 2 additions & 1 deletion Cargo.toml
@@ -5,7 +5,8 @@ members = [
     "hint_accountant",
     "examples/wasm-demo",
     "cairo1-run",
-    "cairo-vm-tracer"
+    "cairo-vm-tracer",
+    "examples/hyper_threading"
]
default-members = [
"cairo-vm-cli",
5 changes: 5 additions & 0 deletions Makefile
@@ -17,6 +17,7 @@ endif
compare_benchmarks_deps compare_benchmarks docs clean \
compare_trace_memory compare_trace compare_memory compare_pie compare_all_no_proof \
compare_trace_memory_proof compare_all_proof compare_trace_proof compare_memory_proof compare_air_public_input compare_air_private_input\
hyper-threading-benchmarks \
cairo_bench_programs cairo_proof_programs cairo_test_programs cairo_1_test_contracts cairo_2_test_contracts \
cairo_trace cairo-vm_trace cairo_proof_trace cairo-vm_proof_trace \
fuzzer-deps fuzzer-run-cairo-compiled fuzzer-run-hint-diff build-cairo-lang hint-accountant \ create-proof-programs-symlinks \
@@ -375,3 +376,7 @@ hint-accountant: build-cairo-lang

create-proof-programs-symlinks:
	cd cairo_programs/proof_programs; ln -s ../*.cairo .

hyper-threading-benchmarks: cairo_bench_programs
	cargo build -p hyper_threading --release && \
	sh examples/hyper_threading/hyper-threading.sh
6 changes: 6 additions & 0 deletions README.md
@@ -295,6 +295,12 @@ Run only the `iai_benchmark` benchmark suite with cargo:
cargo bench --bench iai_benchmark
```

Benchmark `cairo-vm` in a hyper-threaded environment with the [`examples/hyper_threading/` crate](examples/hyper_threading/):
```bash
make hyper-threading-benchmarks
```


## 📜 Changelog

Keeps track of the latest changes [here](CHANGELOG.md).
14 changes: 14 additions & 0 deletions examples/hyper_threading/Cargo.toml
@@ -0,0 +1,14 @@
[package]
name = "hyper_threading"
version.workspace = true
edition.workspace = true
license.workspace = true
repository.workspace = true
readme.workspace = true
keywords.workspace = true


[dependencies]
cairo-vm = { workspace = true }
rayon = "1.9.0"
tracing = "0.1.40"
11 changes: 11 additions & 0 deletions examples/hyper_threading/README.md
@@ -0,0 +1,11 @@
# Hyper-Threading Benchmarks for Cairo-VM

## Overview
This crate benchmarks Cairo-VM in a hyper-threaded environment. It uses the [Rayon library](https://docs.rs/rayon/latest/rayon/) to run the benchmark programs in parallel instead of sequentially, and the accompanying script repeats the run with different thread counts (set through `RAYON_NUM_THREADS`) so the timings can be compared.

### Running Benchmarks
To execute the benchmarks, navigate to the project's root directory and run the following command:

```bash
make hyper-threading-benchmarks
```
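
As a rough illustration of the sequential-versus-parallel idea described in the Overview, the sketch below fans a batch of independent jobs out over threads and checks that the results match the sequential run. It deliberately uses `std::thread::scope` from the standard library rather than Rayon, so it builds without extra dependencies; unlike Rayon's pool (whose size the script controls through `RAYON_NUM_THREADS`), it simply spawns one thread per job. `busy_work`, `run_sequential` and `run_parallel` are hypothetical names and are not part of this crate.

```rust
use std::thread;

/// Stand-in for one independent, CPU-bound job (hypothetical; not a Cairo program).
fn busy_work(n: u64) -> u64 {
    (1..=n).fold(0u64, |acc, x| acc.wrapping_add(x.wrapping_mul(x)))
}

/// Run every job on the calling thread, one after another.
fn run_sequential(inputs: &[u64]) -> Vec<u64> {
    inputs.iter().map(|&n| busy_work(n)).collect()
}

/// Run every job on its own scoped thread and collect the results in input order.
fn run_parallel(inputs: &[u64]) -> Vec<u64> {
    thread::scope(|s| {
        // Spawn all jobs first so they can run concurrently.
        let handles: Vec<_> = inputs
            .iter()
            .map(|&n| s.spawn(move || busy_work(n)))
            .collect();
        // Join in spawn order, which preserves the input order of the results.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let inputs = [1_000_000, 2_000_000, 3_000_000, 4_000_000];
    assert_eq!(run_sequential(&inputs), run_parallel(&inputs));
    println!("sequential and parallel runs agree");
}
```

The jobs are independent of each other, which is the same property that lets `main.rs` hand each benchmark program to a separate worker.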
17 changes: 17 additions & 0 deletions examples/hyper_threading/hyper-threading.sh
@@ -0,0 +1,17 @@
#!/bin/bash

thread_counts=(1 2 4 6 8 10 12 16 24 32 )
binary="target/release/hyper_threading"


cmd="hyperfine -r 1"

# Build the command string with all thread counts
for threads in "${thread_counts[@]}"; do
    # For hyperfine, wrap each command in 'sh -c' to correctly handle the environment variable
    cmd+=" -n \"threads: ${threads}\" 'sh -c \"RAYON_NUM_THREADS=${threads} ${binary}\"'"
done
Comment on lines +9 to +13

You can use hyperfine's own options here:

  -P, --parameter-scan <VAR> <MIN> <MAX>                                                                                                                                                                                                      
          Perform benchmark runs for each value in the range MIN..MAX. Replaces                                                                                                                                                               
          the string '{VAR}' in each command by the current parameter value.                                                                                                                                                                  
                                                                                                                                                                                                                                              
            Example:  hyperfine -P threads 1 8 'make -j {threads}'                                                                                                                                                                            
                                                                                                                                                                                                                                              
          This performs benchmarks for 'make -j 1', 'make -j 2', …, 'make -j 8'.                                                                                                                                                              
                                                                                                                                                                                                                                              
          To have the value increase following different patterns, use shell                                                                                                                                                                  
          arithmetics.                                                                                                                                                                                                                        
                                                                                                                                                                                                                                              
            Example: hyperfine -P size 0 3 'sleep $((2**{size}))'                                                                                                                                                                             
                                                                                                                                                                                                                                              
          This performs benchmarks with power of 2 increases: 'sleep 1', 'sleep                                                                                                                                                               
          2', 'sleep 4', …                                                                                                                                                                                                                    
          The exact syntax may vary depending on your shell and OS.                                                                                                                                                                           
  -D, --parameter-step-size <DELTA>                                                                                                                                                                                                           
          This argument requires --parameter-scan to be specified as well.                                                                                                                                                                    
          Traverse the range MIN..MAX in steps of DELTA.                                                                                                                                                                                      
                                                                                                                                                                                                                                              
            Example:  hyperfine -P delay 0.3 0.7 -D 0.2 'sleep {delay}'                                                                                                                                                                       
                                                                                                                                                                                                                                              
          This performs benchmarks for 'sleep 0.3', 'sleep 0.5' and 'sleep 0.7'.                                                                                                                                                              
  -L, --parameter-list <VAR> <VALUES>                                                                                                                                                                                                         
          Perform benchmark runs for each value in the comma-separated list                                                                                                                                                                   
          VALUES. Replaces the string '{VAR}' in each command by the current                                                                                                                                                                  
          parameter value.                                                                                             
                                                                                                                       
          Example:  hyperfine -L compiler gcc,clang '{compiler} -O2 main.cpp'                                          
                                                                                                                       
          This performs benchmarks for 'gcc -O2 main.cpp' and 'clang -O2                                               
          main.cpp'.                                                                                                   
                                                                                                                       
          The option can be specified multiple times to run benchmarks for all
          possible parameter combinations.                    


There's also no need to fire up the shell explicitly. The default behavior is to spawn one, unless you pass the -N option.


# Execute the hyperfine command
echo "Executing benchmark for all thread counts"
eval $cmd
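
The two review comments above suggest replacing the hand-built command string with hyperfine's `-L`/`--parameter-list` option and dropping the explicit `sh -c`. A minimal sketch of what that single invocation might look like is given below, written in Rust to match the rest of the crate. `hyperfine_args` and `hyperfine_command` are hypothetical helpers, not something this PR adds; the sketch only prints the command so it can be inspected without hyperfine installed, and it assumes hyperfine's default shell handles the `RAYON_NUM_THREADS=...` prefix, as the second comment indicates.

```rust
use std::process::Command;

/// Build the arguments for one hyperfine call that sweeps the given thread counts
/// using `-L threads <comma-separated values>`.
fn hyperfine_args(thread_counts: &[u32], binary: &str) -> Vec<String> {
    let list = thread_counts
        .iter()
        .map(|t| t.to_string())
        .collect::<Vec<_>>()
        .join(",");
    vec![
        "-r".to_string(),
        "1".to_string(),
        "-L".to_string(),
        "threads".to_string(),
        list,
        // hyperfine replaces `{threads}` with each value from the list.
        format!("RAYON_NUM_THREADS={{threads}} {binary}"),
    ]
}

/// Wrap the arguments in a `Command` without running it.
fn hyperfine_command(thread_counts: &[u32], binary: &str) -> Command {
    let mut cmd = Command::new("hyperfine");
    cmd.args(hyperfine_args(thread_counts, binary));
    cmd
}

fn main() {
    let counts = [1, 2, 4, 6, 8, 10, 12, 16, 24, 32];
    let cmd = hyperfine_command(&counts, "target/release/hyper_threading");
    // Only print the command; calling `.status()` on it would actually run the benchmark.
    println!("{cmd:?}");
}
```

Passing the arguments as separate strings avoids the nested quoting and the `eval` that the script needs.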
67 changes: 67 additions & 0 deletions examples/hyper_threading/src/main.rs
@@ -0,0 +1,67 @@
use cairo_vm::{
    cairo_run::{cairo_run_program, CairoRunConfig},
    hint_processor::builtin_hint_processor::builtin_hint_processor_definition::BuiltinHintProcessor,
    types::program::Program,
};
use rayon::iter::{IntoParallelIterator, ParallelIterator};

// Define include_bytes_relative macro to prepend a relative path to the file names
macro_rules! include_bytes_relative {
    ($fname:expr) => {
        include_bytes!(concat!("../../../cairo_programs/benchmarks/", $fname))
    };
}

fn main() {
    let mut programs = Vec::new();

    let programs_bytes: [Vec<u8>; 18] = [
        include_bytes_relative!("big_factorial.json").to_vec(),
        include_bytes_relative!("big_fibonacci.json").to_vec(),
        include_bytes_relative!("blake2s_integration_benchmark.json").to_vec(),
        include_bytes_relative!("compare_arrays_200000.json").to_vec(),
        include_bytes_relative!("dict_integration_benchmark.json").to_vec(),
        include_bytes_relative!("field_arithmetic_get_square_benchmark.json").to_vec(),
        include_bytes_relative!("integration_builtins.json").to_vec(),
        include_bytes_relative!("keccak_integration_benchmark.json").to_vec(),
        include_bytes_relative!("linear_search.json").to_vec(),
        include_bytes_relative!("math_cmp_and_pow_integration_benchmark.json").to_vec(),
        include_bytes_relative!("math_integration_benchmark.json").to_vec(),
        include_bytes_relative!("memory_integration_benchmark.json").to_vec(),
        include_bytes_relative!("operations_with_data_structures_benchmarks.json").to_vec(),
        include_bytes_relative!("pedersen.json").to_vec(),
        include_bytes_relative!("poseidon_integration_benchmark.json").to_vec(),
        include_bytes_relative!("secp_integration_benchmark.json").to_vec(),
        include_bytes_relative!("set_integration_benchmark.json").to_vec(),
        include_bytes_relative!("uint256_integration_benchmark.json").to_vec(),
    ];

    for bytes in &programs_bytes {
        programs.push(Program::from_bytes(bytes.as_slice(), Some("main")).unwrap())
    }

    let start_time = std::time::Instant::now();

    // Parallel execution of the program processing
    programs.into_par_iter().for_each(|program| {
        let cairo_run_config = CairoRunConfig {
            entrypoint: "main",
            trace_enabled: false,
            relocate_mem: false,
            layout: "all_cairo",
            proof_mode: true,
            secure_run: Some(false),
            ..Default::default()
        };
        let mut hint_executor = BuiltinHintProcessor::new_empty();

        // Execute each program in parallel
        let _result = cairo_run_program(&program, &cairo_run_config, &mut hint_executor)
            .expect("Couldn't run program");
    });
    let elapsed = start_time.elapsed();

    let programs_len: &usize = &programs_bytes.clone().len();

    tracing::info!(%programs_len, ?elapsed, "Finished");
}
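
One thing worth checking in the `main.rs` above: `tracing` events are only emitted when a subscriber has been installed, and this file does not appear to install one, so the final "Finished" line may never be printed. That does not affect the hyperfine timings, which measure the whole process. The sketch below shows the same start/elapsed pattern using only the standard library and `println!`; `timed` is a hypothetical helper and the summed range merely stands in for the parallel `cairo_run_program` calls.

```rust
use std::time::{Duration, Instant};

/// Run a closure and return its result together with the wall-clock time it took.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

fn main() {
    // Hypothetical workload standing in for the parallel benchmark run.
    let (total, elapsed) = timed(|| (0..1_000u64).sum::<u64>());
    // `println!` writes straight to stdout, so no logging setup is needed.
    println!("Finished: result={total}, elapsed={elapsed:?}");
}
```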