feat: Analyze % of time spent on field arithmetic #4501
Merged
9 commits
60d2a7c Add analysis scripts (codygunton)
c86d248 Merge branch 'master' into cg/analyze-field-ops-time (ludamad)
a828c5b fix: master (ludamad)
d1fc557 fix: update to new preset name (ludamad)
fed3a3c fix: update to new preset path (ludamad)
7375a48 Merge remote-tracking branch 'origin/fix/master' into cg/analyze-fiel… (ludamad0)
f22fb99 fix: counts (ludamad0)
b0bfdb3 Merge branch 'master' into cg/analyze-field-ops-time (codygunton)
4b08f79 Merge branch 'master' into cg/analyze-field-ops-time (codygunton)
40 changes: 40 additions & 0 deletions
barretenberg/cpp/scripts/benchmark_field_ops_percentage.sh
@@ -0,0 +1,40 @@
#!/usr/bin/env bash
set -eu

TARGET=${1:-goblin_bench}
FILTER=${2:-./"GoblinFull/1$"}
COMMAND=${2:-./$TARGET} # currently unused; reads the same positional argument ($2) as FILTER

BUILD_OP_COUNT_TRACK_DIR=build-op-count-track

# Move above script dir.
cd "$(dirname "$0")/.."

# Measure the benchmarks with ops counting
cmake --preset op-count-track
cmake --build --preset op-count-track --target "$TARGET"
# This can be run multithreaded
cd "$BUILD_OP_COUNT_TRACK_DIR"
./bin/"$TARGET" --benchmark_filter="$FILTER" \
                --benchmark_out="$TARGET.json" \
                --benchmark_out_format=json \
                --benchmark_counters_tabular=true

# If needed, benchmark the basic Fr operations
FIELD_OP_COSTS=field_op_costs.json
if [ ! -f "$FIELD_OP_COSTS" ]; then
    cd ../
    FIELD_OPS_TARGET=fr_straight_bench
    cmake --preset clang16
    cmake --build --preset clang16 --target "$FIELD_OPS_TARGET"
    cd build
    ./bin/"$FIELD_OPS_TARGET" --benchmark_out="../$BUILD_OP_COUNT_TRACK_DIR/$FIELD_OP_COSTS" \
                              --benchmark_out_format=json
fi

# Compute the singly-threaded benchmarks for comparison
cd ../
./scripts/benchmark_remote.sh goblin_bench "taskset -c 0 ./goblin_bench --benchmark_filter=Full/1$"

# Analyze the results
python3 ./scripts/compute_field_operations_time.py
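The trailing `$` in the filter (`GoblinFull/1$`) matters because Google Benchmark selects every benchmark whose name contains a match for the regular expression. The sketch below illustrates this with Python's `re.search` as a stand-in for the benchmark library's own regex engine, which is an assumption. The name `GoblinBench/GoblinFull/10` is hypothetical and only serves to show what an unanchored filter would also pick up.

```python
import re


def selected(filter_regex, names):
    """Return the benchmark names a filter would select (substring-search semantics)."""
    pattern = re.compile(filter_regex)
    return [name for name in names if pattern.search(name)]


names = [
    "GoblinBench/GoblinFull/1",   # the benchmark analyzed by the scripts in this PR
    "GoblinBench/GoblinFull/10",  # hypothetical name, for illustration only
]

# Anchored with `$`: only the benchmark whose name ends in "/1" is selected.
print(selected(r"GoblinFull/1$", names))

# Unanchored: "/10" also contains "GoblinFull/1", so both are selected.
print(selected(r"GoblinFull/1", names))

# The script's default filter carries a leading "./"; as a regex this means
# "any character, then a slash", which still matches "...h/GoblinFull/1".
print(selected(r"./GoblinFull/1$", names))
```

Under these assumptions, the default filter and the filter passed to `benchmark_remote.sh` (`Full/1$`) would select the same single benchmark.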
@@ -0,0 +1,63 @@
import json
from pathlib import Path

PREFIX = Path("build-op-count-track")
OPS_BENCH = Path("field_op_costs.json")
GOBLIN_BENCH_JSON = Path("goblin_bench.json")
BENCHMARK = "GoblinBench/GoblinFull/1"

# We will populate time per operation for a subset of the operations
# For accurate counting, we must select operations that do not call other
# operations on the list.
ns_per_op = {}
to_keep = [
    "asm_add_with_coarse_reduction",
    "asm_conditional_negate",
    "asm_mul_with_coarse_reduction",
    # "asm_reduce_once",
    "asm_self_add_with_coarse_reduction",
    "asm_self_mul_with_coarse_reduction",
    "asm_self_reduce_once",
    "asm_self_sqr_with_coarse_reduction",
    "asm_self_sub_with_coarse_reduction",
    "asm_sqr_with_coarse_reduction",
    # "mul",
    # "self_mul",
    # "add",
    # "self_add",
    # "sub",
    # "self_sub",
    # "invert",  (mostly just self_sqr and *=)
    # "self_neg",
    # "self_reduce_once",
    # "self_to_montgomery_form",
    # "self_sqr",
    # "sqr",
]

# read the measurements of the basic field operations
with open(PREFIX / OPS_BENCH, "r") as read_file:
    read_result = json.load(read_file)
    for bench in read_result["benchmarks"]:
        if bench["name"] in to_keep:
            ns_per_op[bench["name"]] = bench["real_time"]

mct = None
with open(PREFIX / GOBLIN_BENCH_JSON, "r") as read_file:
    read_result = json.load(read_file)
    for bench in read_result["benchmarks"]:
        if bench["name"] == BENCHMARK:
            mct = bench
if mct is None:
    raise SystemExit(f"benchmark {BENCHMARK} not found in {GOBLIN_BENCH_JSON}")

total_time = 0
for key, ns in ns_per_op.items():
    full_key = "fr::" + key
    if full_key in mct:
        count = int(mct[full_key])
        print(f'aggregating {count} counts of {key} at time {ns} ns.')
        total_time += count * ns

total_time /= 1e9  # convert ns to s
print(f'Time spent on field ops: {round(total_time, 3)}s.')
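The script above reports an absolute time; the PR title speaks of a percentage. A minimal sketch of how the share could be derived is shown below, assuming the denominator is the `real_time` of a benchmark entry reported in nanoseconds. Which run supplies that denominator (for example the single-threaded comparison run) is an assumption, the helper name `field_ops_share` is hypothetical, and all numbers are made up for illustration.

```python
def field_ops_share(op_costs_ns, bench_entry, prefix="fr::"):
    """Return (field_op_ns, share_of_real_time) for one benchmark entry.

    op_costs_ns: mapping from operation name to cost per call in ns.
    bench_entry: one element of the "benchmarks" list in Google Benchmark
                 JSON output, with op counters stored under prefix + name.
    """
    field_ns = 0.0
    for name, ns in op_costs_ns.items():
        count = bench_entry.get(prefix + name)
        if count is not None:
            field_ns += int(count) * ns
    # Assumption: real_time is expressed in ns (time_unit == "ns").
    return field_ns, field_ns / bench_entry["real_time"]


# Made-up per-op costs and counters, not measured results.
op_costs_ns = {
    "asm_mul_with_coarse_reduction": 20.0,
    "asm_add_with_coarse_reduction": 4.0,
}
entry = {
    "name": "GoblinBench/GoblinFull/1",
    "real_time": 1_000_000.0,
    "time_unit": "ns",
    "fr::asm_mul_with_coarse_reduction": 10_000,
    "fr::asm_add_with_coarse_reduction": 50_000,
}

field_ns, share = field_ops_share(op_costs_ns, entry)
print(f"field ops: {field_ns:.0f} ns, {100 * share:.1f}% of real_time")
```

With these illustrative inputs the two operations contribute 200,000 ns each, so the share is 40% of the 1,000,000 ns total.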
209 changes: 209 additions & 0 deletions
barretenberg/cpp/src/barretenberg/ecc/curves/bn254/fr_straight.bench.cpp
@@ -0,0 +1,209 @@
#include "fr.hpp"

#include <benchmark/benchmark.h>

using namespace bb;
using namespace benchmark;

namespace {
void asm_add_with_coarse_reduction(State& state) noexcept
{
    fr x, y;
    for (auto _ : state) {
        DoNotOptimize(fr::asm_add_with_coarse_reduction(x, y));
    }
}
BENCHMARK(asm_add_with_coarse_reduction);

void asm_conditional_negate(State& state) noexcept
{
    fr x;
    for (auto _ : state) {
        fr::asm_conditional_negate(x, true);
    }
}
BENCHMARK(asm_conditional_negate);

void asm_mul_with_coarse_reduction(State& state) noexcept
{
    fr x, y;
    for (auto _ : state) {
        DoNotOptimize(fr::asm_mul_with_coarse_reduction(x, y));
    }
}
BENCHMARK(asm_mul_with_coarse_reduction);

void asm_reduce_once(State& state) noexcept
{
    fr x;
    for (auto _ : state) {
        DoNotOptimize(fr::asm_reduce_once(x));
    }
}
BENCHMARK(asm_reduce_once);

void asm_self_add_with_coarse_reduction(State& state) noexcept
{
    fr x, y;
    for (auto _ : state) {
        fr::asm_self_add_with_coarse_reduction(x, y);
    }
}
BENCHMARK(asm_self_add_with_coarse_reduction);

void asm_self_mul_with_coarse_reduction(State& state) noexcept
{
    fr x, y;
    for (auto _ : state) {
        fr::asm_self_mul_with_coarse_reduction(x, y);
    }
}
BENCHMARK(asm_self_mul_with_coarse_reduction);

void asm_self_reduce_once(State& state) noexcept
{
    fr x;
    for (auto _ : state) {
        fr::asm_self_reduce_once(x);
    }
}
BENCHMARK(asm_self_reduce_once);

void asm_self_sqr_with_coarse_reduction(State& state) noexcept
{
    fr x;
    for (auto _ : state) {
        fr::asm_self_sqr_with_coarse_reduction(x);
    }
}
BENCHMARK(asm_self_sqr_with_coarse_reduction);

void asm_self_sub_with_coarse_reduction(State& state) noexcept
{
    fr x, y;
    for (auto _ : state) {
        fr::asm_self_sub_with_coarse_reduction(x, y);
    }
}
BENCHMARK(asm_self_sub_with_coarse_reduction);

void asm_sqr_with_coarse_reduction(State& state) noexcept
{
    fr x;
    for (auto _ : state) {
        DoNotOptimize(fr::asm_sqr_with_coarse_reduction(x));
    }
}
BENCHMARK(asm_sqr_with_coarse_reduction);

void mul(State& state) noexcept
{
    fr x, y;
    for (auto _ : state) {
        DoNotOptimize(x * y);
    }
}
BENCHMARK(mul);

void self_mul(State& state) noexcept
{
    fr x, y;
    for (auto _ : state) {
        x *= y;
    }
}
BENCHMARK(self_mul);

void add(State& state) noexcept
{
    fr x, y;
    for (auto _ : state) {
        DoNotOptimize(x + y);
    }
}
BENCHMARK(add);

void self_add(State& state) noexcept
{
    fr x, y;
    for (auto _ : state) {
        x += y;
    }
}
BENCHMARK(self_add);

void sub(State& state) noexcept
{
    fr x, y;
    for (auto _ : state) {
        DoNotOptimize(x - y);
    }
}
BENCHMARK(sub);

void self_sub(State& state) noexcept
{
    fr x, y;
    for (auto _ : state) {
        x -= y;
    }
}
BENCHMARK(self_sub);

void invert(State& state) noexcept
{
    fr x;
    for (auto _ : state) {
        DoNotOptimize(x.invert());
    }
}
BENCHMARK(invert);

void self_neg(State& state) noexcept
{
    fr x;
    for (auto _ : state) {
        x.self_neg();
    }
}
BENCHMARK(self_neg);

void self_reduce_once(State& state) noexcept
{
    fr x;
    for (auto _ : state) {
        x.self_reduce_once();
    }
}
BENCHMARK(self_reduce_once);

void self_to_montgomery_form(State& state) noexcept
{
    fr x;
    for (auto _ : state) {
        x.self_to_montgomery_form();
    }
}
BENCHMARK(self_to_montgomery_form);

void self_sqr(State& state) noexcept
{
    fr x;
    for (auto _ : state) {
        x.self_sqr();
    }
}
BENCHMARK(self_sqr);

void sqr(State& state) noexcept
{
    fr x;
    for (auto _ : state) {
        DoNotOptimize(x.sqr());
    }
}
BENCHMARK(sqr);
} // namespace

// NOLINTNEXTLINE macro invocation triggers style guideline errors from Google Benchmark code
BENCHMARK_MAIN();
Note that if you are using the benchmark for both time AND op counts, then the measured time will be inflated by op counting. I did aim for the counting to be fast, though; it should cost < 10 cycles.
Ah, never mind, I guess this is not the op-counting build. This makes sense; cool to build on the remote script.
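To put the "< 10 cycles" figure from the review comments in perspective, here is a rough upper-bound sketch of how much wall time op counting could add if timing were taken from an op-counting build. The clock frequency and the operation count are hypothetical values chosen for illustration; only the 10-cycle bound comes from the comment, and the helper name is made up.

```python
def op_count_overhead_seconds(total_ops, cycles_per_count=10.0, clock_hz=3.0e9):
    """Upper-bound estimate of time added by op counting.

    total_ops:        number of counted operations in the run (hypothetical input)
    cycles_per_count: cost of one counter update; 10 is the bound quoted in review
    clock_hz:         assumed CPU clock frequency (hypothetical, 3 GHz by default)
    """
    return total_ops * cycles_per_count / clock_hz


total_ops = 300_000_000  # made-up count, not a measured value
overhead = op_count_overhead_seconds(total_ops)
print(f"estimated op-counting overhead: at most {overhead:.2f} s")
```

Because the time measurements in this PR come from a build without op counting, this overhead does not enter the reported field-op time; the estimate only indicates how large the inflation could be otherwise.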