
perf(katana): optimisations + benchmark setup #2900

Open
wants to merge 4 commits into base: main

Conversation

remybar
Contributor

@remybar remybar commented Jan 13, 2025

Description

Parallelize parts of block processing and add benchmarks to monitor the performance of sequential vs. parallelized commits.

Related issue

#2719

Tests

  • Yes
  • No, because they aren't needed
  • No, because I need help

Added to documentation?

  • README.md
  • Dojo Book
  • No documentation needed

Checklist

  • I've formatted my code (scripts/prettier.sh, scripts/rust_fmt.sh, scripts/cairo_fmt.sh)
  • I've linted my code (scripts/clippy.sh, scripts/docs.sh)
  • I've commented my code
  • I've requested a review after addressing the comments

Summary by CodeRabbit

  • New Features

    • Enhanced system performance with parallel processing capabilities for commit operations.
    • Added performance benchmarking tests to evaluate processing under various modes.
  • Refactor

    • Updated dependency and workspace configurations to integrate performance profiling tools and improve overall efficiency.

@glihm glihm changed the title katana: optimisations + benchmark setup perf(katana): optimisations + benchmark setup Jan 13, 2025
@remybar remybar force-pushed the katana-core-parallelized branch from e2dacee to 5519e6b Compare January 21, 2025 10:57
Member

@kariy kariy left a comment


This benchmark includes data setup within the measured function (i.e. Uncommitted::commit). We should move the preparation of the test data (all the arguments for Uncommitted::new) outside the benchmark routine so that we measure only the commit operation's performance, without the setup overhead.

We can use the Arbitrary trait to derive random values for the test data.

I'm not sure what the best block composition is, but maybe we can start with something like 20 transactions, each with 2 events, and a state update of size 100.
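The setup-vs-measurement split suggested here can be sketched with std only (criterion's iter_batched does this for real; the Uncommitted type and its commit body below are illustrative placeholders, not the actual katana API):

```rust
use std::time::Instant;

// Hypothetical stand-in for the real Uncommitted block type; the actual
// struct carries the header, transactions, receipts, and state updates.
#[derive(Clone)]
struct Uncommitted {
    data: Vec<u64>,
}

impl Uncommitted {
    // Placeholder for the real commitment computation.
    fn commit(self) -> u64 {
        self.data.iter().sum()
    }
}

fn main() {
    // Setup happens once, outside the measured region...
    let block = Uncommitted { data: (0..1000).collect() };
    // ...and per-iteration inputs are prepared up front, mimicking
    // criterion's iter_batched, so the cloning is not timed either.
    let inputs: Vec<Uncommitted> = (0..100).map(|_| block.clone()).collect();

    let mut checksum = 0u64;
    let start = Instant::now();
    for input in inputs {
        // Only the commit itself runs inside the timed loop.
        checksum = checksum.wrapping_add(input.commit());
    }
    println!("100 commits in {:?} (checksum {})", start.elapsed(), checksum);
}
```

In criterion terms, the `|| block.clone()` setup closure of iter_batched is excluded from the measurement the same way the pre-built inputs vector is here.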

@kariy
Member

kariy commented Jan 21, 2025

I think what we should do is have two commit methods - the serial and parallelized versions (i.e. .commit and .commit_parallel) - and run two benchmarks against the same test vectors, one using each method.
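The serial/parallel method pair can be sketched like this (std::thread::scope stands in for the rayon scope used in the PR; the Block struct and field names are assumptions, not the real types):

```rust
use std::thread;

// Simplified stand-in for the block's commitment work: two independent
// computations (e.g. receipt and event commitments) that can run in
// parallel. Names and shapes here are illustrative, not the real API.
struct Block {
    receipts: Vec<u64>,
    events: Vec<u64>,
}

impl Block {
    // Serial version: compute both commitments one after the other.
    fn commit(&self) -> (u64, u64) {
        let r: u64 = self.receipts.iter().sum();
        let e: u64 = self.events.iter().sum();
        (r, e)
    }

    // Parallel version: each independent commitment gets its own scoped
    // thread. Each closure writes to a distinct variable, so there is
    // no shared mutable state between the two tasks.
    fn commit_parallel(&self) -> (u64, u64) {
        let mut r: u64 = 0;
        let mut e: u64 = 0;
        thread::scope(|s| {
            s.spawn(|| r = self.receipts.iter().sum());
            s.spawn(|| e = self.events.iter().sum());
        });
        (r, e)
    }
}

fn main() {
    let block = Block { receipts: (1..=10).collect(), events: (1..=5).collect() };
    // Both methods must agree so the benchmarks compare like for like.
    assert_eq!(block.commit(), block.commit_parallel());
    println!("both methods agree: {:?}", block.commit());
}
```

Benchmarking both methods against identical inputs is what makes the serial/parallel comparison meaningful.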

@remybar remybar force-pushed the katana-core-parallelized branch from 5519e6b to 3728e6d Compare January 24, 2025 16:03
@remybar
Contributor Author

remybar commented Jan 24, 2025

Hi @kariy !

So, I've kept two versions of the optimized functions (the original one and a parallel one) to be able to create the commit and commit_parallel benches.

I used the Arbitrary trait to generate random values (not sure if I did it the idiomatic way, but...) and set the vector sizes to 20 for txs and receipts and 100 for the state update data.
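The random-data generation can be approximated without the arbitrary crate; here is a std-only sketch using a tiny deterministic LCG (the generator and the vector sizes mirror what is described above, but this is not the PR's actual code):

```rust
// Std-only stand-in for the `arbitrary` crate used in the PR: a tiny
// deterministic LCG that fills test vectors of configurable sizes.
struct Lcg(u64);

impl Lcg {
    // Constants from Knuth's MMIX LCG.
    fn next(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0
    }

    fn vec(&mut self, len: usize) -> Vec<u64> {
        (0..len).map(|_| self.next()).collect()
    }
}

fn main() {
    let mut rng = Lcg(42);
    // Mirrors the sizes used in the PR: 20 txs/receipts, 100 state entries.
    let txs = rng.vec(20);
    let receipts = rng.vec(20);
    let state_updates = rng.vec(100);
    println!(
        "generated {} txs, {} receipts, {} state entries",
        txs.len(),
        receipts.len(),
        state_updates.len()
    );
}
```

A fixed seed keeps benchmark inputs reproducible across runs, which matters when comparing serial and parallel numbers.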

Maybe it would be a good idea to have two sets of data for these benches:

  • a first one with a tiny block, to make sure that parallelizing block processing does not add any overhead,
  • a second one with a fairly big block, as you proposed, to check that the performance improvement is worth it.

WDYT?
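The two proposed data sets could be expressed through the BlockConfig struct mentioned in the PR; the field names and exact sizes below are assumptions for illustration:

```rust
// Hypothetical shape of the BlockConfig used by the benches; field
// names and the "small"/"big" sizes are illustrative assumptions.
#[derive(Clone, Copy)]
struct BlockConfig {
    n_txs: usize,
    events_per_tx: usize,
    state_update_size: usize,
}

// Tiny block: checks that parallelism adds no overhead.
const SMALL: BlockConfig = BlockConfig { n_txs: 1, events_per_tx: 1, state_update_size: 1 };
// Big block: checks that the parallel speedup is worth it.
const BIG: BlockConfig = BlockConfig { n_txs: 20, events_per_tx: 2, state_update_size: 100 };

fn main() {
    for (name, cfg) in [("small", SMALL), ("big", BIG)] {
        println!(
            "{}: {} txs x {} events, state size {}",
            name, cfg.n_txs, cfg.events_per_tx, cfg.state_update_size
        );
    }
}
```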

@remybar remybar force-pushed the katana-core-parallelized branch 2 times, most recently from e9aaea6 to 347b9b0 Compare January 24, 2025 16:11

codecov bot commented Jan 24, 2025

Codecov Report

Attention: Patch coverage is 1.51515% with 65 lines in your changes missing coverage. Please review.

Project coverage is 57.04%. Comparing base (d11a7ec) to head (046ecc6).

Files with missing lines Patch % Lines
crates/katana/core/src/backend/mod.rs 1.51% 65 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2900      +/-   ##
==========================================
- Coverage   57.10%   57.04%   -0.07%     
==========================================
  Files         429      429              
  Lines       56868    56933      +65     
==========================================
+ Hits        32475    32476       +1     
- Misses      24393    24457      +64     


Member

@kariy kariy left a comment


Awesome work @remybar! It's looking good. Can we also include a benchmark for a block with only 1 tx and very small state updates, to mimic a small block (e.g. on instant mining when there's only 1 tx in a block)?

});
}

criterion_group!(benches, commit_benchmark);
Member


Suggested change
criterion_group!(benches, commit_benchmark);
criterion_group! {
    name = benches;
    config = Criterion::default().with_profiler(PProfProfiler::new(100, Output::Flamegraph(None)));
    targets = commit_benchmark
}

let block =
UncommittedBlock::new(header, transactions, receipts.as_slice(), &state_updates, provider);

c.bench_function("commit", |b| {
Member


nit:

Suggested change
c.bench_function("commit", |b| {
c.bench_function("Commit.Big.Serial", |b| {

b.iter_batched(|| block.clone(), |input| commit(black_box(input)), BatchSize::SmallInput);
});

c.bench_function("commit_parallel", |b| {
Member


nit:

Suggested change
c.bench_function("commit_parallel", |b| {
c.bench_function("Commit.Big.Parallel", |b| {

@remybar remybar force-pushed the katana-core-parallelized branch from 347b9b0 to 82a0abb Compare February 7, 2025 10:42
@remybar
Contributor Author

remybar commented Feb 7, 2025

Awesome work @remybar! It's looking good. Can we also include a benchmark for a block with only 1 tx and very small state updates, to mimic a small block (e.g. on instant mining when there's only 1 tx in a block)?

Oops, I didn't receive any notification for your message 😕
Sorry for the delay, I've updated this PR with your suggestions + a small block benchmark 👍

@remybar remybar marked this pull request as ready for review February 7, 2025 11:04

coderabbitai bot commented Feb 7, 2025

Ohayo sensei!

Walkthrough

This pull request updates dependency management and introduces benchmarking and parallel processing enhancements. It adds and modifies Cargo.toml entries across multiple packages by including the pprof, rayon, and criterion crates, and updates the katana-primitives dependency with new features. A new benchmarking module is added to measure block commit performance, while the backend logic now supports parallel commit methods using rayon for concurrent processing of events and receipts.

Changes

  • Cargo.toml (root), crates/katana/.../Cargo.toml, crates/katana/executor/Cargo.toml: Added/updated dependencies: pprof (version/features → workspace), rayon (workspace), criterion (dev-dependency), and updated katana-primitives (added the "arbitrary" feature). A new benchmark entry was also introduced in katana/core.
  • crates/katana/.../benches/commit.rs: Added a new benchmarking module including BlockConfig, functions for commit and parallel commit (commit, commit_parallel, build_block, commit_benchmark) as well as random data generator helpers.
  • crates/katana/.../mod.rs: Introduced parallel processing methods: commit_parallel, compute_receipt_commitment_parallel, and compute_event_commitment_parallel, leveraging rayon for concurrent execution.

Sequence Diagram(s)

sequenceDiagram
    participant Bench as Benchmark Suite
    participant Block as UncommittedBlock
    participant Backend as Backend Processing

    Bench->>Block: build_block(config)
    Bench->>Block: commit() / commit_parallel()
    alt Serial Commit
        Block->>Backend: commit()
    else Parallel Commit
        Block->>Backend: commit_parallel()
        Backend->>Backend: compute_event_commitment_parallel()
        Backend->>Backend: compute_receipt_commitment_parallel()
    end


Suggested reviewers

  • kariy



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
crates/katana/core/src/backend/mod.rs (2)

367-367: Ohayo sensei! Clarify this comment or remove if it’s unneeded.
“optimisation 1” can be more descriptive. Consider including why this optimization is important.


484-503: Ohayo sensei! Great use of par_bridge for events.
Spawning parallel tasks like this can handle many events effectively. Just be mindful of overhead if event counts are small.
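The data-parallel event processing referred to here can be approximated without rayon; a std-only sketch splitting events into chunks, one scoped thread per chunk (hash_event and the worker count are hypothetical placeholders, and rayon's par_bridge is what the PR actually uses):

```rust
use std::thread;

// Hypothetical stand-in for per-event hashing.
fn hash_event(e: u64) -> u64 {
    e.wrapping_mul(0x9E37_79B9_7F4A_7C15)
}

// Splits the events into roughly equal chunks and processes each chunk
// on its own scoped thread, then combines the partial results. This is
// a dependency-free approximation of rayon's parallel iteration.
fn process_events_parallel(events: &[u64], workers: usize) -> u64 {
    let chunk = events.len().div_ceil(workers).max(1);
    let mut partials = Vec::new();
    thread::scope(|s| {
        let handles: Vec<_> = events
            .chunks(chunk)
            .map(|c| s.spawn(move || c.iter().copied().map(hash_event).fold(0u64, u64::wrapping_add)))
            .collect();
        for h in handles {
            partials.push(h.join().unwrap());
        }
    });
    partials.into_iter().fold(0u64, u64::wrapping_add)
}

fn main() {
    let events: Vec<u64> = (0..1000).collect();
    let serial = events.iter().copied().map(hash_event).fold(0u64, u64::wrapping_add);
    // The parallel result must match the serial one exactly.
    assert_eq!(process_events_parallel(&events, 4), serial);
    println!("parallel result matches serial");
}
```

As the review note says, thread (or rayon task) startup cost can dominate for small event counts, which is exactly what the small-block benchmark is meant to catch.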

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d11a7ec and 046ecc6.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • Cargo.toml (1 hunks)
  • crates/katana/core/Cargo.toml (3 hunks)
  • crates/katana/core/benches/commit.rs (1 hunks)
  • crates/katana/core/src/backend/mod.rs (5 hunks)
  • crates/katana/executor/Cargo.toml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: ensure-windows
  • GitHub Check: test
🔇 Additional comments (14)
crates/katana/core/src/backend/mod.rs (3)

24-24: Ohayo sensei! Nice addition of the rayon prelude.
This import sets the stage for parallel iterators, which supercharges concurrency.


398-441: Ohayo sensei! Bravo on the parallel commit method.
Using rayon’s scope with carefully separated variables is safe here. Each closure writes to a distinct variable, eliminating data races. If you expect many large blocks, this concurrency should boost throughput.


453-457: Ohayo sensei! Parallel receipt commitment looks good.
Parallelizing over receipts can reduce total computation time significantly.

crates/katana/core/benches/commit.rs (4)

16-24: Ohayo sensei! BlockConfig is a neat approach.
Using a small struct to bundle block parameters keeps things organized and testable.


46-52: Ohayo sensei! Good separation of serial vs. parallel commit.
Explicitly having two methods (commit vs. commit_parallel) makes benchmark comparisons more transparent.


94-128: Ohayo sensei! Great job building random transactions and receipts.
This is a succinct way to generate test data for the benchmarks. Remember to watch out for any potential fuzz issues if the random values might cause edge cases.


130-195: Ohayo sensei! Solid benchmark definitions for small and big blocks.
These distinct setups will help confirm that parallel processing overhead remains minimal for small inputs, while yielding performance gains for large ones.

crates/katana/executor/Cargo.toml (1)

39-40: Ohayo sensei! Transitioning pprof and rayon to workspace dependencies.
This unifies version management, simplifies maintenance, and helps ensure consistency across crates. Nicely done!

crates/katana/core/Cargo.toml (5)

14-14: Ohayo sensei: Enhanced Dependency for katana-primitives
Including the "arbitrary" feature in the katana-primitives dependency is a solid move to support random value generation in benchmarks and tests. Just double-check its compatibility with other components relying on this dependency.


29-29: Ohayo sensei: Introducing Rayon for Parallelism
Adding rayon.workspace = true is a welcome enhancement to enable parallel processing. This change directly supports the new parallel commit implementations and should help meet performance goals.


52-52: Ohayo sensei: Criterion for Benchmarking
The new criterion.workspace = true dev-dependency is an excellent addition for setting up performance benchmarks. It will help you precisely measure the benefits of parallelized processing.


57-57: Ohayo sensei: pprof for Profiling
Integrating pprof.workspace = true as a development tool supports detailed performance profiling and works in tandem with Criterion. Pretty neat!


62-65: Ohayo sensei: Benchmark Configuration Added
The new benchmark definition with name = "commit" and harness = false is well configured and aligns with the goal of comparing sequential versus parallelized commits. Ensure that the corresponding benchmark implementation in crates/katana/core/benches/commit.rs correctly exercises both commit methods.

Cargo.toml (1)

261-263: Ohayo sensei: pprof Dependency for Profiling in Root Cargo.toml
The introduction of pprof = { version = "0.13.0", features = [ "criterion", "flamegraph" ] } in the root Cargo.toml reinforces profiling support across the project. Please verify that this version meshes well with Criterion and your overall benchmarking setup.

3 participants