Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(katana): optimisations + benchmark setup #2900

Merged
merged 6 commits into from
Feb 7, 2025

Conversation

remybar
Copy link
Contributor

@remybar remybar commented Jan 13, 2025

Description

Parallelize some block processing and add some benches to monitor the performance of sequential/parallelized commits.

Related issue

#2719

Tests

  • Yes
  • No, because they aren't needed
  • No, because I need help

Added to documentation?

  • README.md
  • Dojo Book
  • No documentation needed

Checklist

  • I've formatted my code (scripts/prettier.sh, scripts/rust_fmt.sh, scripts/cairo_fmt.sh)
  • I've linted my code (scripts/clippy.sh, scripts/docs.sh)
  • I've commented my code
  • I've requested a review after addressing the comments

Summary by CodeRabbit

  • New Features

    • Enhanced system performance with parallel processing capabilities for commit operations.
    • Added performance benchmarking tests to evaluate processing under various modes.
    • Introduced new data structures for improved transaction and receipt handling.
  • Refactor

    • Updated dependency and workspace configurations to integrate performance profiling tools and improve overall efficiency.

@glihm glihm changed the title katana: optimisations + benchmark setup perf(katana): optimisations + benchmark setup Jan 13, 2025
@remybar remybar force-pushed the katana-core-parallelized branch from e2dacee to 5519e6b Compare January 21, 2025 10:57
Copy link
Member

@kariy kariy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This benchmark includes data setup within the measured function (ie Uncommitted::commit). We should move the preparation of the test data (all the arguments for Uncommitted::new) outside the benchmark routine to ensure we're measuring only the commit operation's performance, not including the setup overhead.

We can use the Arbitrary trait to derive random values for the test data.

I'm not sure what's the best block composition but maybe we can start with something like 20 transactions, each with 2 events, and a state update of size 100.

@kariy
Copy link
Member

kariy commented Jan 21, 2025

I think what we should do, is have two .commit methods - the serialized and parallelized versions (ie .commit and .commit_parallel) - run two benchmarks against the same test vectors using each method.

@remybar remybar force-pushed the katana-core-parallelized branch from 5519e6b to 3728e6d Compare January 24, 2025 16:03
@remybar
Copy link
Contributor Author

remybar commented Jan 24, 2025

Hi @kariy !

So, I've kept 2 versions of optimized functions (original + parallel one) to be able to create commit and commit_parallel benches.

I used Arbitrary trait to generate random values (not sure if I used the idiomatic way to do this but ...) and set vector sizes to 20 for txs and receipts and 100 for state update stuff.

Maybe it would be a good idea to have 2 sets of data for these benches:

  • a first one with a tiny block to be sure that parallelizing block processings do not add any overhead,
  • a second one with a quite big block, as you proposed, to check that the performance improvement is worth it.

WDYT ?

@remybar remybar force-pushed the katana-core-parallelized branch 2 times, most recently from e9aaea6 to 347b9b0 Compare January 24, 2025 16:11
Copy link

codecov bot commented Jan 24, 2025

Codecov Report

Attention: Patch coverage is 1.44928% with 68 lines in your changes missing coverage. Please review.

Project coverage is 57.04%. Comparing base (d11a7ec) to head (ad0f6a6).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
crates/katana/core/src/backend/mod.rs 1.51% 65 Missing ⚠️
crates/katana/primitives/src/block.rs 0.00% 1 Missing ⚠️
crates/katana/primitives/src/receipt.rs 0.00% 1 Missing ⚠️
crates/katana/primitives/src/transaction.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2900      +/-   ##
==========================================
- Coverage   57.10%   57.04%   -0.07%     
==========================================
  Files         429      429              
  Lines       56868    56939      +71     
==========================================
+ Hits        32475    32481       +6     
- Misses      24393    24458      +65     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@kariy kariy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work @remybar! It's looking good. Can we also include benchmark for block with only 1 tx and very small state updates - to mimic a small block (eg on instant mining when there's only 1 tx in a block)

crates/katana/core/benches/commit.rs Outdated Show resolved Hide resolved
crates/katana/core/benches/commit.rs Outdated Show resolved Hide resolved
crates/katana/core/benches/commit.rs Outdated Show resolved Hide resolved
@remybar remybar force-pushed the katana-core-parallelized branch from 347b9b0 to 82a0abb Compare February 7, 2025 10:42
@remybar
Copy link
Contributor Author

remybar commented Feb 7, 2025

Awesome work @remybar! It's looking good. Can we also include benchmark for block with only 1 tx and very small state updates - to mimic a small block (eg on instant mining when there's only 1 tx in a block)

Oups, I didn't receive any notification for your message 😕
Sorry for the delay, I've updated this PR with your suggestions + a small block benchmark 👍

@remybar remybar marked this pull request as ready for review February 7, 2025 11:04
Copy link

coderabbitai bot commented Feb 7, 2025

Ohayo sensei!

Walkthrough

This pull request updates dependency management and introduces benchmarking and parallel processing enhancements. It adds and modifies Cargo.toml entries across multiple packages by including the pprof, rayon, and criterion crates, and updates the katana-primitives dependency with new features. A new benchmarking module is added to measure block commit performance, while the backend logic now supports parallel commit methods using rayon for concurrent processing of events and receipts.

Changes

File(s) Change Summary
Cargo.toml (root), crates/katana/.../Cargo.toml, crates/katana/executor/Cargo.toml Added/updated dependencies: pprof (version/features → workspace), rayon (workspace), criterion (dev-dependency), and updated katana-primitives (added "arbitrary" feature). Additionally, a new benchmark entry was introduced in katana/core.
crates/katana/.../benches/commit.rs Added a new benchmarking module including BlockConfig, functions for commit and parallel commit (commit, commit_parallel, build_block, commit_benchmark) as well as random data generator helpers.
crates/katana/.../mod.rs Introduced parallel processing methods: commit_parallel, compute_receipt_commitment_parallel, and compute_event_commitment_parallel, leveraging rayon for concurrent execution.
crates/katana/primitives/src/block.rs Added new struct PartialHeader to encapsulate block header fields with serialization support.
crates/katana/primitives/src/receipt.rs Introduced new struct ReceiptWithTxHash for associating transaction hashes with receipts and added methods for hash computation.
crates/katana/primitives/src/transaction.rs Added new struct TxWithHash for encapsulating transactions with their hashes and implemented conversions from ExecutableTxWithHash.

Sequence Diagram(s)

sequenceDiagram
    participant Bench as Benchmark Suite
    participant Block as UncommittedBlock
    participant Backend as Backend Processing

    Bench->>Block: build_block(config)
    Bench->>Block: commit() / commit_parallel()
    alt Serial Commit
        Block->>Backend: commit()
    else Parallel Commit
        Block->>Backend: commit_parallel()
        Backend->>Backend: compute_event_commitment_parallel()
        Backend->>Backend: compute_receipt_commitment_parallel()
    end
Loading

Possibly related PRs

  • feat: katana config file #2668: The changes in the main PR, which involve adding the pprof dependency to the Cargo.toml file, are related to the retrieved PR, as both involve modifications to the Cargo.toml file and include the pprof dependency, albeit with different configurations.
  • feat(katana): compute block commitments #2609: The changes in the main PR, which involve adding the pprof dependency to Cargo.toml, are related to the retrieved PR as both involve modifications to the Cargo.toml files in the katana project, specifically concerning the addition of dependencies for profiling and performance measurement.

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
crates/katana/core/src/backend/mod.rs (2)

367-367: Ohayo sensei! Clarify this comment or remove if it’s unneeded.
“optimisation 1” can be more descriptive. Consider including why this optimization is important.


484-503: Ohayo sensei! Great use of par_bridge for events.
Spawning parallel tasks like this can handle many events effectively. Just be mindful of overhead if event counts are small.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d11a7ec and 046ecc6.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • Cargo.toml (1 hunks)
  • crates/katana/core/Cargo.toml (3 hunks)
  • crates/katana/core/benches/commit.rs (1 hunks)
  • crates/katana/core/src/backend/mod.rs (5 hunks)
  • crates/katana/executor/Cargo.toml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: ensure-windows
  • GitHub Check: test
🔇 Additional comments (14)
crates/katana/core/src/backend/mod.rs (3)

24-24: Ohayo sensei! Nice addition of the rayon prelude.
This import sets the stage for parallel iterators, which supercharges concurrency.


398-441: Ohayo sensei! Bravo on the parallel commit method.
Using rayon’s scope with carefully separated variables is safe here. Each closure writes to a distinct variable, eliminating data races. If you expect many large blocks, this concurrency should boost throughput.


453-457: Ohayo sensei! Parallel receipt commitment looks good.
Parallelizing over receipts can reduce total computation time significantly.

crates/katana/core/benches/commit.rs (4)

16-24: Ohayo sensei! BlockConfig is a neat approach.
Using a small struct to bundle block parameters keeps things organized and testable.


46-52: Ohayo sensei! Good separation of serial vs. parallel commit.
Explicitly having two methods (commit vs. commit_parallel) makes benchmark comparisons more transparent.


94-128: Ohayo sensei! Great job building random transactions and receipts.
This is a succinct way to generate test data for the benchmarks. Remember to watch out for any potential fuzz issues if the random values might cause edge cases.


130-195: Ohayo sensei! Solid benchmark definitions for small and big blocks.
These distinct setups will help confirm that parallel processing overhead remains minimal for small inputs, while yielding performance gains for large ones.

crates/katana/executor/Cargo.toml (1)

39-40: Ohayo sensei! Transitioning pprof and rayon to workspace dependencies.
This unifies version management, simplifies maintenance, and helps ensure consistency across crates. Nicely done!

crates/katana/core/Cargo.toml (5)

14-14: Ohayo sensei: Enhanced Dependency for katana-primitives
Including the "arbitrary" feature in the katana-primitives dependency is a solid move to support random value generation in benchmarks and tests. Just double-check its compatibility with other components relying on this dependency.


29-29: Ohayo sensei: Introducing Rayon for Parallelism
Adding rayon.workspace = true is a welcome enhancement to enable parallel processing. This change directly supports the new parallel commit implementations and should help meet performance goals.


52-52: Ohayo sensei: Criterion for Benchmarking
The new criterion.workspace = true dev-dependency is an excellent addition for setting up performance benchmarks. It will help you precisely measure the benefits of parallelized processing.


57-57: Ohayo sensei: pprof for Profiling
Integrating pprof.workspace = true as a development tool supports detailed performance profiling and work in tandem with Criterion—and that's pretty neat!


62-65: Ohayo sensei: Benchmark Configuration Added
The new benchmark definition with name = "commit" and harness = false is well configured and aligns with the goal of comparing sequential versus parallelized commits. Ensure that the corresponding benchmark implementation in crates/katana/core/benches/commit.rs correctly exercises both commit methods.

Cargo.toml (1)

261-263: Ohayo sensei: pprof Dependency for Profiling in Root Cargo.toml
The introduction of pprof = { version = "0.13.0", features = [ "criterion", "flamegraph" ] } in the root Cargo.toml reinforces profiling support across the project. Please verify that this version meshes well with Criterion and your overall benchmarking setup.

Copy link
Member

@kariy kariy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took the liberty to group the benchmarks based on the block size. Other than that, everything looks good! Thanks

@kariy
Copy link
Member

kariy commented Feb 7, 2025

Screenshot 2025-02-07 at 12 33 37 PM

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
crates/katana/primitives/src/block.rs (1)

45-56: Ohayo sensei! “PartialHeader” is well structured.
This design fosters flexibility in block creation, especially for partial or intermediate states. Keep an eye on consistent usage across the codebase to avoid confusion with “Header.”

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 046ecc6 and ad0f6a6.

📒 Files selected for processing (4)
  • crates/katana/core/benches/commit.rs (1 hunks)
  • crates/katana/primitives/src/block.rs (1 hunks)
  • crates/katana/primitives/src/receipt.rs (1 hunks)
  • crates/katana/primitives/src/transaction.rs (1 hunks)
🔇 Additional comments (8)
crates/katana/core/benches/commit.rs (4)

45-51: Ohayo sensei! Great introduction of parallel commits.
It’s awesome to see parallelization integrated for performance gains. Ensure that internal concurrency usage in "commit_parallel()" handles potential data races or synchronization overhead.

Would you like deeper assistance analyzing concurrency aspects within the broader codebase? I can craft scripts that search for lock usage or concurrency patterns that may need more robust synchronization.


133-159: Ohayo sensei! Consider more descriptive benchmark names.
Currently, "Serial" and "Parallel" are used, but it might be beneficial to mirror the benchmark group’s naming for clarity, e.g., "Commit.Small.Serial" and "Commit.Small.Parallel."

-c.bench_function("Serial", |b| {
+c.bench_function("Commit.Small.Serial", |b| {
-c.bench_function("Parallel", |b| {
+c.bench_function("Commit.Small.Parallel", |b| {

161-185: Ohayo sensei! Mirror naming for consistency on big-block benchmarks too.
Likewise, standardizing names to "Commit.Big.Serial" and "Commit.Big.Parallel" better differentiates each scenario and aligns them with the group label.

-c.bench_function("Serial", |b| {
+c.bench_function("Commit.Big.Serial", |b| {
-c.bench_function("Parallel", |b| {
+c.bench_function("Commit.Big.Parallel", |b| {

192-196: Ohayo sensei! This matches a past suggestion exactly.
The Criterion group configuration with PProf looks great and has already been suggested by Kariy previously. Nicely done!

crates/katana/primitives/src/receipt.rs (2)

180-190: Ohayo sensei! Ingenious extension with “ReceiptWithTxHash.”
Attaching the transaction hash to the receipt makes debugging and tracking easier. Good job applying both Deref and AsRef—this seamlessly wraps underlying receipt logic.


203-246: Ohayo sensei! Double-check revert reason hashing.
The revert reason is hashed using “starknet_keccak,” but ensure no hidden edge cases with unusual characters or empty strings.

crates/katana/primitives/src/transaction.rs (2)

654-664: Ohayo! Clean and well-structured implementation, sensei!

The TxWithHash struct is well-designed with appropriate trait derivations and documentation.


666-676: LGTM! Excellent handling of ownership semantics, sensei!

Both From implementations are clean and efficient:

  • One for owned ExecutableTxWithHash
  • One for referenced ExecutableTxWithHash

This provides flexibility while maintaining performance.

@kariy
Copy link
Member

kariy commented Feb 7, 2025

For now, let's keep using the serial version in the main flow. Assuming most usages of katana right now is using the instant mining mode (which mean there's only gonna be 1 tx each block) computing the commitment in parallel wouldn't yield that much performance gain.

@kariy kariy merged commit 022d71a into dojoengine:main Feb 7, 2025
13 of 15 checks passed
@remybar
Copy link
Contributor Author

remybar commented Feb 7, 2025

I took the liberty to group the benchmarks based on the block size. Other than that, everything looks good! Thanks

Yes, no problem 👍 Thanks 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants