diff --git a/docs/docs/icicle/primitives/merkle.md b/docs/docs/icicle/primitives/merkle.md index 62afd87854..de376bb1f9 100644 --- a/docs/docs/icicle/primitives/merkle.md +++ b/docs/docs/icicle/primitives/merkle.md @@ -7,17 +7,17 @@ A **Merkle tree** is a cryptographic data structure that allows for **efficient - **Leaf nodes**, each containing a piece of data. - **Internal nodes**, which store the **hashes of their child nodes**, leading up to the **root node** (the cryptographic commitment). -With ICICLE, you have the **flexibility** to build various tree topologies based on your needs. The user must define: - -1. **Hasher per layer** ([Link to Hasher API](./hash.md)) with a **default input size**. -2. **Size of a leaf element** (in bytes): This defines the **granularity** of the data used for opening proofs. - --- ## Tree Structure and Configuration ### Structure Definition +With ICICLE, you have the **flexibility** to build various tree topologies based on your needs. A tree is defined by: + +1. **Hasher per layer** ([Link to Hasher API](./hash.md)) with a **default input size**. +2. **Size of a leaf element** (in bytes): This defines the **granularity** of the data used for opening proofs. + The **root node** is assumed to be a single node. The **height of the tree** is determined by the **number of layers**. Each layer's **arity** is calculated as: @@ -31,6 +31,11 @@ $$ {arity}_0 = \frac{layers[0].inputSize}{leafSize} $$ +:::note +Each layer has a shrinking-factor defined by $\frac{layer.outputSize}{layer.inputSize}$. +This factor is used to compute the input size, assuming a single root node. +::: + --- ### Defining a Merkle Tree @@ -47,6 +52,10 @@ static MerkleTree create( ### Building the Tree +The Merkle tree can be constructed from input data of any type, allowing flexibility in its usage. The size of the input must align with the tree structure defined by the hash layers and leaf size. 
If the input size does not match the expected size, padding may be applied. + +Refer to the Padding Section for more details on how mismatched input sizes are handled. + ```cpp // icicle/merkle/merkle_tree.h inline eIcicleError build( @@ -77,12 +86,12 @@ const uint64_t leaf_size = 1024; const uint32_t max_input_size = leaf_size * 16; auto input = std::make_unique(max_input_size / sizeof(uint64_t)); -// Define hasher -auto layer0_hasher = Keccak256::create(leaf_size); // hash 1KB -> 32B -auto next_layer_hasher = Keccak256::create(2 * layer0_hasher.output_size()); // hash every 64B to 32B +// Define hashes +auto hash = Keccak256::create(leaf_size); // hash 1KB -> 32B +auto compress = Keccak256::create(2 * hash.output_size()); // hash every 64B to 32B // Construct the tree using the layer hashes and leaf-size -std::vector hashes = {layer0_hasher, next_layer_hasher, next_layer_hasher, next_layer_hasher, next_layer_hasher}; +std::vector hashes = {hash, compress, compress, compress, compress}; auto merkle_tree = MerkleTree::create(hashes, leaf_size); // compute the tree @@ -105,10 +114,10 @@ const uint32_t max_input_size = leaf_size * 16; auto input = std::make_unique(max_input_size / sizeof(uint64_t)); // note here we use Blake2S for the upper layer -auto layer0_hasher = Keccak256::create(leaf_size); -auto next_layer_hasher = Blake2s::create(4 * layer0_hasher.output_size()); +auto hash = Keccak256::create(leaf_size); +auto compress = Blake2s::create(4 * hash.output_size()); -std::vector hashes = {layer0_hasher, next_layer_hasher}; +std::vector hashes = {hash, compress, compress}; auto merkle_tree = MerkleTree::create(hashes, leaf_size); merkle_tree.build(input.get(), max_input_size / sizeof(uint64_t), default_merkle_tree_config()); @@ -142,6 +151,8 @@ merkle_tree.build(input.get(), max_input_size / sizeof(uint64_t), config); ## Root as Commitment +Retrieve the Merkle-root and serialize. 
+ ```cpp /** * @brief Returns a pair containing the pointer to the root (ON HOST) data and its size. @@ -150,16 +161,27 @@ merkle_tree.build(input.get(), max_input_size / sizeof(uint64_t), config); inline std::pair get_merkle_root() const; auto [commitment, size] = merkle_tree.get_merkle_root(); +serialize_commitment_application_code(...); ``` :::note The commitment can be serialized to the proof. ::: + --- ## Generating Merkle Proofs -### Definition +Merkle proofs are used to **prove the integrity of opened leaves** in a Merkle tree. A proof ensures that a specific leaf belongs to the committed data by enabling the verifier to reconstruct the **root hash (commitment)**. + +A Merkle proof contains: + +- **Leaf**: The data being verified. +- **Index** (leaf_idx): The position of the leaf in the original dataset. +- **Path**: A sequence of sibling hashes (tree nodes) needed to recompute the path from the leaf to the root. + +![Merkle Pruned Path Diagram](./merkle_diagrams/diagram1_path.png) + ```cpp // icicle/merkle/merkle_proof.h @@ -198,36 +220,32 @@ The Merkle-path can be serialized to the proof along the leaf. * @param merkle_proof The MerkleProof object includes the leaf, path, and the root. * @param valid output valid bit. True if the proof is valid, false otherwise. */ -eIcicleError verify(const MerkleProof& merkle_proof, bool& valid) const; +eIcicleError verify(const MerkleProof& merkle_proof, bool& valid) const; +``` +### Example: Verifying a Proof + +```cpp bool valid = false; auto err = merkle_tree.verify(proof, valid); ``` --- -## Pruned vs. Full Paths +## Pruned vs. Full Merkle-paths A **Merkle path** is a collection of **sibling hashes** that allows the verifier to **reconstruct the root hash** from a specific leaf. This enables anyone with the **path and root** to verify that the **leaf** belongs to the committed dataset. There are two types of paths that can be computed: -- **Pruned Path:** Contains only necessary sibling hashes. 
- -![Merkle Pruned Phat Diagram](./merkle_diagrams/diagram1_path.png) - -```cpp -MerkleProof proof{}; -auto err = merkle_tree.get_merkle_proof( - input.get(), - max_input_size / sizeof(uint64_t), - 3 /*leaf-idx*/, true /*=pruned*/, // --> note the pruned flag here - default_merkle_tree_config(), proof); -``` +- [**Pruned Path:**](#generating-merkle-proofs) Contains only necessary sibling hashes. - **Full Path:** Contains all sibling nodes and intermediate hashes. + ![Merkle Full Path Diagram](./merkle_diagrams/diagram1_path_full.png) +To compute a full path, specify `pruned=false`: + ```cpp MerkleProof proof{}; auto err = merkle_tree.get_merkle_proof( diff --git a/docs/docs/icicle/primitives/merkle_diagrams/diagram1.gv b/docs/docs/icicle/primitives/merkle_diagrams/diagram1.gv index 3e5d54686b..f16e419ff7 100644 --- a/docs/docs/icicle/primitives/merkle_diagrams/diagram1.gv +++ b/docs/docs/icicle/primitives/merkle_diagrams/diagram1.gv @@ -3,17 +3,17 @@ digraph MerkleTree { node [shape = circle; style = filled; color = lightblue; fontname = "Helvetica"; fontsize = 10;]; // Root node - Root [label = "Root\n (Commitment)";]; + Root [label = "Root\n (Commitment) | Keccak256";]; - L1_0 [label = "";]; + L1_0 [label = "Keccak256";]; L1_1 [label = "";]; - L2_0 [label = "";]; + L2_0 [label = "Keccak256";]; L2_1 [label = "";]; L2_2 [label = "";]; L2_3 [label = "";]; - L3_0 [label = "";]; + L3_0 [label = "Keccak256";]; L3_1 [label = "";]; L3_2 [label = "";]; L3_3 [label = "";]; @@ -22,7 +22,7 @@ digraph MerkleTree { L3_6 [label = "";]; L3_7 [label = "";]; - L4_0 [label = "";]; + L4_0 [label = "Keccak256";]; L4_1 [label = "";]; L4_2 [label = "";]; L4_3 [label = "";]; diff --git a/docs/docs/icicle/primitives/merkle_diagrams/diagram1.png b/docs/docs/icicle/primitives/merkle_diagrams/diagram1.png index 75976f0aac..7e20187bb9 100644 Binary files a/docs/docs/icicle/primitives/merkle_diagrams/diagram1.png and b/docs/docs/icicle/primitives/merkle_diagrams/diagram1.png differ diff --git 
a/docs/docs/icicle/primitives/merkle_diagrams/diagram2.gv b/docs/docs/icicle/primitives/merkle_diagrams/diagram2.gv index 3e67e370a6..2ffd9d18c3 100644 --- a/docs/docs/icicle/primitives/merkle_diagrams/diagram2.gv +++ b/docs/docs/icicle/primitives/merkle_diagrams/diagram2.gv @@ -3,13 +3,30 @@ digraph MerkleTree { node [shape = circle; style = filled; color = lightblue; fontname = "Helvetica"; fontsize = 10;]; // Root node - Root [label = "Root\n (Commitment)";]; + Root [label = "Root\n (Commitment) | Blake2s";]; - L1_0 [label = "";]; + L1_0 [label = "Blake2s";]; L1_1 [label = "";]; L1_2 [label = "";]; L1_3 [label = "";]; + L2_0 [label = "Keccak256";]; + L2_1 [label = "";]; + L2_2 [label = "";]; + L2_3 [label = "";]; + L2_4 [label = "";]; + L2_5 [label = "";]; + L2_6 [label = "";]; + L2_7 [label = "";]; + L2_8 [label = "";]; + L2_9 [label = "";]; + L2_10 [label = "";]; + L2_11 [label = "";]; + L2_12 [label = "";]; + L2_13 [label = "";]; + L2_14 [label = "";]; + L2_15 [label = "";]; + node [style = filled; fillcolor = lightgreen; shape = rect;]; Leaf_0 [label = "Leaf-0";]; Leaf_1 [label = "Leaf-1";]; @@ -35,24 +52,38 @@ digraph MerkleTree { Root -> L1_2; Root -> L1_3; + L1_0 -> L2_0; + L1_0 -> L2_1; + L1_0 -> L2_2; + L1_0 -> L2_3; + L1_1 -> L2_4; + L1_1 -> L2_5; + L1_1 -> L2_6; + L1_1 -> L2_7; + L1_2 -> L2_8; + L1_2 -> L2_9; + L1_2 -> L2_10; + L1_2 -> L2_11; + L1_3 -> L2_12; + L1_3 -> L2_13; + L1_3 -> L2_14; + L1_3 -> L2_15; + // Connections - L1_0 -> Leaf_0; - L1_0 -> Leaf_1; - L1_0 -> Leaf_2; - L1_0 -> Leaf_3; - - L1_1 -> Leaf_4; - L1_1 -> Leaf_5; - L1_1 -> Leaf_6; - L1_1 -> Leaf_7; - - L1_2 -> Leaf_8; - L1_2 -> Leaf_9; - L1_2 -> Leaf_10; - L1_2 -> Leaf_11; - - L1_3 -> Leaf_12; - L1_3 -> Leaf_13; - L1_3 -> Leaf_14; - L1_3 -> Leaf_15; + L2_0 -> Leaf_0; + L2_1 -> Leaf_1; + L2_2 -> Leaf_2; + L2_3 -> Leaf_3; + L2_4 -> Leaf_4; + L2_5 -> Leaf_5; + L2_6 -> Leaf_6; + L2_7 -> Leaf_7; + L2_8 -> Leaf_8; + L2_9 -> Leaf_9; + L2_10 -> Leaf_10; + L2_11 -> Leaf_11; + L2_12 -> 
Leaf_12; + L2_13 -> Leaf_13; + L2_14 -> Leaf_14; + L2_15 -> Leaf_15; } \ No newline at end of file diff --git a/docs/docs/icicle/primitives/merkle_diagrams/diagram2.png b/docs/docs/icicle/primitives/merkle_diagrams/diagram2.png index 437cdada0b..53eb3a9104 100644 Binary files a/docs/docs/icicle/primitives/merkle_diagrams/diagram2.png and b/docs/docs/icicle/primitives/merkle_diagrams/diagram2.png differ diff --git a/docs/docs/icicle/rust-bindings/merkle.md b/docs/docs/icicle/rust-bindings/merkle.md index f87f5c14cb..2de2595cd0 100644 --- a/docs/docs/icicle/rust-bindings/merkle.md +++ b/docs/docs/icicle/rust-bindings/merkle.md @@ -1 +1,306 @@ -# TODO \ No newline at end of file + +# Merkle Tree API Documentation (Rust) + +This is the Rust version of the **Merkle Tree API Documentation** ([C++ documentation](../primitives/merkle.md)). It mirrors the structure and functionality of the C++ version, providing equivalent APIs in Rust. + +--- + +## What is a Merkle Tree? + +A **Merkle tree** is a cryptographic data structure that allows for **efficient verification of data integrity**. It consists of: +- **Leaf nodes**, each containing a piece of data. +- **Internal nodes**, which store the **hashes of their child nodes**, leading up to the **root node** (the cryptographic commitment). + +--- + +## Tree Structure and Configuration in Rust + +### Defining a Merkle Tree + +With ICICLE, you have the **flexibility** to build various tree topologies based on your needs. A tree is defined by: + +1. **Hasher per layer** ([Link to Hasher API](../rust-bindings/hash.md)) with a **default input size**. +2. **Size of a leaf element** (in bytes): This defines the **granularity** of the data used for opening proofs. + +The **root node** is assumed to be a single node. The **height of the tree** is determined by the **number of layers**. 
+Each layer's **arity** is calculated as: + +$$ +{arity}_i = \frac{layers[i].inputSize}{layers[i-1].outputSize} +$$ + +For **layer 0**: + +$$ +{arity}_0 = \frac{layers[0].inputSize}{leafSize} +$$ + +:::note +Each layer has a shrinking-factor defined by $\frac{layer.outputSize}{layer.inputSize}$. +This factor is used to compute the input size, assuming a single root node. +::: + + +```rust +struct MerkleTree{ + /// * `layer_hashes` - A vector of hash objects representing the hashes of each layer. + /// * `leaf_element_size` - Size of each leaf element. + /// * `output_store_min_layer` - Minimum layer at which the output is stored. + /// + /// # Returns a new `MerkleTree` instance or eIcicleError. + pub fn new( + layer_hashes: &[&Hasher], + leaf_element_size: u64, + output_store_min_layer: u64, + ) -> Result<Self, eIcicleError>; +} +``` + +--- + +### Building the Tree + +The Merkle tree can be constructed from input data of any type, allowing flexibility in its usage. The size of the input must align with the tree structure defined by the hash layers and leaf size. If the input size does not match the expected size, padding may be applied. + +Refer to the Padding Section for more details on how mismatched input sizes are handled. + +```rust +struct MerkleTree{ + /// * `leaves` - A slice of leaves (input data). + /// * `config` - Configuration for the Merkle tree. + /// + /// # Returns a result indicating success or failure. 
+ pub fn build( + &self, + leaves: &(impl HostOrDeviceSlice + ?Sized), + cfg: &MerkleTreeConfig, + ) -> Result<(), eIcicleError>; +} +``` + +--- + +## Tree Examples in Rust + +### Example A: Binary Tree + +A binary tree with **5 layers**, using **Keccak-256**: + +![Merkle Tree Diagram](../primitives/merkle_diagrams/diagram1.png) + +```rust +use icicle_core::{ + hash::{HashConfig, Hasher}, + merkle::{MerkleTree, MerkleTreeConfig}, +}; +use icicle_hash::keccak::Keccak256; +use icicle_runtime::memory::HostSlice; + +let leaf_size = 1024_u64; +let max_input_size = leaf_size as usize * 16; +let input: Vec<u8> = vec![0; max_input_size]; + +// define layer hashes +// we want one hash layer to hash every 1KB to 32B then compress every 64B so 4 more binary layers +let hash = Keccak256::new(leaf_size).unwrap(); +let compress = Keccak256::new(2 * hash.output_size()).unwrap(); +let _layer_hashes = vec![&hash, &compress, &compress, &compress, &compress]; +// or, equivalently, build the same list with iterators +let layer_hashes: Vec<&Hasher> = std::iter::once(&hash) + .chain(std::iter::repeat(&compress).take(4)) + .collect(); + +let merkle_tree = MerkleTree::new(&layer_hashes, leaf_size, 0 /*min layer to store */).unwrap(); + +// compute the tree +merkle_tree + .build(HostSlice::from_slice(&input), &MerkleTreeConfig::default()) + .unwrap(); +``` + +--- + +### Example B: Tree with Arity 4 + +![Merkle Tree Diagram](../primitives/merkle_diagrams/diagram2.png) + +This example uses **Blake2s** in upper layers: + +```rust +use icicle_hash::blake2s::Blake2s; + +// define layer hashes +// we want one hash layer to hash every 1KB to 32B then compress every 128B so only 2 more layers +let hash = Keccak256::new(leaf_size).unwrap(); +let compress = Blake2s::new(4 * hash.output_size()).unwrap(); +let layer_hashes = vec![&hash, &compress, &compress]; + +let merkle_tree = MerkleTree::new(&layer_hashes, leaf_size, 0 /*min layer to store */).unwrap(); + +merkle_tree + .build(HostSlice::from_slice(&input), &MerkleTreeConfig::default()) + 
.unwrap(); +``` + +--- + +## Padding + +:::note +This feature is not yet supported in **v3.1** and will be available in **v3.2**. +::: + +When the input for **layer 0** is smaller than expected, ICICLE can apply **padding** to align the data. + +**Padding Schemes:** +1. **Zero padding:** Adds zeroes to the remaining space. +2. **Repeat last leaf:** The final leaf element is repeated to fill the remaining space. + +```rust +// pub enum PaddingPolicy { +// None, // No padding, assume input is correctly sized. +// ZeroPadding, // Pad the input with zeroes to fit the expected input size. +// LastValue, // Pad the input by repeating the last value. +// } + +use icicle_core::merkle::PaddingPolicy; +let mut config = MerkleTreeConfig::default(); +config.padding_policy = PaddingPolicy::ZeroPadding; +merkle_tree + .build(HostSlice::from_slice(&input), &config) + .unwrap(); +``` + +--- + +## Root as Commitment + +Retrieve the Merkle-root and serialize. + +```rust +struct MerkleTree{ + /// Retrieve the root of the Merkle tree. + /// + /// # Returns + /// A reference to the root hash. + pub fn get_root(&self) -> Result<&[T], eIcicleError>; +} + +let commitment: &[u8] = merkle_tree + .get_root() + .unwrap(); +println!("Commitment: {:?}", commitment); +``` + +--- + +## Generating Merkle Proofs + +Merkle proofs are used to **prove the integrity of opened leaves** in a Merkle tree. A proof ensures that a specific leaf belongs to the committed data by enabling the verifier to reconstruct the **root hash (commitment)**. + +A Merkle proof contains: + +- **Leaf**: The data being verified. +- **Index** (leaf_idx): The position of the leaf in the original dataset. +- **Path**: A sequence of sibling hashes (tree nodes) needed to recompute the path from the leaf to the root. + +![Merkle Pruned Path Diagram](../primitives/merkle_diagrams/diagram1_path.png) + +```rust +struct MerkleTree{ + /// * `leaves` - A slice of leaves (input data). 
+ /// * `leaf_idx` - Index of the leaf to generate a proof for. + /// * `pruned_path` - Whether the proof should be pruned. + /// * `config` - Configuration for the Merkle tree. + /// + /// # Returns a `MerkleProof` object or eIcicleError + pub fn get_proof( + &self, + leaves: &(impl HostOrDeviceSlice + ?Sized), + leaf_idx: u64, + pruned_path: bool, + config: &MerkleTreeConfig, + ) -> Result<MerkleProof, eIcicleError>; +} +``` + +### Example: Generating a Proof + +Generating a proof for leaf idx 5: + +```rust +let merkle_proof = merkle_tree + .get_proof( + HostSlice::from_slice(&input), + 5, /*=leaf-idx*/ + true, /*pruned*/ + &MerkleTreeConfig::default(), + ) + .unwrap(); +``` + +:::note +The Merkle-path can be serialized to the proof alongside the leaf. +::: + +--- + +## Verifying Merkle Proofs + +```rust +struct MerkleTree{ + /// * `proof` - The Merkle proof to verify. + /// + /// # Returns a result indicating whether the proof is valid. + pub fn verify(&self, proof: &MerkleProof) -> Result<bool, eIcicleError>; +} +``` + +### Example: Verifying a Proof + +```rust +let valid = merkle_tree + .verify(&merkle_proof) + .unwrap(); +assert!(valid); +``` + +--- + +## Pruned vs. Full Merkle-paths + +A **Merkle path** is a collection of **sibling hashes** that allows the verifier to **reconstruct the root hash** from a specific leaf. +This enables anyone with the **path and root** to verify that the **leaf** belongs to the committed dataset. +There are two types of paths that can be computed: + +- [**Pruned Path:**](#generating-merkle-proofs) Contains only necessary sibling hashes. +- **Full Path:** Contains all sibling nodes and intermediate hashes. 
+ +![Merkle Full Path Diagram](../primitives/merkle_diagrams/diagram1_path_full.png) + +To compute a full path, pass `pruned_path = false`: + +```rust +let merkle_proof = merkle_tree + .get_proof( + HostSlice::from_slice(&input), + 5, /*=leaf-idx*/ + false, /*non-pruned is a full path --> note the pruned flag here*/ + &MerkleTreeConfig::default(), + ) + .unwrap(); +``` + +--- + +## Handling Partial Tree Storage + +In cases where the **Merkle tree is large**, only the **top layers** may be stored to conserve memory. +When opening leaves, the **first layers** (closest to the leaves) are **recomputed dynamically**. + +For example, to avoid storing the first layer, we can define the tree as follows: + + +```rust +let merkle_tree = MerkleTree::new(&layer_hashes, leaf_size, 1 /*min layer to store*/).unwrap(); +```
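
To get a feel for what partial storage saves, the sketch below estimates how many bytes of node data a tree keeps for a given minimum stored layer. It is a std-only illustration, not ICICLE code: `LayerSpec`, `hashes_per_layer`, and `stored_bytes` are hypothetical names, and it assumes that every layer whose index is at or above the minimum stored layer is kept in full. The per-layer counts follow the arity formula from the documentation, working down from a single root.

```rust
/// Hypothetical description of one layer: bytes consumed and produced per hash call.
/// (Illustration only; this is not an ICICLE type.)
#[derive(Clone, Copy)]
struct LayerSpec {
    input_size: u64,
    output_size: u64,
}

/// Number of hash invocations in each layer, assuming a single root node.
/// Index 0 is the layer closest to the leaves.
fn hashes_per_layer(layers: &[LayerSpec]) -> Vec<u64> {
    let mut counts = vec![0u64; layers.len()];
    let mut n = 1u64; // the top layer produces the single root
    for i in (0..layers.len()).rev() {
        counts[i] = n;
        if i > 0 {
            // arity_i = layers[i].input_size / layers[i-1].output_size
            n *= layers[i].input_size / layers[i - 1].output_size;
        }
    }
    counts
}

/// Bytes of node data kept when only layers >= `min_layer` are stored
/// (assumed semantics of the "min layer to store" argument).
fn stored_bytes(layers: &[LayerSpec], min_layer: usize) -> u64 {
    hashes_per_layer(layers)
        .into_iter()
        .zip(layers)
        .skip(min_layer)
        .map(|(n, layer)| n * layer.output_size)
        .sum()
}

fn main() {
    // Sizes taken from Example A: one 1KB -> 32B layer, then four 64B -> 32B layers.
    let hash = LayerSpec { input_size: 1024, output_size: 32 };
    let compress = LayerSpec { input_size: 64, output_size: 32 };
    let layers = [hash, compress, compress, compress, compress];

    let counts = hashes_per_layer(&layers);
    println!("hashes per layer: {:?}", counts);
    println!("expected input bytes: {}", counts[0] * layers[0].input_size);
    println!("stored bytes, min layer 0: {}", stored_bytes(&layers, 0));
    println!("stored bytes, min layer 1: {}", stored_bytes(&layers, 1));
}
```

With the Example A sizes, this estimate gives 992 bytes when every layer is stored and 480 bytes when layer 0 is skipped. Under the stated assumption, dropping the layer closest to the leaves roughly halves the stored data for a binary tree.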