🖋️Merkle Proof Solidity Implementation

Book an audit with Zokyo

Blockchain developers often come across use cases where they need to confirm the presence of a particular transaction in a block, without having to handle the entire block's data. A classic example is within decentralized finance (DeFi) protocols, cross-chain bridges, Layer 2 solutions, and other DApps that require interaction with multiple blockchains or large datasets. It is in these scenarios that the use of Merkle proofs becomes vital.

Merkle proofs allow you to verify the existence of a specific transaction (leaf node) in a Merkle tree, without needing to know every transaction in the block. This property comes handy when you need to keep track of state across multiple chains, or when you need to scale your DApp to handle a larger number of transactions without increased cost or slower performance.

In this article, we'll take a closer look at how to implement a function for verifying Merkle proofs using the Solidity programming language. Below is the Solidity function that verifies a Merkle proof:

function verify(
    bytes32[] proof,
    bytes32 root,
    bytes32 leaf
  )
    internal
    pure
    returns (bool)
  {
    bytes32 computedHash = leaf;

    for (uint256 i = 0; i < proof.length; i++) {
      bytes32 proofElement = proof[i];

      if (computedHash < proofElement) {
        computedHash = keccak256(abi.encodePacked(computedHash, proofElement));
      } else {
        computedHash = keccak256(abi.encodePacked(proofElement, computedHash));
      }
    }

    return computedHash == root;
  }
}

The function, verify, accepts three parameters:

function verify(
    bytes32[] proof,
    bytes32 root,
    bytes32 leaf
  )

proof: An array of type bytes32, representing the sibling hashes on the branch from the leaf to the root of the Merkle tree.
root: A bytes32 hash representing the Merkle root of the tree.
leaf: A bytes32 hash representing the leaf node (the specific transaction) we're verifying.

The function's visibility is internal, meaning it can only be called from within the current contract or contracts that derive from it. The pure keyword indicates that the function will not alter any state variables and will not consume any gas apart from the execution cost. The function returns a boolean value, signifying the success or failure of the verification process.

The function operates by initializing a bytes32 variable, computedHash, to the value of the provided leaf. It then iterates through the provided proof, which is an array of hashes that constitute the Merkle proof.

In the context of our earlier example, we're aiming to validate that 'C' is indeed a legitimate leaf in the Merkle tree.

To do this, we provide a proof in the form of an array of type bytes32[] proof. This 'proof' array includes the sibling hashes of the path from the leaf to the root. In the case of verifying 'C', the array would be populated in the order of Hash(D), Hash(AB), and Hash(EFGH).

bytes32 leaf refers to the hash value of the specific leaf we are verifying, in this case, the hash of transaction 'C'.

Finally, bytes32 root is a representation of the root of the Merkle tree. It's the hash value that we'll use to compare against the result of our computed hash in order to validate the proof. In the context of our previous example, root would be the Hash(ABCDEFGH). Our verification function will compare this root hash with the final computed hash resulting from the proof verification process. If they match, the function will return true, indicating a successful Merkle proof verification.

bytes32 computedHash = leaf;

For each proofElement in the proof, the function computes a new hash that combines computedHash and the current proofElement. It takes into account the order of computedHash and proofElement in the tree when computing this new hash.

The Ethereum network, like many others, use the cryptographic hash function keccak256 (SHA-3 variant), and abi.encodePacked is used to concatenate computedHash and proofElement together. In other words, the function is essentially reconstructing the path from the leaf node up to the root of the Merkle tree.

if (computedHash < proofElement) {
  computedHash = keccak256(abi.encodePacked(computedHash, proofElement));
} else {
  computedHash = keccak256(abi.encodePacked(proofElement, computedHash));
}

Is crucial to maintaining the integrity of the original Merkle tree's structure. When the Merkle tree was first constructed, the sibling nodes were ordered based on their hash values. To ensure the verification process aligns with the original tree structure, the computation of new hashes needs to follow the same order.

That's why the comparison if (computedHash < proofElement) is performed. The order of the two values in the pairing matters when hashing because keccak256(abi.encodePacked(computedHash, proofElement)) will yield a different result from keccak256(abi.encodePacked(proofElement, computedHash)).

Here's a breakdown:

computedHash < proofElement: If the computed hash is less than the current proof element, the function will concatenate the computed hash and the proof element in that order before performing the hash function.
computedHash >= proofElement: If the computed hash is greater than or equal to the current proof element, the function will concatenate the proof element and the computed hash in that order before performing the hash function.

This ordering is essential because it ensures that the Merkle proof follows the same path as when the original Merkle tree was constructed. Without adhering to the correct order, the final computed root would not match the original Merkle root, and the proof would fail even if the transaction were valid.

The keccak256 function is a cryptographic hash function used by the Ethereum network, and abi.encodePacked is a function to concatenate two byte strings.

Note that concatenation is the process of appending one string to the end of another string. When you concatenate strings, you essentially link them together end-to-end to create a new, combined string. It's a fundamental operation in most programming languages for manipulating strings. abi.encodePacked is concatenating computedHash and proofElement into a single, contiguous byte array. This new byte array is then passed to the keccak256 function. For instance, if computedHash were "123" and proofElement were "456", abi.encodePacked(computedHash, proofElement) would result in "123456". So in summary this code is essentially reconstructing the path from the leaf node to the root of the Merkle tree, making sure to maintain the correct order of siblings in each pairing along the path. By keeping the order, the function ensures that it computes the same root hash as in the original Merkle tree, allowing for a valid comparison and successful proof verification.

Once the loop completes, computedHash should be equal to the root if the proof is valid. The function checks this by returning the result of computedHash == root. This will be true if the proof is valid, and false otherwise.

return computedHash == root;

It's worth noting that the given Solidity function verify is a pure, self-contained function, which means it doesn't read or write data from the blockchain's state. It simply computes a value based on the inputs it receives, and then returns that value. This is clear from the pure keyword in the function declaration.

function verify(
    bytes32[] proof,
    bytes32 root,
    bytes32 leaf
  )
    internal
    pure
    returns (bool)

This verify function returns a boolean value: true if the computed hash matches the provided root (meaning the proof is valid), and false otherwise. However, this returned value in itself doesn't have an effect on the blockchain state or any outside variables.

This means that it's the responsibility of the calling function, or the decentralized application (DApp) that utilizes this contract, to handle the inputs to this function and do something meaningful with the returned value.

For instance, a DApp might use this function as part of a larger operation like validating a transaction or verifying the integrity of a data set. If the verify function returns true, the DApp could proceed with the operation; if it returns false, the DApp could halt the operation and alert the user that the proof verification failed.

Ultimately, how this verify function is used—and how its return value is handled—depends on the specific requirements and logic of the DApp or smart contract that's using it.

PreviousMerkle Proofs Within Smart Contracts NextVulnerabilities When Using Merkle Trees

Last updated 1 year ago