Zokyo Auditing Tutorials
  • 🔐Zokyo Auditing Tutorials
  • 📚Tutorials
    • 🏃Tutorial 1: Front-Running
      • 🚀Prerequisites
      • 📘Understanding Front-Running
      • 👓Examples
      • ⚒️Mitigation Steps
      • 🏦Resource Bank to more front running examples
      • 🤝Front-Running Conclusion
    • 🧱Tutorial 2: Unsafe Casting
      • 🚀Prerequisites
      • 📘Understanding Casting
      • 👓Examples
      • 🤝Unsafe Casting Conclusion
    • 👍Tutorial 3: Approvals and Safe Approvals
      • 🚀Prerequisites
      • 📘Understanding Approvals
      • 👓Vulnerability Examples
        • 🔁ERC20 Approval Reset Requirement
        • 😴Ignoring Return Values from ERC20 approve() Function: Potential Miscount of Successful Approvals
        • 🚫Improper use of Open Zeppelins safeApprove() for Non-zero Allowance Increments
        • 🥾Omitted Approval for Contract Interactions Within a Protocol
        • 🤦‍♂️Failing to Reset Token Approvals in Case of Failed Transactions or other actions
        • 💭Miscellaneous
        • ERC20 Approve Race Condition Vulnerability
      • ⚒️Spot the Vulnerability
      • 🤝Approvals and Safe Approvals Conclusion
    • ⛓️Tutorial 4: Block.chainid, DOMAIN_SEPARATOR and EIP-2612 permit
      • 🚀Prerequisites
      • 📘Understanding Block.chainid and DOMAIN_SEPARATOR
      • 👓Examples
      • ⚒️General Mitigation Steps
      • 🤝Tutorial 4 Conclusion
  • 💰Tutorial 5: Fee-On-Transfer Tokens
    • 🚀Prerequisites
    • 📘Understanding Fee-On-Transfer
    • 👓Examples
    • 📘Links to more fee-on-transfer vulnerability examples
    • 🤝Fee-On-Transfer Tokens: Conclusion
  • 🌴Tutorial 6: Merkle Trees
    • 🚀Prerequisites
    • 📘Understanding Merkle Trees
    • 🔎Verification within a Merkle Tree:
    • 📜Merkle Proofs Within Smart Contracts
    • 🖋️Merkle Proof Solidity Implementation
    • 🛑Vulnerabilities When Using Merkle Trees
    • 💀Example Vulnerabilities
    • 🧠Exercise
    • 🤝Merkle Trees Conclusion
  • 🌳Tutorial 7: Merkle-Patricia Trees
    • 🚀Prerequisites
    • 📘Understanding Merkle-Patricia Trees
    • 📕Understanding Merkle-Patrica Trees pt.2
    • 🔎Verification within a Merkle-Patricia Tree
    • 🛑Vulnerabilities When Using Merkle-Patricia Trees
    • 💀Example Vulnerability
    • 🤝Merkle-Patricia Trees: Conclusion
  • 🔁Tutorial 8: Reentrancy
    • 🚀Prerequisites
    • 📘Understanding Reentrancy
    • ⚒️Mitigation
    • 💀The DAO Hack: An In-depth Examination
    • 👓Examples
    • 🏦Resource Bank To More Reentrancy Examples
    • 🤝Conclusion: Reflecting on the Reentrancy Vulnerability
  • 🔂Tutorial 9: Read-Only Reentrancy
    • 🚀Prerequisites
    • 📘Understanding Read-Only Reentrancy
    • 🔨Mitigating Read-Only Reentrancy
    • 👓Real World Examples
    • 🏦Resource Bank To More Reentrancy Examples
    • 🤝Read-Only Reentrancy: Conclusion
  • 🚆Tutorial 10: ERC20 transfer() and safeTransfer()
    • 🚀Prerequisites
    • 📘Understanding ERC20 transfer() and safeTransfer()
    • 👓Examples
    • 🤝Conclusion
  • 📞Tutorial 11: Low level .call(), .transfer() and .send()
    • 🚀Prerequisites
    • 📘Understanding .call, .transfer, and .send
    • 🛑Understanding the Vulnerabilities of .transfer and .send
    • 👓Examples
    • 🤝Low level .call(), .transfer() and .send() conclusion
  • ☎️Tutorial 12: Delegatecall Vulnerabilities in Precompiled Contracts
    • 🚀Prerequisites
    • 📳Understanding Delegatecall
    • ⛰️EVM, L2s, Bridges, and the Quest for Scalability
    • 🏗️Understanding Precompiles in the Ethereum Virtual Machine (EVM)
    • 💻Custom Precompiles
    • 💀Potential Vulnerabilities in EVM Implementations: Overlooked DelegateCall in Precompiled Contracts
    • 👓Real World Examples
    • 🤝Delegatecall and Precompiles: Conclusion
  • 🌊Tutorial 13: Liquid Staking
    • 🚀Prerequisites
    • 📘Understanding Liquid Staking
    • 💀Understanding Liquid Staking Vulnerabilities
    • 🛑Example Vulnerability
    • 🐜Example Vulnerability 2
    • 🕷️Example Vulnerability 3
    • 🤝Liquid Staking: Conclusion
  • 🚿Tutorial 14: Slippage
    • 🚀Prerequisites
    • 📘Understanding Slippage in Automated Market Makers (AMMs)
    • 💀Understanding the "Lack of Slippage Check" Vulnerability in Automated Market Makers (AMMs) and DEXs
    • 😡On-Chain Slippage Calculations Vulnerability
    • 📛0 slippage tolerance vulnerability
    • 👓Real World Examples
    • 🏦Resource bank to more slippage vulnerabilities
    • 🤝Slippage Conclusion
  • 📉Tutorial 15: Oracles
    • 🚀Prerequisites
    • 📘Understanding Oracles
    • 📈Types of price feeds
    • 😡Flash Loans
    • 💀Understanding Oracle Vulnerabilities
      • ⛓️The Danger of Single Oracle Dependence
      • ⬇️Using Deprecated Functions
      • ❌Lack of return data validation
      • 🕐Inconsistent or Absent Price Data Fetching/Updating Intervals
    • 🔫Decentralized Exchange (DEX) Price Oracles Vulnerabilities
    • 🛑Found Vulnerabilities In Oracle Implementations
      • ⚖️Newly Registered Assets Skew Consultation Results
      • ⚡Flash-Loan Oracle Manipulations
      • ⛓️Relying Only On Chainlink: PriceOracle Does Not Filter Price Feed Outliers
      • ✍️Not Validating Return Data e.g Chainlink: (lastestRoundData)
      • 🗯️Chainlink: Using latestAnswer instead of latestRoundData
      • 😭Reliance On Fetching Oracle Functionality
      • 🎱Wrong Assumption of 18 decimals
      • 🧀Stale Prices
      • 0️⃣Oracle Price Returning 0
      • 🛶TWAP Oracles
      • 😖Wrong Token Order In Return Value
      • 🏗️miscellaneous
    • 🤝Oracles: Conclusion
  • 🧠Tutorial 16: Zero Knowledge (ZK)
    • 🚀Prerequisites
    • 📚Theory
      • 🔌Circom
      • 💻Computation
      • 🛤️Arithmetic Circuits
      • 🚧Rank-1 Constraint System (R1CS)
      • ➗Quadratic Arithmetic Program
      • ✏️Linear Interactive Proof
      • ✨ZK-Snarks
    • 🤓Definitions and Essentials
      • 🔑Key
      • 😎Scalar Field Order
      • 🌳Incremental Merkle Tree
      • ✒️ECDSA signature
      • 📨Non-Interactive Proofs
      • 🏝️Fiat-Shamir transformation (or Fiat-Shamir heuristic)
      • 🪶Pedersen commitment
    • 💀Common Vulnerabilities in ZK Code
      • ⛓️Under-constrained Circuits
      • ❗Nondeterministic Circuits
      • 🌊Arithmetic Over/Under Flows
      • 🍂Mismatching Bit Lengths
      • 🌪️Unused Public Inputs Optimized Out
      • 🥶Frozen Heart: Forging of Zero Knowledge Proofs
      • 🚰Trusted Setup Leak
      • ⛔Assigned but not Constrained
    • 🐛Bugs In The Wild
      • 🌳Dark Forest v0.3: Missing Bit Length Check
      • 🔢BigInt: Missing Bit Length Check
      • 🚓Circom-Pairing: Missing Output Check Constraint
      • 🏹Semaphore: Missing Smart Contract Range Check
      • 🔫Zk-Kit: Missing Smart Contract Range Check
      • 🤖Aztec 2.0: Missing Bit Length Check / Nondeterministic Nullifier
      • ⏸️Aztec Plonk Verifier: 0 Bug
      • 🪂0xPARC StealthDrop: Nondeterministic Nullifier
      • 😨a16z ZkDrops: Missing Nullifier Range Check
      • 🤫MACI 1.0: Under-constrained Circuit
      • ❄️Bulletproofs Paper: Frozen Heart
      • 🏔️PlonK: Frozen Heart
      • 💤Zcash: Trusted Setup Leak
      • 🚨14. MiMC Hash: Assigned but not Constrained
      • 🚔PSE & Scroll zkEVM: Missing Overflow Constraint
      • ➡️PSE & Scroll zkEVM: Missing Constraint
      • 🤨Dusk Network: Missing Blinding Factors
      • 🌃EY Nightfall: Missing Nullifier Range Check
      • 🎆Summa: Unconstrained Constants Assignemnt
      • 📌Polygon zkEVM: Missing Remainder Constraint
    • 💿ZK Security Resources
  • 🤝Tutorial 17 DEX's (Decentralized Exchanges)
    • 🚀Prerequisites
    • 📚Understanding Decentralized Exchanges
    • 💀Common Vulnerabilities in DEX Code
      • 🔎The "Lack of Slippage Check" Vulnerability in Automated Market Makers (AMMs) a
      • 😡On-Chain Slippage Calculations Vulnerability
      • 📛Slippage tolerance vulnerability
      • 😵How Pool Implementation Mismatches Pose a Security Risk to Decentralized Exchanges (DEXs)
      • 🏊‍♂️Vulnerabilities in Initial Pool Creation - Liquidity Manipulation Attacks
      • 🛑Vulnerabilities In Oracle Implementations
        • ⚖️Newly Registered Assets Skew Consultation Results
        • ⚡Flash-Loan Oracle Manipulations
        • ⛓️Relying Only On Chainlink: PriceOracle Does Not Filter Price Feed Outliers
        • ✍️Not Validating Return Data e.g Chainlink: (lastestRoundData)
        • 🗯️Chainlink: Using latestAnswer instead of latestRoundData
        • 😭Reliance On Fetching Oracle Functionality
        • 🎱Wrong Assumption of 18 decimals
        • 🧀Stale Prices
        • 0️⃣Oracle Price Returning 0
        • 🛶TWAP Oracles
        • 😖Wrong Token Order In Return Value
        • 🏗️miscellaneous
      • 🥶Minting and Burning Liquidity Pool Tokens
      • 🎫Missing Checks
      • 🔞18 Decimal Assumption
        • 📌Understanding ERC20 Decimals
        • 💀Examples Of Vulnerabilities To Do With Assuming 18 Decimals
      • 🛣️Incorrect Swap Path
      • The Importance of Deadline Checks in Swaps
    • 🤝Conclusion
  • 🤖Tutorial 18: Proxies
    • 🚀Prerequisites
    • 📥Ethereum Storage and Memory
    • 📲Ethereum Calls and Delegate Calls
    • 💪Upgradability Patterns in Ethereum: Enhancing Smart Contracts Over Time
    • 🔝Proxy Upgrade Pattern in Ethereum
    • 🌀Exploring the Landscape of Ethereum Proxies
      • 🪞Transparent Proxies
      • ⬆️UUPS Proxies
      • 💡Beacon Proxies
      • 💎Diamond Proxies
  • 🔞Tutorial 19: 18 Decimal Assumption
    • 🚀Prerequisites
    • 📌Understanding ERC20 Decimals
    • 💀Examples Of Vulnerabilities To Do With Assuming 18 Decimals
    • 🤝Conclusion
  • ➗Tutorial 20: Arithmetic
    • 🚀Prerequisites
    • 🕳️Arithmetic pitfall 1: Division by 0
    • 🔪Arithmetic pitfall 2: Precision Loss Due To Rounding
    • 🥸Arithmetic pitfall 3: Erroneous Calculations
    • 🤝Conclusion
  • 🔁Tutorial 21: Unbounded Loops
    • 🚀Prerequisites
    • ⛽Gas Limit Vulnerability
    • 📨Transaction Failures Within Loops
    • 🤝Conclusion
  • 📔Tutorial 22: `isContract`
    • 🚀Prerequisites
    • 💀Understanding the 'isContract()` vulnerability
    • 🤝Conclusion
  • 💵Tutorial 23: Staking
    • 🚀Prerequisites
    • 💀First Depositor Inflation Attack in Staking Contracts
    • 🌪️Front-Running Rebase Attack (Stepwise Jump in Rewards)
    • ♨️Rugability of a Poorly Implemented recoverERC20 Function in Staking Contracts
    • 😠General Considerations for ERC777 Reentrancy Vulnerabilities
    • 🥏Vulnerability: _lpToken and Reward Token Confusion in Staking Contracts
    • 🌊Slippage Checks
    • 🌽The Harvest Functionality in Vaults: Issues and Best Practices
  • ⛓️Tutorial 24: Chain Re-org Vulnerability
    • 🚀Prerequisites
    • ♻️Chain Reorganization (Re-org) Vulnerability
    • 🧑‍⚖️Chain Re-org Vulnerability in Governance Voting Mechanisms
  • 🌉Tutorial 25: Cross Chain Bridges Vulnerabilities
    • 🚀Prerequisites
    • ♻️ERC777 Bridge Vulnerability: Reentrancy Attack in Token Accounting
      • 🛑Vulnerability: Withdrawals Can Be Locked Forever If Recipient Is a Contract
    • 👛The Dangers of Not Using SafeERC20 for Token Transfers
    • Uninitialized Variable Vulnerability in Upgradeable Smart Contracts
    • Unsafe External Calls and Their Vulnerabilities
    • Signature Replay Attacks in Cross-Chain Protocols
  • 🚰Tutorial 26: Integer Underflow and Overflow Vulnerabilities in Solidity (Before 0.8.0)
    • 🚀Prerequisites
    • 💀Understanding Integer Underflow and Overflow Vulnerabilities
    • 🤝Conclusion
  • 🥏Tutorial 27: OpenZeppelin Vulnerabilities
    • 🚀Prerequisites
    • 🛣️A Guide on Vulnerability Awareness and Management
      • 🤝Conclusion
  • 🖊️Tutorial 28: Signature Vulnerabilities / Replays
    • 🚀Prerequisites
    • 🔏Reusing EIP-712 Signatures in Private Sales
    • 🔁Replay Attacks on Failed Transactions
    • 📃Improper Token Validation in Permit Signature
  • 🤝Tutorial 29: Solmate Vulnerabilities
    • 🔏Lack of Code Size Check in Token Transfer Functions in Solmate
  • 🧱Tutorial 30: Inconsistent block lengths across chains
    • 🕛Incorrect Assumptions about Block Number in Multi-Chain Deployments
  • 💉Tutorial 31: NFT JSON and XSS injection
    • 📜Vulnerability: JSON Injection in tokenURI Functions
    • 📍Cross-Site Scripting (XSS) Vulnerability via SVG Construction in Smart Contracts
  • 🍃Tutorial 32: Merkle Leafs
    • 🖥️Misuse of Merkle Leaf Nodes
  • 0️Tutorial 33: Layer 0
    • 📩Lack of Force Resume in LayerZero Integrations
    • ⛽LayerZero-Specific Vulnerabilities in Airdropped Gas and Failure Handling
    • 🔓Understanding the Vulnerability of Blocking LayerZero Channels
    • 🖊️Copy of Understanding the Vulnerability of Blocking LayerZero Channels
  • ♻️Tutorial 34: Forgetting to Update the Global State in Smart Contracts
  • ‼️Tutorial 35: Wrong Function Signature
  • 🛑Tutorial 36: Handling Edge Cases of Banned Addresses in DeFi
  • Tutorial 37: initializer and onlyInitializing
  • ➗Tutorial 38: Eigen Layer
    • 📩Denial of Service in NodeDelegator Due to EigenLayer's maxPerDeposit Check
    • 📈Incorrect Share Issuance Due to Strategy Updates in EigenLayer Integrations
    • 🔁nonReentrant Vulnerability in EigenLayer Withdrawals
  • ⚫Tutorial 39: Wormhole
    • 📩Proposal Execution Failure Due to Guardian Set Change
  • 💼Tutorial 40: Uniswap V3
    • 📩Understanding and Mitigating Partial Swaps in Uniswap V3
    • 🌊Underflow Vulnerability in Uniswap V3 Position Fee Growth Calculations
    • ➗Handling Decimal Discrepancies in Uniswap V3 Price Calculations
  • 🔢Tutorial 41: Multiple Token Addresses in Proxied Tokens
    • 🔓Understanding Vulnerabilities Arising from Tokens with Multiple Entry Points
  • 🤖Tutorial 42: abiDecoder v2
    • 🥥Vulnerabilities from Manipulated Token Interactions Using ABI Decoding
  • ❓Tutorial 43: On-Chain Randomness
    • Vulnerabilities in On-Chain Randomness and How It Can Be Exploited
  • 😖Tutorial 44: Weird ERC20 Tokens
    • Weird Token List
  • 🔨Tutorial 45: Hardcoded stable coin values
  • ❤️Tutorial 46: The Risks of Chainlink Heartbeat Discrepancies in Smart Contracts
  • 👣Tutorial 47: The Risk of Forgetting a Withdrawal Mechanism in Smart Contracts
  • 💻Tutorial 48: Governance and Voting
    • Flash Loan Voting Exploit
    • Exploiting Self-Delegation
    • 💰Missing payable Keyword in Governance Execute Function
    • 👊Voting Multiple Times by Shifting Delegation
    • 🏑Missing Duplicate Veto Check
  • 📕Tutorial 49: Not Conforming To EIP standards
    • 💎Understanding EIP-2981: NFT Royalty Standard
    • 👍Improper Implementation of EIP-2612 Permit Standard
    • 🔁Vulnerabilities of Missing EIP-155 Replay Attack Protection
    • ➡️Vulnerabilities Due to Missing EIP-1967 in Proxy Contracts
    • 🔓Vulnerability of Design Preventing EIP-165 Extensibility
    • 🎟️The Dangers of Not Properly Implementing ERC-4626 in Yield Vaults
    • 🔁EIP-712 Implementation and Replay Attacks
  • ⏳Tutorial 50: Vesting
    • 🚔Vulnerability of Allowing Unauthorized Withdrawals in Vesting Contracts
    • 👊Vulnerability of Unbounded Timelock Loops in Vesting Contracts
    • ⬆️Vulnerability of Incorrect Linear Vesting Calculations
    • ⛳Missing hasStarted Modifier
    • 🔓Vulnerability in Bond Depositor's Vesting Period Reset
  • ⛽Tutorial 51: Ethereum's 63/64 Gas Rule
    • 🛢️Abusing Ethereum's 63/64 Gas Rule to Manipulate Contract Behavior
  • 📩Tutorial 52: NPM Dependency Confusion and Unclaimed Packages
    • 💎Exploiting Unclaimed NPM Packages and Scopes
  • 🎈Tutorial 53: Airdrops
    • 🛄Claiming on Behalf of Other Users
    • 🧲Repeated Airdrop Claims Vulnerability
    • 🍃Airdrop Vulnerability – Merkle Leaves and Parent Node Hash Collisions
  • 🎯Tutorial 54: Precision
    • 🎁Vulnerabilities Due to Insufficient Precision in Reward Calculations
    • Min-Shares: Fixed Minimum Share Values for Tokens with Low Decimal Precision
    • Vulnerability Due to Incorrect Rounding When the Numerator is Not a Multiple of the Denominator
    • Vulnerability from Small Deposits Being Rounded Down to Zero Shares in Smart Contracts
    • Precision Loss During Withdrawals from Vaults Can Block Token Transfers Due to Value < Amount
    • 18 Decimal Assumption Scaling: Loss of Precision in Asset Conversion Due to Incorrect Scaling
  • Tutorial 55: AssetIn == AssetOut, FromToken == ToToken
    • 🖼️Vulnerability: Missing fromToken != toToken Check
  • 🚿Tutorial 56: Vulnerabilities Related to LP Tokens Being the Same as Reward Tokens
    • 🖼️Vulnerabilities Caused by LP Tokens Being the Same as Reward Tokens
  • Tutorial 57: Unsanitized SWAP Paths and Arbitrary Contract Call Vulnerabilities
    • 📲Arbitrary Contract Calls from Unsanitized Paths
  • Tutorial 58: The Risk of Infinite Approvals and Arbitrary Contract Calls
    • 🪣Exploiting Infinite Approvals and Arbitrary Contract Calls
  • Tutorial 59: Low-Level Calls in Solidity Returning True for Non-Existent Contracts
    • Low-Level Calls Returning True for Non-Existent Contracts
  • 0️⃣Tutorial 60: The Impact of PUSH0 and the Shanghai Hardfork on Cross-Chain Deployments > 0.8.20
    • PUSH0 and Cross-Chain Compatibility Challenges
  • 🐍Tutorial 61: Vyper Vulnerable Versions
    • Vyper and the EVM
  • ⌨️Tutorial 62: Typos in Smart Contracts — The Silent Threat Leading to Interface Mismatch
    • Vyper and the EVM
  • ☁️Tutorial 63: Balance Check Using ==
    • The Vulnerability: == Balance Check
  • 💍Tutorial 64: Equal Royalties for Unequal NFTs
    • Understanding the Problem: Equal Royalties for Unequal NFTs
  • 🖼️Tutorial 65: ERC721 and NFTs
    • The Risk of Using transferFrom Instead of safeTransferFrom in ERC721 Projects
    • ❄️Why _safeMint Should Be Used Instead of _mint in ERC721 Projects
    • The Importance of Validating Token Types in Smart Contracts
    • 📬Implementing ERC721TokenReceiver to Handle ERC721 Safe Transfers
    • NFT Implementation Deviating from ERC721 Standard in Transfer Functions
    • NFT Approval Persistence after Transfer
    • 🎮Gameable NFT Launches through Pseudo-Randomness
    • 2️⃣Protecting Buyers from Losing Funds Due to Claimed NFT Rewards on Secondary Markets
    • ♻️Preventing Reentrancy When Using SafeERC721
    • 🖊️Preventing Re-use of EIP-712 Signatures in NFT Private Sales
  • 2️⃣Tutorial 66: Vulnerability Arising from NFTs Supporting Both ERC721 and ERC1155 Standards
  • 📷Tutorial 67: ERC1155 Vulnerabilities
    • ♻️Preventing Reentrancy in OpenZeppelin's SafeERC1155
    • 🛫Vulnerabilities in OpenZeppelin's ERC1155Supply Contract
    • Understanding Incorrect Token Owner Enumeration in ERC1155Enumerable
    • Avoiding Breaking ERC1155 Composability with Improper safeTransferFrom Implementation
    • 💍Ensuring Compatibility with EIP-2981 in ERC1155 Contracts
  • 🪟Informational Vulnerabilities
  • ⛽Gas Efficiency
  • 💻Automation Tools
  • 🔜Out Of Gas (Coming Soon)
  • 🔜DEX Aggregators (Coming Soon)
  • 🔜Bribes (Coming Soon)
  • 🔜Understanding Compiled Bytecode (coming soon)
  • 🔜Deployment Mistakes (coming soon)
  • 🔜Optimistic Roll-ups (coming soon)
  • 🔜Typos (coming soon)
  • 🔜Try-Catch (coming soon)
  • 🔜NFT Market-place (coming soon)
  • 🔜Upgrade-able Contracts (coming soon)
Powered by GitBook
On this page
  • Merkle Tree
  • Real use case
  1. Tutorial 6: Merkle Trees

Understanding Merkle Trees

PreviousPrerequisitesNextVerification within a Merkle Tree:

Last updated 1 year ago

Understanding Merkle Trees: The foundation of blockchain security relies heavily on a peculiar data structure known as a Merkle Tree. Named after its inventor, Ralph Merkle, it is a binary tree comprised of a set of nodes, where each node represents a certain amount of data. But what makes it so crucial in the context of blockchain and smart contract security? Let's unravel this concept piece by piece.

What is a Data Structure: A data structure is a particular way of organizing and storing data in a computer so that it can be accessed and modified efficiently. Common types of data structures include arrays, linked lists, stacks, queues, hash tables, trees, and graphs. The choice of data structure often depends on the specific requirements of the task at hand, such as the need for fast access, large storage capacity, or maintaining relationships between data elements.

What is a Binary Tree: A binary tree is a type of tree data structure in which each node has at most two children, referred to as the left child and the right child. This characteristic makes it easier to search for data, insert new data, and delete data, as you only need to compare and potentially modify two child nodes at each step.

What is a Node: A node is each circle you see in the image above. In the context of data structures, a node is a basic unit of information. It can contain various data types, such as a number, a string, or a reference or pointer to another node. In more complex data structures like trees and graphs, a node often contains additional information to describe its relationship to other nodes, such as links to its child nodes and/or its parent node.

What is a Hash Function: A hash function is a function that takes an input (or 'message') and returns a fixed-size string of bytes. The output, often a 'digest', is unique to each unique input: a small change to the input will produce such a drastic change in output that the new hash value appears uncorrelated with the old hash value. In the context of blockchain and cryptography, hash functions are used to maintain data integrity and security. For Ethereum and its Merkle trees, the Keccak-256 hash function (a variant of SHA-3) is commonly used.

Merkle Tree

A Merkle Tree is a kind of data structure used in computing. A data structure is simply a way of organizing and storing data so it can be used efficiently.

Let's break down the definition of Merkle Tree into simpler parts:

  1. What is a tree in computing? Imagine a family tree, with the "ancestors" at the top and "descendants" below. A tree in computing is similar. It starts with a single "root" node at the top, which branches out to "child" nodes, and each of those can have their own "child" nodes, and so on. The nodes at the very bottom that have no children are called "leaves".

  2. What is a cryptographic hash? A hash is a way of taking any amount of input data (like a sentence, a document, or a whole book), and creating a fixed-size string of characters from it, which looks random. What's special about it is that any small change to the input will completely change the hash, and you can't go backwards from the hash to the original data. Cryptographic hash functions are special types of hash functions that are particularly secure, meaning they're very hard to reverse or find collisions (two different inputs producing the same hash).

  3. How is a Merkle Tree built? You start with your data blocks at the bottom as the "leaves" of the tree. Each of these data blocks is turned into a hash. Then, each pair of these hashes is combined and hashed again to get the hash for the node above them. This process continues up the tree, until you get a single hash at the top, which is the "root" of the tree.

  4. What is a Merkle Tree used for? Merkle Trees are used in places where you need to verify data quickly and securely. For example, they are used in blockchain technology (which underlies cryptocurrencies like Bitcoin) to ensure all transactions are valid without having to check every single one.

In essence, Merkle Trees allow you to check if a small piece of data is part of a large dataset very quickly and with a high degree of certainty. If you have the top hash and a way to move down the tree, you can check if a piece of data is included in the original dataset without needing to have the entire dataset.

1. Basic Terminology: In a Merkle tree:

  • Leaf nodes are the lowest nodes in the tree (they have no children) and represent the data blocks.

  • Parent nodes are labeled with a cryptographic hash of their child nodes.

  • The root node (the top node of the tree) is known as the Merkle root.

2. Construction of a Merkle Tree:

Let's consider we have 4 data blocks - L1, L2, L3, and L4.

Step 1: We start by taking the cryptographic hashes of the data blocks.

  • HL1 = Hash(L1)

  • HL2 = Hash(L2)

  • HL3 = Hash(L3)

  • HL4 = Hash(L4)

Here, 'Hash' can be any cryptographic hash function like SHA256.

Step 2: The hashes are paired and concatenated, and the hash of the resulting strings are calculated.

  • HL12 = Hash(HL1 + HL2)

  • HL34 = Hash(HL3 + HL4)

Here, '+' represents concatenation.

Step 3: These hashes are again concatenated and hashed to form the root.

  • ROOT = Hash(HL12 + HL34)

This ROOT is the Merkle Root of our Merkle Tree.

Just like before, you can use this tree for efficient and secure verification of the contents of these data blocks. The process is exactly the same, but with the updated block names.

Real use case

let's delve into a practical example of how Merkle Trees are utilized in blockchain technology for handling transactions.

Let's say we have a set of Bitcoin transactions:

  • Transaction 1 (T1): Alice sends 1 BTC to Bob

  • Transaction 2 (T2): Bob sends 0.5 BTC to Charlie

  • Transaction 3 (T3): Charlie sends 0.2 BTC to Dave

  • Transaction 4 (T4): Dave sends 0.1 BTC to Alice

These four transactions form our "data blocks". Now we will form a Merkle Tree from these transactions just like we did before.

Step 1: We hash each transaction (this is equivalent to HL1 = Hash(L1) and so on):

  • HT1 = Hash(T1)

  • HT2 = Hash(T2)

  • HT3 = Hash(T3)

  • HT4 = Hash(T4)

Step 2: Pair and concatenate the hashes, then hash the resulting strings (equivalent to HL12 = Hash(HL1 + HL2) and so on):

  • HT12 = Hash(HT1 + HT2)

  • HT34 = Hash(HT3 + HT4)

Step 3: Concatenate and hash the pair hashes to form the root:

  • ROOT = Hash(HT12 + HT34)

The resulting ROOT is our Merkle Root, which provides a "fingerprint" or summary of all the transactions.

Now let's say a new participant, Eve, joins the Bitcoin network and wants to verify Transaction 3 (T3). Instead of downloading the whole blockchain, she can download just the Merkle Path, or the shortest path to T3, which includes HT4 (T3's pair) and HT12 (the sibling of T3's parent). Eve can now compute:

  • HT34 = Hash(HT3 + HT4)

  • ROOT = Hash(HT12 + HT34)

If this computed ROOT matches the actual Merkle Root of the blockchain, Eve can be sure that T3 is indeed part of the block. This is how the use of Merkle Trees makes verification of transactions efficient and secure in the blockchain.

Properties and Advantages of Merkle Trees:

  1. Efficiency: In a Merkle Tree, each leaf node represents a block of data, and each non-leaf node is a hash of its child nodes. The most important aspect of this structure is the Merkle root, a single hash at the top of the tree that represents all the data underneath it. To verify whether a specific block of data is part of the dataset, you don't need to look at every piece of data. You only need to look at a specific path from the root to the leaf node representing that data, along with some additional nodes (known as siblings) along the way. This path, known as a Merkle proof, is significantly smaller than the entire dataset, which makes the verification process very efficient, even for large datasets.

  2. Security: The Merkle root changes if any part of the data it represents changes. That's because a small change in any block of data will change its hash, which will change the hashes of nodes above it in the tree, all the way up to the root. This means that it's practically impossible to change the data without also changing the Merkle root. This feature makes Merkle Trees very secure - any tampering with the data would be immediately noticeable when comparing the Merkle root.

  3. Proof of Integrity: If two parties have the same Merkle root, they know they have the exact same version of the data. That's because it's practically impossible to have two different datasets produce the same Merkle root, due to the properties of cryptographic hash functions. This means that Merkle Trees provide a very strong assurance of data integrity. If the Merkle roots match, the datasets match.

Understanding these properties is crucial for understanding systems that use Merkle Trees. These include blockchain systems, where Merkle Trees are used to efficiently verify transactions; torrent systems, where they're used to ensure file integrity; and other distributed systems, where they're used to synchronize data across multiple nodes. The security and efficiency provided by Merkle Trees are essential to the functioning of these systems.

🌴
📘
Book an audit with Zokyo