Understanding the Merkle Tree and How It Works
Newcomers to cryptocurrency who come across the phrase “Merkle Tree” for the first time might wonder what trees have to do with cryptocurrency and blockchain. But yes, there really is a data structure by that name in computer science and cryptography.
Put simply, a Merkle Tree, also called a hash tree, is a method of structuring data that enables quick and efficient verification of large sets of information. In most blockchain uses it takes the form of a binary tree, in which every parent node is the hash of its two children.
This method was developed by Ralph Merkle, a computer scientist who is also one of the inventors of public key cryptography. He wrote about his concept in 1979 in an academic paper while he was a graduate student at Stanford University.
According to his definition, the Merkle Tree “comprises a method of providing a digital signature for purposes of authentication of a message, which utilizes an authentication tree function of a one-way function of a secret number.”
In other words, it lets a computer check a piece of data against a short hash value instead of rechecking the entire data set.
Digging Deeper into the Merkle Tree
To understand a Merkle Tree, first note that each blockchain transaction has its own unique transaction ID. In Bitcoin and many other blockchains, this is a 64-character hexadecimal string that represents a 256-bit hash.
Since a blockchain can consist of hundreds of thousands of blocks, and each block can contain thousands of transactions, storage space and bandwidth are among the biggest constraints for blockchains.
For this reason, it’s important to use as little data as possible during processing and verification. Not only does this shorten the processing time, but it also makes tampering easier to detect, because any change to the underlying data changes the resulting hash.
What a Merkle Tree does is condense thousands of transaction IDs into a single 64-character (256-bit) code. That code acts as a fingerprint of every transaction in the block, and because it is short, a computer can store and compare it much faster than the full transaction list. That code is called the Merkle Root.
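As a rough illustration of where the 64 characters come from, the sketch below hashes a made-up transaction with SHA-256 using Python's standard hashlib module. The transaction bytes are hypothetical, and real blockchains serialize transactions in their own binary formats (Bitcoin also applies SHA-256 twice), so treat this only as a size demonstration.

```python
import hashlib

# Hypothetical serialized transaction, used for illustration only.
raw_tx = b"alice pays bob 5 coins"

# SHA-256 always produces 32 bytes (256 bits), whatever the input size.
digest = hashlib.sha256(raw_tx).digest()

# Written in hexadecimal, each byte becomes two characters.
tx_id = digest.hex()

print(len(digest) * 8)  # 256
print(len(tx_id))       # 64
```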
Merkle Tree in Action
The above diagram is a sample of a Merkle Tree. The first-level blocks (T0–T7) are ordinary transactions. Each one is passed through a hash function, which produces the hash values H0–H7. These second-level blocks, called leaves, contain the hashed value of the record associated with that leaf.
After that, the hash values are combined in pairs and rehashed, producing H01, H23, H45, and H67. The process is then repeated: the combined hash values are paired and rehashed again, producing H0123 and H4567. These combined hashes (levels 3 and 4) are the branches or nodes of the Merkle tree.
The final hash will combine all the values to create a single value, H01234567. That value is the Merkle Root, and it carries a summary of all the transaction data.
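The pairing-and-rehashing steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by any particular blockchain: it assumes SHA-256 as the hash function, joins two hashes by simple concatenation, and duplicates the last hash when a level has an odd number of entries. The names sha256 and merkle_root are chosen here for the example.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(transactions: list[bytes]) -> bytes:
    """Reduce a list of transactions to a single root hash."""
    if not transactions:
        raise ValueError("need at least one transaction")
    # Leaf level: H0..H7 in the article's notation.
    level = [sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: pad an odd level by repeating its last hash.
            level.append(level[-1])
        # Combine neighbours pairwise: H01, H23, ... then H0123, H4567, ...
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


# Eight placeholder transactions T0..T7.
txs = [f"T{i}".encode() for i in range(8)]
root = merkle_root(txs)
print(root.hex())
```

Changing any single transaction, even by one byte, changes the root, which is why the root can stand in for the whole list.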
Merkle Tree Benefits and Proofs
From the figure above we can see the different benefits of the Merkle Tree:
– It serves as a proof of the integrity and validity of the data.
– It requires little memory or disk space, because the proofs are quick and cheap to compute.
– Its proofs and management require only a small amount of information to be sent across a network.
All these benefits are possible thanks to the verification process implemented in the Merkle Tree. This process includes consistency verification and data verification.
A consistency proof, also called consistency verification, lets you verify that any two versions of a log are consistent with each other, which shows that the record has not been tampered with.
The consistency proof checks the following conditions:
– The new version includes all the information in the old version
– Everything is in the same order
– All new records are placed after the older version.
A log is deemed consistent if you can prove that:
– No certificates have been backdated and inserted into the log.
– The certificates have not been modified in the log.
– The log has never been forked or branched.
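The conditions above can be illustrated with a deliberately naive check: recompute the root over the first old_size records of the newer log and compare it with the root published for the older version. This is a sketch under stated assumptions, not a real consistency proof. Production logs such as Certificate Transparency use compact proofs that need only a handful of hashes, whereas this version assumes the verifier holds every record. The function names and the odd-level padding rule are choices made for the example.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(records: list[bytes]) -> bytes:
    if not records:
        raise ValueError("need at least one record")
    level = [sha256(r) for r in records]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: repeat the last hash
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def is_consistent(old_root: bytes, old_size: int,
                  new_records: list[bytes]) -> bool:
    """Naive check: the old log must be an unmodified prefix of the new one."""
    if old_size < 1 or old_size > len(new_records):
        return False
    return merkle_root(new_records[:old_size]) == old_root


old_log = [b"cert-a", b"cert-b", b"cert-c"]
old_root = merkle_root(old_log)

appended = old_log + [b"cert-d"]                      # new record at the end
tampered = [b"cert-a", b"cert-X", b"cert-c", b"cert-d"]

print(is_consistent(old_root, len(old_log), appended))
print(is_consistent(old_root, len(old_log), tampered))
```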
An audit proof, also called data verification, allows you to validate whether a specific data item has been included in the log. Like the consistency proof, it relies on the log as the evidence of the record. In Certificate Transparency, for example, audit proofs let an auditor or a transport layer security (TLS) client verify that a given certificate has been logged. The proof consists of the sibling hashes along the path from the leaf to the root. If recomputing that path yields a root hash that doesn’t match the published Merkle tree hash, the proof fails and the certificate cannot be shown to exist in the log.
A Merkle Tree is beneficial for the data synchronization of a distributed data store because it enables all nodes in a distributed system to efficiently and quickly determine which records have changed, without sending all the data to make a comparison. Instead, once a node has identified that a particular leaf has changed, only the record in that specific leaf must be sent over the network.
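A small sketch may make the audit proof more concrete. It builds a tree over eight placeholder records, collects the sibling hashes on the path from one leaf to the root, and then verifies the leaf using only those hashes. The helper names (build_levels, audit_path, verify), the use of SHA-256, and the rule of repeating the last hash on odd levels are all assumptions made for this example rather than the format of any real log.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree, from leaf hashes up to the root."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [sha256(x) for x in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # assumption: repeat the last hash
            levels[-1] = level           # store the padded level
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def audit_path(leaves: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root, each flagged if it sits on the left."""
    path = []
    for level in build_levels(leaves)[:-1]:
        sibling = index ^ 1              # neighbour within the same pair
        path.append((level[sibling], sibling < index))
        index //= 2
    return path


def verify(leaf: bytes, path: list[tuple[bytes, bool]], root: bytes) -> bool:
    h = sha256(leaf)
    for sibling, sibling_is_left in path:
        h = sha256(sibling + h) if sibling_is_left else sha256(h + sibling)
    return h == root


txs = [f"T{i}".encode() for i in range(8)]
root = build_levels(txs)[-1][0]
proof = audit_path(txs, 5)

print(len(proof))                  # 3 hashes are enough for 8 leaves
print(verify(b"T5", proof, root))  # True
print(verify(b"T9", proof, root))  # False
```

The proof grows with the height of the tree rather than with the number of records, which is why so little data has to cross the network.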
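The synchronization idea can be sketched as a top-down comparison of two trees: start at the roots and descend only into subtrees whose hashes differ. The example below is a simplified illustration that assumes both replicas hold the same number of records in the same positions. The function names are hypothetical, and in a real system the hashes would be exchanged over the network level by level rather than compared in one process.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(records: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree, from leaf hashes up to the root."""
    if not records:
        raise ValueError("need at least one record")
    level = [sha256(r) for r in records]
    levels = [level]
    while len(level) > 1:
        # Assumption: an odd level is padded by repeating its last hash.
        padded = (level + [level[-1]]) if len(level) % 2 == 1 else level
        level = [sha256(padded[i] + padded[i + 1])
                 for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels


def changed_leaves(a: list[list[bytes]], b: list[list[bytes]]) -> list[int]:
    """Indices of leaves that differ, found by descending from the root."""
    top = len(a) - 1
    if a[top][0] == b[top][0]:
        return []                        # identical roots: nothing to sync
    suspects = [0]
    for depth in range(top - 1, -1, -1):
        next_suspects = []
        for parent in suspects:
            for child in (2 * parent, 2 * parent + 1):
                if child < len(a[depth]) and a[depth][child] != b[depth][child]:
                    next_suspects.append(child)
        suspects = next_suspects
    return suspects


replica_a = [f"record-{i}".encode() for i in range(8)]
replica_b = list(replica_a)
replica_b[5] = b"record-5-edited"

diff = changed_leaves(build_levels(replica_a), build_levels(replica_b))
print(diff)  # [5]
```

Only the records at the returned indices would then need to be transferred between the two replicas.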
Industries that Use the Merkle Tree
Merkle Trees and their variations are used by some of the most popular cryptocurrencies, such as Bitcoin and Ethereum, as well as other systems:
Health Care Industry
DeepMind Health, the health technology division of Google’s AI company DeepMind, has announced plans for a new technology loosely modeled on the ledger ideas behind Bitcoin. Called Verifiable Data Audit, the project aims to create a digital ledger where every interaction with patient data is automatically recorded and cryptographically verified. Hospitals, doctors, and patients would be able to track what is happening with personal data in real time, and thus monitor any changes to, or access of, that data.
Global Supply Chain
Large technology and shipping companies like IBM and Maersk are teaming up to utilize blockchain technology to digitize all kinds of shipping transactions within the network of freight forwarders, shippers, customs authorities, and ocean carriers that make up the supply chain.