LOCUS Chain Technology Series 7: Dynamic Sharding Chapter 1
Author: LOCUS CHAIN FOUNDATION
Compiler: ChainCatcher
Introduction to Locus Chain Sharding
The Practical Bottleneck of Blockchain Scalability
The design goal of Locus Chain is to provide a practical blockchain infrastructure suitable for real-world use. By "real-world use," we mean a transaction capacity comparable to actual commercial transaction systems, such as credit card networks. Locus Chain assumes a capacity of at least 4,000 transactions per second, which is nearly equivalent to VISA's capacity. By "practical," we refer to the hardware and software requirements placed on users. Locus Chain assumes that a node capable of running smart contracts is a typical consumer-grade PC of today. The requirements for purely transactional nodes are even lower, suitable for IoT-level micro-devices.
In short, Locus Chain aims to be a blockchain system with 4k-TPS capability on PCs, IoT devices, and mobile phones.
From the perspective of computational resources, the resource requirements for Locus Chain are very clear. As capacity increases, the required resources will also grow. Transactions require bandwidth for communication, CPU time for processing, and storage for historical records or the "ledger."
Let’s first look at bandwidth. Assuming the average size of a transaction is 500 bytes (0.5 KB), the raw data for 4,000 transactions per second is 4,000 * 500 bytes, which equals 2 million bytes per second, or 16 Mbps. This raw data volume alone is roughly 16% of a typical 100 Mbps home network. Actual blockchain operation may require 5 to 10 times that amount, up to approximately 20 MB/sec (160 Mbps), to cover retransmissions and blockchain bookkeeping. A bandwidth of 20 MB/sec is available on today’s high-end home networks, but it is often out of reach for the average user.
Fortunately, the CPU requirements are negligible. Locus Chain employs a PoS-based BFT consensus that does not require high CPU computation for cryptographic puzzles. The minimum configuration of Locus Chain can run on ARM-based mini devices costing less than $10.
Storage requirements are the trickiest issue. Four thousand transactions per second generate 2 million bytes of ledger data per second, which adds up to about 172 GB per day. Today's average PC has about 500 GB of SSD storage, so 172 GB per day would fill the entire drive in less than three days.
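The bandwidth and storage estimates above can be reproduced with a few lines of arithmetic. The following is a minimal sketch that uses only the figures stated in the text (4,000 TPS, 500 bytes per transaction, a 10x overhead factor at the upper end of the quoted range, and a 500 GB SSD). The function name and parameter names are illustrative and not part of Locus Chain.

```python
SECONDS_PER_DAY = 86_400


def estimate_resources(tps=4_000, tx_bytes=500, overhead=10, ssd_gb=500):
    """Back-of-the-envelope resource estimate for an unsharded node.

    All defaults come from the figures quoted in the text; decimal units
    are used throughout (1 MB = 10**6 bytes, 1 GB = 10**9 bytes).
    """
    raw_bytes_per_sec = tps * tx_bytes
    ledger_gb_per_day = raw_bytes_per_sec * SECONDS_PER_DAY / 1e9
    return {
        "raw_bytes_per_sec": raw_bytes_per_sec,
        # Raw transaction stream expressed in megabits per second.
        "raw_mbps": raw_bytes_per_sec * 8 / 1e6,
        # Traffic including retransmissions and bookkeeping, in MB/sec.
        "with_overhead_mb_per_sec": raw_bytes_per_sec * overhead / 1e6,
        "ledger_gb_per_day": ledger_gb_per_day,
        "days_to_fill_ssd": ssd_gb / ledger_gb_per_day,
    }


if __name__ == "__main__":
    for name, value in estimate_resources().items():
        print(f"{name}: {value:,.2f}")
```

With the default values this gives a raw stream of 16 Mbps, 20 MB/sec including overhead, 172.8 GB of ledger data per day, and roughly 2.9 days to fill a 500 GB drive.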
In summary, bandwidth and storage requirements are the bottlenecks for practical blockchains. This understanding is widely shared among leading blockchain development teams, which address it using fundamentally the same approach: "sharding."
General Sharding Approach
Sharding is a divide-and-conquer approach to solving the data volume problem. Sharding means dividing the system into "shards," which are multiple independent sub-parts within the blockchain system. The data is then partitioned among the shards, and each shard processes its own portion.
From the perspective of resource partitioning, there are different sharding techniques.
Network sharding involves dividing the communication network. The communication network is divided into several sub-networks, known as "network shards," where transactions generated within a shard are typically processed within that specific shard. Network sharding usually reduces the bandwidth consumption per node by decreasing the number of communications.
Ledger sharding is about partitioning the stored data. The ledger, which is essentially the blockchain history, is divided into multiple sub-ledgers and stored across different sub-networks. Ledger sharding typically reduces the storage space required for each node by decreasing the amount of ledger data stored.
Consensus protocols can also be sharded, which can improve the overall transaction throughput of the blockchain by processing transactions in parallel.
These techniques can be applied together to enhance overall performance. For example, ledger sharding can be combined with network sharding to store ledger data within a specific shard. Consensus protocols can run within specific ledger shards to minimize communication regarding block generation protocols.
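To make the idea of partitioning concrete, here is a minimal sketch of one common way to assign work to shards: hashing an account identifier and taking the result modulo the number of shards. This is a generic illustration under stated assumptions, not Locus Chain's actual assignment rule, which the text does not describe. The names shard_of and partition, and the "sender" field, are hypothetical.

```python
import hashlib
from collections import defaultdict


def shard_of(account_id, num_shards):
    """Map an account to a shard index in [0, num_shards).

    Assumption: assignment is a plain hash-modulo rule. A real system
    may use a very different scheme.
    """
    digest = hashlib.sha256(account_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(transactions, num_shards):
    """Group transactions by the shard of their (hypothetical) sender field."""
    shards = defaultdict(list)
    for tx in transactions:
        shards[shard_of(tx["sender"], num_shards)].append(tx)
    return dict(shards)


if __name__ == "__main__":
    txs = [{"sender": f"account-{i}", "amount": i} for i in range(12)]
    for shard_id, shard_txs in sorted(partition(txs, 4).items()):
        print(shard_id, [tx["sender"] for tx in shard_txs])
```

Each node then only needs to relay and store the transactions of its own shard, which is the source of the bandwidth and storage savings described above.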
Pitfalls of Sharding
While the sharding approach is the correct method to address the data volume problem, it can introduce new issues for blockchain systems.
One typical issue associated with ledger sharding is data synchronization. When the ledger is sharded into independent sub-ledgers, a shard has no direct knowledge of the other shards. In other words, users in one shard cannot directly verify information about other shards. This problem is often addressed by introducing a super blockchain, which is a management blockchain that tracks the sub-ledgers. However, in some cases, a super blockchain alone is insufficient to synchronize inter-shard transactions that modify data across multiple shards.
A well-known issue regarding network sharding is security. When the entire network is divided into N shards, the number of nodes in each shard is reduced by a factor of N. With fewer nodes per shard, fewer malicious nodes are needed to attack a shard of the blockchain network.
Thus, sharding reduces the overall stability of the network. Essentially, there is a trade-off between transaction throughput and network security, depending on the number of shards.
If shards have different numbers of nodes, the problem worsens. Attackers may focus on shards with the fewest nodes to gain control over them. A simple way to address this situation is to increase the total number of nodes; however, this is not a technically guaranteed solution.
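The effect of shard count and uneven shard sizes on attack cost can be illustrated numerically. The sketch below assumes the classic BFT bound, in which a group of n nodes tolerates at most floor((n - 1) / 3) faulty nodes. That bound is an assumption introduced here for illustration: the text says only that Locus Chain uses a PoS-based BFT consensus, and this sketch ignores stake weighting entirely. All function names are hypothetical.

```python
def max_tolerated_faults(n):
    """Largest f such that n >= 3f + 1 (classic BFT bound, assumed here)."""
    return (n - 1) // 3


def nodes_to_compromise(n):
    """Smallest number of malicious nodes that exceeds the tolerated bound."""
    return max_tolerated_faults(n) + 1


def weakest_shard_cost(shard_sizes):
    """An attacker targets the smallest shard, so it sets the attack cost."""
    return nodes_to_compromise(min(shard_sizes))


if __name__ == "__main__":
    total_nodes = 3_000
    print("unsharded:", nodes_to_compromise(total_nodes))
    print("10 equal shards:", weakest_shard_cost([total_nodes // 10] * 10))
    print("uneven shards:", weakest_shard_cost([500, 400, 100]))
```

Under these assumptions, an unsharded network of 3,000 nodes requires 1,000 malicious nodes, ten equal shards of 300 nodes require only 100 to attack one shard, and an uneven split whose smallest shard has 100 nodes lowers the cost to 34.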
Locus Chain's Approach
From the beginning, the design philosophy of Locus Chain has been based on observations of blockchain scalability. High-throughput blockchains face clear technical limits on bandwidth and storage, which led to Locus Chain's sharding design. Locus Chain also has explicit security requirements, which led to deliberately low hardware requirements that encourage participation.
Locus Chain's design starts from the ground up, beginning with the ledger structure. The ledger of Locus Chain is a combination of Account-Wise-Transaction-Chains (AWTC). AWTC is a data structure based on a Directed Acyclic Graph (DAG), specifically designed for ledger sharding and resizing.
Locus Chain's sharding is dynamic: the shard configuration changes over time. The number of shards varies based on transaction throughput and the number of nodes. The number of nodes in each shard is balanced dynamically to keep security even across shards. Nodes move between shards as needed, bringing stability to the entire network.
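The balancing of node counts between shards can be pictured with a simple greedy procedure: repeatedly move one node from the largest shard to the smallest until shard sizes differ by at most one. This is only a sketch of the general idea. The text does not specify how Locus Chain selects or moves nodes, and the rebalance function and its data layout are hypothetical.

```python
from typing import Dict, List, Tuple


def rebalance(shards: Dict[str, List[str]]) -> List[Tuple[str, str, str]]:
    """Even out shard sizes in place and return the list of moves.

    Each move is (node, from_shard, to_shard). Greedy rule (an assumption,
    not the Locus Chain protocol): take a node from the largest shard and
    give it to the smallest until sizes differ by at most one.
    """
    moves = []
    while True:
        largest = max(shards, key=lambda s: len(shards[s]))
        smallest = min(shards, key=lambda s: len(shards[s]))
        if len(shards[largest]) - len(shards[smallest]) <= 1:
            return moves
        node = shards[largest].pop()
        shards[smallest].append(node)
        moves.append((node, largest, smallest))


if __name__ == "__main__":
    shards = {
        "A": [f"node-{i}" for i in range(7)],
        "B": ["node-7"],
        "C": ["node-8"],
    }
    for move in rebalance(shards):
        print("move %s: %s -> %s" % move)
    print({name: len(nodes) for name, nodes in shards.items()})
```

Keeping shard sizes close to each other removes the "smallest shard" weak point, since no shard is markedly cheaper to attack than the others.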
Locus Chain incorporates verifiable pruning as a solution to storage space issues. Nodes may prune data that is not directly related to the ledger to minimize storage space, while the integrity of the ledger can still be verified. If needed, pruned data can be retrieved through a data query protocol.
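One generic way to make pruning verifiable is to chain entries by hash and to keep each entry's payload hash even after the payload itself is discarded. The sketch below illustrates that general technique on a simple linear hash chain. It is not the AWTC structure or the actual Locus Chain pruning scheme, and every name in it (Entry, append, prune, verify, restore) is hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional

GENESIS = b"\x00" * 32


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


@dataclass
class Entry:
    prev_hash: bytes           # hash of the previous entry's header
    payload_hash: bytes        # kept forever, even after pruning
    payload: Optional[bytes]   # None once the payload has been pruned

    def entry_hash(self) -> bytes:
        # The header hash covers only hashes, so it survives pruning.
        return h(self.prev_hash + self.payload_hash)


def append(chain: List[Entry], payload: bytes) -> None:
    prev = chain[-1].entry_hash() if chain else GENESIS
    chain.append(Entry(prev, h(payload), payload))


def prune(chain: List[Entry], index: int) -> None:
    chain[index].payload = None


def verify(chain: List[Entry]) -> bool:
    """Check the hash links, plus payload hashes where payloads remain."""
    prev = GENESIS
    for entry in chain:
        if entry.prev_hash != prev:
            return False
        if entry.payload is not None and h(entry.payload) != entry.payload_hash:
            return False
        prev = entry.entry_hash()
    return True


def restore(chain: List[Entry], index: int, payload: bytes) -> bool:
    """Accept data fetched from another node only if it matches the hash."""
    if h(payload) != chain[index].payload_hash:
        return False
    chain[index].payload = payload
    return True
```

In this sketch a node can drop a payload with prune and still pass verify, and data later fetched from a peer is accepted by restore only if it matches the retained hash.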
Because of sharding and pruning, some ledger data is unknown to any given node. Locus Chain's data and network layers provide built-in support, such as data query protocols and inter-shard communication channels, for retrieving and verifying unknown, sharded, and pruned data. This communication scheme can also serve as a foundation for inter-shard smart contracts.