Understanding the bottlenecks and optimization methods of Rollup from the performance differences between opBNB and Ethereum Layer2
Author: Faust, Geek Web3
Introduction: If one keyword were to summarize Web3 in 2023, most people would probably think of "the summer of Layer2." Innovations at the application layer keep emerging, but few hotspots have shown the staying power of L2. With Celestia successfully promoting the concept of the modular blockchain, Layer2 and modularity have almost become synonymous with infrastructure, and the past glory of single-chain solutions looks hard to replicate. After Coinbase, Bybit, and MetaMask successively launched their own dedicated Layer2 networks, the Layer2 battle is in full swing, reminiscent of the intense competition among new public chains in earlier years.
In this Layer2 battle led by exchanges, BNB Chain is certainly unwilling to be left behind. It launched the zkBNB testnet as early as last year, but because zkEVM performance does not yet meet the needs of large-scale applications, the Optimistic Rollup solution opBNB became the better option for a general-purpose Layer2. This article briefly summarizes the working principles of opBNB and its commercial significance, outlining an important step taken by the BSC public chain in the era of modular blockchains.
The Path of Large Blocks for BNB Chain
Similar to exchange-backed public chains such as Solana and Heco, BNB Smart Chain (BSC), the public chain of BNB Chain, has long pursued high performance. At its launch in 2020, BSC set the gas limit of each block at 30 million, with a stable block interval of 3 seconds. With these parameters, BSC achieved a TPS limit of over 100 (measured on a mix of transaction types). In June 2021, the block gas limit was raised to 60 million, but in July of that year a blockchain game called CryptoBlades exploded in popularity on BSC, pushing daily transaction counts above 8 million and sending fees soaring. BSC's throughput bottleneck was clearly still present at that time.
(Data Source: BscScan)
To address network performance issues, BSC raised the block gas limit again, and it then stayed at around 80 to 85 million for a long time. In September 2022, the gas limit of a single BSC block was increased to 120 million, and by the end of the year it rose to 140 million, nearly 5 times the 2020 level. BSC had previously planned to raise the block gas limit to 300 million, but, perhaps because of the heavy burden this would place on validator nodes, the proposal for such large blocks has not been implemented.
(Data Source: YCHARTS)
Later, BNB Chain seemed to shift its focus to the modular/Layer2 track rather than persisting with Layer1 scaling. This intention became increasingly evident from the launch of zkBNB in the second half of last year to Greenfield at the beginning of this year. Out of a strong interest in modular blockchains and Layer2, the author takes opBNB as the research object and uses the differences between opBNB and Ethereum Layer2 to reveal the performance bottlenecks of Rollup.
The High Throughput of BSC Enhances the DA Layer of opBNB
As is well known, Celestia has summarized four key components according to the workflow of modular blockchain:
- Execution Layer: The execution environment for executing contract code and completing state transitions.
- Settlement Layer: Handling fraud proofs/validity proofs while addressing bridging issues between L2 and L1.
- Consensus Layer: Achieving consensus on the ordering of transactions.
- Data Availability Layer (DA): Publishing data related to the blockchain ledger so that validators can download this data.
Among them, the DA layer is often coupled with the consensus layer. For example, the DA data of optimistic Rollup contains a sequence of transactions from a batch of L2 blocks. When an L2 full node obtains the DA data, it actually knows the order of each Tx in that batch of transactions. (This is why the Ethereum community considers the DA layer and consensus layer to be related when layering Rollup.)
However, for Ethereum Layer2, the data throughput of the DA layer (Ethereum) is the biggest bottleneck limiting Rollup performance. Ethereum's data throughput is currently so low that Rollups must hold down their TPS so that the Ethereum mainnet is not overwhelmed by the data generated by L2.
At the same time, low data throughput leaves a large number of transactions pending within the Ethereum network, which drives gas fees to extremely high levels and further increases the data publishing costs of Layer2. As a result, many Layer2 networks have to adopt DA layers outside of Ethereum, such as Celestia, while opBNB directly uses the high throughput of BSC for DA, removing the data publishing bottleneck.
To make this concrete, it helps to look at how a Rollup publishes data. Taking Arbitrum as an example, an Ethereum EOA address controlled by the Layer2 sequencer periodically sends transactions to a designated contract. The packaged L2 transaction data is written into the calldata (the input parameters) of that transaction, which triggers the corresponding on-chain events and leaves a permanent record in the contract logs.
In this way, Layer2 transaction data is permanently stored in Ethereum blocks. Anyone able to run an L2 node can download the corresponding records and parse the data, but Ethereum's own nodes do not execute these L2 transactions. In other words, L2 merely stores transaction data in Ethereum blocks and pays for storage, while the computational cost of executing the transactions is borne by L2's own nodes.
What has been described above is the DA implementation method of Arbitrum, while Optimism involves the sequencer-controlled EOA address transferring to another designated EOA address, carrying a new batch of Layer2 transaction data in the additional data. As for opBNB, which uses the OP Stack, its DA data publishing method is essentially consistent with that of Optimism.
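The publishing flow described above can be sketched in a few lines. This is a toy model, not the actual Arbitrum or OP Stack batcher code: the JSON encoding, the `BATCH_INBOX` address, the sequencer address, and the helper names `build_batch_tx` and `decode_batch_tx` are all assumptions made for illustration, and zlib simply stands in for whatever compression a real sequencer applies.

```python
import json
import zlib

# Hypothetical inbox address; each real rollup defines its own destination.
BATCH_INBOX = "0x" + "ff" * 20


def build_batch_tx(l2_txs, sequencer="0x" + "aa" * 20):
    """Pack an ordered list of L2 transactions into the calldata of one L1 tx."""
    raw = json.dumps(l2_txs, separators=(",", ":")).encode()
    return {
        "from": sequencer,           # EOA controlled by the sequencer
        "to": BATCH_INBOX,           # designated contract (Arbitrum) or EOA (Optimism)
        "value": 0,
        "data": zlib.compress(raw),  # L1 stores these bytes but never executes them
    }


def decode_batch_tx(l1_tx):
    """What an L2 node does: read the calldata and recover the ordered batch."""
    return json.loads(zlib.decompress(l1_tx["data"]))


l2_txs = [
    {"from": "alice", "to": "bob", "value": 1, "nonce": 0},
    {"from": "bob", "to": "carol", "value": 2, "nonce": 0},
]
l1_tx = build_batch_tx(l2_txs)
recovered = decode_batch_tx(l1_tx)
print(recovered == l2_txs)  # order and content survive the round trip
```

The point of the sketch is the division of labor: the L1 only has to store and order the `data` bytes, while decoding and executing the batch is left entirely to L2 nodes.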
Clearly, the throughput of the DA layer limits how much data a Rollup can publish per unit of time, and therefore limits TPS. After EIP-1559, the gas limit of each Ethereum block stabilizes at 30 million, and the block time after the Merge is about 12 seconds, so Ethereum can process at most 2.5 million gas per second.
Most of the time, the calldata carrying L2 transaction data costs 16 gas per byte (the price of a non-zero byte), so Ethereum can handle at most about 150 KB of calldata per second. In contrast, BSC can handle approximately 2910 KB of calldata per second, about 18.6 times that of Ethereum. The difference between the two as DA layers is evident.
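The arithmetic behind these figures can be reproduced with a short calculation. The sketch below assumes every block is filled entirely with non-zero calldata at 16 gas per byte and ignores all other overhead, so it is an upper bound rather than a measurement; the function name is illustrative. The unrounded results land close to the rounded 150 KB, 2910 KB, and 18.6x figures quoted in the text.

```python
CALLDATA_GAS_PER_BYTE = 16  # non-zero calldata byte, as used in the text


def calldata_bytes_per_second(block_gas_limit, block_time_s):
    """Upper bound on calldata bytes per second if blocks held nothing but calldata."""
    gas_per_second = block_gas_limit / block_time_s
    return gas_per_second / CALLDATA_GAS_PER_BYTE


eth = calldata_bytes_per_second(30_000_000, 12)   # 156,250 bytes/s
bsc = calldata_bytes_per_second(140_000_000, 3)   # about 2,916,667 bytes/s

print(f"Ethereum: {eth / 1000:.1f} KB/s")
print(f"BSC:      {bsc / 1000:.1f} KB/s")
print(f"ratio:    {bsc / eth:.2f}x")
```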
To summarize, Ethereum can carry at most about 150 KB of L2 transaction data per second. Even after EIP-4844 launches, this number will not change significantly; the upgrade mainly reduces DA transaction fees. So roughly how many transactions fit into 150 KB per second?
Here, it is necessary to explain the data compression rate of Rollup. Vitalik's 2021 estimates were overly optimistic, suggesting that an optimistic Rollup could compress transaction data to 11% of its original size. For example, a basic ETH transfer originally occupies 112 bytes of calldata but could be compressed to 12 bytes, an ERC-20 transfer to 16 bytes, and a Uniswap swap to 14 bytes. By his estimate, Ethereum could record around 10,000 L2 transactions per second (mixed types). However, according to data disclosed by Optimism in 2022, the compression rate achieved in practice is only about 37%, meaning the compressed data is more than 3 times larger than Vitalik estimated.
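To see how the per-transaction sizes translate into throughput, the following sketch divides the roughly 150 KB/s of DA bandwidth by the compressed sizes quoted from Vitalik's estimate. It is a back-of-the-envelope bound under the assumption that the whole bandwidth carries a single transaction type; the dictionary and function names are illustrative.

```python
# Compressed per-transaction sizes in bytes, as quoted from the 2021 estimate.
ESTIMATED_COMPRESSED_BYTES = {
    "eth_transfer": 12,
    "erc20_transfer": 16,
    "uniswap_swap": 14,
}
RAW_ETH_TRANSFER_BYTES = 112
DA_BYTES_PER_SECOND = 150_000  # the article's rounded Ethereum figure


def tx_per_second(bytes_per_tx, da_bytes_per_second=DA_BYTES_PER_SECOND):
    """How many transactions of a given size fit into the DA bandwidth."""
    return da_bytes_per_second / bytes_per_tx


# Compression ratio of a basic transfer: 12 / 112, i.e. about 11%.
transfer_ratio = ESTIMATED_COMPRESSED_BYTES["eth_transfer"] / RAW_ETH_TRANSFER_BYTES
print(f"transfer compression ratio: {transfer_ratio:.1%}")

for name, size in ESTIMATED_COMPRESSED_BYTES.items():
    print(f"{name}: {tx_per_second(size):,.0f} tx/s")
```

The three results straddle 10,000 tx/s, which is consistent with the "around 10,000 (mixed types)" figure attributed to the estimate.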
(Vitalik's estimation of Rollup's scaling effect largely deviated from reality)
(The various compression rates achieved by Optimism's officially published compression algorithms)
Therefore, we can give a reasonable figure: even if Ethereum reaches its throughput limit, the maximum combined TPS of all optimistic Rollups is only a little over 2000. In other words, if the entire space of Ethereum blocks were used to hold data published by optimistic Rollups, divided among Arbitrum, Optimism, Base, Boba, and others, their combined TPS would still fall short of 3000, even with the most efficient compression algorithms. In addition, after EIP-1559 the average gas carried by each block is only 50% of the maximum, so the figures above should be halved. After EIP-4844 launches, the fees for publishing data will drop significantly, but the maximum size of Ethereum blocks will not change much (too large a change would affect the security of the Ethereum main chain), so these estimates will not improve much.
According to data from Arbiscan and Etherscan, a certain batch of transactions from Arbitrum contained 1115 transactions and consumed 1.81 million gas on Ethereum. Based on this calculation, if each block of the DA layer is filled, the theoretical TPS limit of Arbitrum is approximately 1500. Of course, considering the L1 block reorganization issue, Arbitrum cannot publish transaction batches on every Ethereum block, so the above figure is currently just theoretical.
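The estimate above follows directly from the batch figures. A minimal sketch, assuming every Ethereum block could be packed with batches identical to the sampled one (the function name and default parameters are illustrative):

```python
def tps_from_batch(txs_in_batch, batch_gas, block_gas_limit=30_000_000, block_time_s=12):
    """Theoretical TPS if each L1 block were filled with batches like the sample."""
    batches_per_block = block_gas_limit / batch_gas
    return batches_per_block * txs_in_batch / block_time_s


# Sampled Arbitrum batch from the text: 1115 transactions, 1.81 million gas.
arbitrum_tps = tps_from_batch(1115, 1_810_000)
print(f"{arbitrum_tps:.0f} TPS")
```

This gives roughly 1540 TPS, in line with the "approximately 1500" quoted above.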
At the same time, once smart wallets based on ERC-4337 are widely adopted, the DA issue will become even more severe. With ERC-4337, users can customize their authentication methods, for example by uploading binary data from fingerprints or iris scans, which further increases the data size of ordinary transactions. Ethereum's low data throughput is therefore the biggest bottleneck limiting Rollup efficiency, and the issue may remain unresolved for a long time.
In contrast, BSC, the public chain of BNB Chain, can process at most approximately 2910 KB of calldata per second on average, about 18.6 times that of Ethereum. In other words, as long as the execution layer can keep up, the theoretical TPS limit of Layer2 within the BNB Chain ecosystem can reach about 18 times that of ARB or OP. This figure is derived from the current maximum gas limit of 140 million per block on BNB Chain and a block time of 3 seconds.
This means that the total TPS limit of all Rollups under the public chain of BNB Chain is 18.6 times that of Ethereum (even considering ZKRollup, this holds true). From this point, we can also understand why so many Layer2 projects use DA layers outside of Ethereum to publish data, as the differences are evident.
However, the issue is not that simple. Besides the data throughput problem, the stability of Layer1 itself will also affect Layer2. For example, most Rollups often wait several minutes before publishing a batch of transactions on Ethereum, considering the possibility of L1 block reorganization. If an L1 block is reorganized, it will affect the L2 blockchain ledger. Therefore, after the sequencer publishes an L2 transaction batch, it will wait for multiple new L1 blocks to be published, and only after the probability of block rollback significantly decreases will it publish the next L2 transaction batch. This actually delays the time for L2 blocks to be finally confirmed, reducing the confirmation speed for large transactions (large transactions need to be irreversible to ensure safety).
In summary, transactions on L2 only become irreversible after they have been published in a DA layer block and a certain number of new DA layer blocks have been produced on top of it, which is an important factor limiting Rollup performance. Ethereum produces a block only every 12 seconds, so if a Rollup publishes an L2 batch every 15 blocks, there is a 3-minute interval between batches, and each published batch must still wait for multiple L1 blocks before it becomes irreversible (provided it is not challenged). Transactions on Ethereum L2 therefore wait a long time between initiation and irreversibility, and settlement is slow. BNB Chain, by contrast, produces a block every 3 seconds, and a block becomes irreversible in only 45 seconds (the time taken to produce 15 new blocks).
Based on the current parameters, under the premise of the same number of L2 transactions and considering the irreversibility of L1 blocks, the frequency at which opBNB publishes transaction data can reach a maximum of 8.53 times that of Arbitrum (the former publishes every 45 seconds, while the latter publishes every 6.4 minutes). It is evident that the settlement speed for large transactions on opBNB is much faster than that on Ethereum L2. At the same time, the maximum amount of data published by opBNB each time can reach 4.66 times that of Ethereum L2 (the former has an L1 block gas limit of 140 million, while the latter is 30 million).
8.53 × 4.66 ≈ 39.75, so the current practical gap in TPS limits between opBNB and Arbitrum is roughly 40 times (ARB currently seems to hold its TPS down deliberately for safety, but even if it wanted to raise TPS, it would in theory still be many times behind opBNB).
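The two multipliers can be recomputed from the underlying parameters. The sketch below takes the 15-block confirmation depth, the 45-second and 6.4-minute publishing intervals, and the two block gas limits from the text; the constant and function names are illustrative. Using unrounded ratios gives a slightly higher product (about 39.8) than multiplying the truncated figures, but the conclusion of roughly 40x is the same.

```python
def confirmation_window_s(block_time_s, confirmations=15):
    """Time for a given number of new L1 blocks to be produced."""
    return block_time_s * confirmations


eth_window = confirmation_window_s(12)  # 180 s, i.e. 3 minutes
bsc_window = confirmation_window_s(3)   # 45 s

ARBITRUM_BATCH_INTERVAL_S = 6.4 * 60    # publishing interval quoted in the text
OPBNB_BATCH_INTERVAL_S = 45             # fastest opBNB interval quoted in the text

frequency_ratio = ARBITRUM_BATCH_INTERVAL_S / OPBNB_BATCH_INTERVAL_S  # about 8.53
size_ratio = 140_000_000 / 30_000_000                                 # about 4.67
combined = frequency_ratio * size_ratio

print(f"frequency x{frequency_ratio:.2f}, size x{size_ratio:.2f}, combined x{combined:.1f}")
```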
(Arbitrum's sequencer publishes a transaction batch every 6-7 minutes)
(opBNB's sequencer publishes a transaction batch every 1-2 minutes, with the fastest being just 45 seconds)
Of course, other important issues must also be considered, such as the gas fees of the DA layer. Each time L2 publishes a transaction batch, it pays a fixed cost of 21000 gas that is unrelated to the calldata size, and this is also an expense. If DA layer/L1 transaction fees are high, the fixed cost for L2 stays high and the sequencer will publish batches less frequently. In addition, among the components of L2 transaction fees, the cost of the execution layer is very low and can usually be ignored, so only the impact of DA costs on transaction fees needs to be considered.
To summarize, publishing calldata of the same size consumes the same amount of gas on Ethereum and on BNB Chain, but the gas price charged by Ethereum is about 10 to several dozen times that of BNB Chain, and this carries over into L2 transaction fees. Currently, user fees on Ethereum Layer2 are likewise about 10 to several dozen times those on opBNB. Overall, the differences between opBNB and optimistic Rollups on Ethereum are still quite significant.
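A simple cost model shows why the fixed cost matters for batch frequency. The sketch below is a simplification under stated assumptions: a batch costs only the 21000 base gas plus 16 gas per calldata byte (real batch submissions, especially to a contract, consume additional gas), and `BYTES_PER_TX = 40` is a hypothetical compressed transaction size, not a figure from the text.

```python
TX_BASE_GAS = 21_000         # fixed cost of one L1 transaction, independent of calldata size
CALLDATA_GAS_PER_BYTE = 16   # non-zero calldata byte


def da_gas_per_l2_tx(txs_in_batch, bytes_per_tx):
    """L1 gas attributable to each L2 transaction in one published batch."""
    batch_gas = TX_BASE_GAS + CALLDATA_GAS_PER_BYTE * bytes_per_tx * txs_in_batch
    return batch_gas / txs_in_batch


def da_fee_per_l2_tx(txs_in_batch, bytes_per_tx, gas_price_gwei):
    """The same quantity priced in the L1's native token (1 gwei = 1e-9 token)."""
    return da_gas_per_l2_tx(txs_in_batch, bytes_per_tx) * gas_price_gwei * 1e-9


BYTES_PER_TX = 40  # hypothetical compressed size, for illustration only
for batch_size in (10, 100, 1000):
    print(batch_size, da_gas_per_l2_tx(batch_size, BYTES_PER_TX))
```

With small batches the fixed 21000 gas dominates the per-transaction cost, so a sequencer facing a high L1 gas price has an incentive to wait and publish larger, less frequent batches.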
(A transaction on Optimism consuming 150,000 gas, with a fee of $0.21)
(A transaction on opBNB consuming 130,000 gas, with a fee of $0.004)
However, while expanding the data throughput of the DA layer can enhance the overall throughput of the Layer2 system, the performance improvement for individual Rollups is still limited, as the execution layer's speed in processing transactions is often not fast enough. Even if the limitations of the DA layer can be ignored, the execution layer will become the next bottleneck affecting Rollup performance. If the execution layer of Layer2 is very slow, the overflow of transaction demand will spread to other Layer2s, ultimately causing a fragmentation of liquidity. Therefore, improving the performance of the execution layer is also crucial, representing another hurdle above the DA layer.
The Enhancement of opBNB in the Execution Layer: Cache Optimization
When people discuss the performance bottlenecks of blockchain execution layers, two points inevitably come up: the EVM's single-threaded serial execution cannot fully utilize CPU resources, and data lookups in the Merkle Patricia Trie used by Ethereum are too inefficient. In simple terms, scaling the execution layer means making better use of CPU resources and letting the CPU reach data as quickly as possible. Optimizations for the EVM's serial execution and for the Merkle Patricia Trie are often complex and hard to implement, so the more cost-effective efforts tend to focus on caching.
In fact, the issue of cache optimization returns to points often discussed in traditional Web2 and even textbooks.
Typically, the CPU reads data from memory hundreds of times faster than from disk. To illustrate the scale of the gap: if reading a piece of data from memory took 0.1 seconds, reading it from disk would take 10 seconds or more. Reducing the overhead of disk reads and writes, that is, cache optimization, is therefore a necessary part of optimizing the blockchain execution layer.
In Ethereum and most public chains, the database recording the on-chain address states is fully stored on disk, while the so-called world state trie is merely an index of this database, or a directory used when looking up data. Every time the EVM executes a contract, it needs to obtain the relevant address state. If the data must be retrieved one by one from the database stored on disk, it will significantly slow down the transaction execution speed. Thus, setting up a cache outside the database/disk is a necessary means of speeding up.
opBNB directly adopts the cache optimization scheme used by BNB Chain. According to information disclosed by opBNB's partner NodeReal, the earliest BSC chain set up three layers of cache between the EVM and the LevelDB database storing the states, with a design concept similar to traditional three-level caching, placing frequently accessed data in the cache so that the CPU can first look for the needed data in the cache. If the cache hit rate is high enough, the CPU will not need to overly rely on the disk to obtain data, significantly speeding up the entire execution process.
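The effect of placing a cache between the EVM and the state database can be illustrated with a single-layer, read-through LRU cache. This is a generic sketch, not BSC's actual three-layer design: `SlowStateDB`, `CachedState`, and the LRU eviction policy are assumptions chosen for illustration.

```python
from collections import OrderedDict


class SlowStateDB:
    """Stand-in for the on-disk state database; counts how often it is read."""

    def __init__(self, data):
        self._data = dict(data)
        self.disk_reads = 0

    def get(self, address):
        self.disk_reads += 1
        return self._data.get(address)


class CachedState:
    """Read-through LRU cache sitting between the executor and the database."""

    def __init__(self, db, capacity=1024):
        self.db = db
        self.capacity = capacity
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, address):
        if address in self.cache:
            self.cache.move_to_end(address)   # mark as most recently used
            self.hits += 1
            return self.cache[address]
        self.misses += 1
        value = self.db.get(address)          # slow path: go to disk
        self.cache[address] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict the least recently used entry
        return value


db = SlowStateDB({"0xa": 100, "0xb": 5})
state = CachedState(db, capacity=2)
for addr in ["0xa", "0xb", "0xa", "0xa", "0xb"]:
    state.get(addr)

print(state.hits, state.misses, db.disk_reads)
```

Of the five lookups, only the first access to each address reaches the database; the higher the hit rate, the less the executor depends on disk.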
Later, NodeReal added a feature that utilizes idle CPU cores not occupied by the EVM to pre-read data that the EVM will need to process in the future from the database and place it in the cache, allowing the EVM to directly access the needed data from the cache in the future. This feature is called "state pre-reading."
The principle of state pre-reading is quite simple: the CPU of blockchain nodes is multi-core, while the EVM operates in a single-threaded serial execution mode, utilizing only one CPU core, leaving other CPU cores underutilized. To address this, we can have the unused CPU cores assist by identifying what data the EVM will need from the sequence of transactions it has yet to process. Then, these CPU cores outside the EVM will read the data that the EVM will use from the database, helping the EVM reduce the overhead of data retrieval and improve execution speed.
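The idea can be sketched as follows. This is a toy model, not NodeReal's implementation: the in-memory `DISK` dictionary stands in for the state database, transactions are plain balance transfers whose touched accounts are known in advance, and Python threads are used only to show the structure (in a real node the helper work would run on otherwise idle CPU cores).

```python
from concurrent.futures import ThreadPoolExecutor

DISK = {"alice": 10, "bob": 0, "carol": 7}  # stand-in for the on-disk state database


def read_from_disk(address):
    """Slow path: in a real node this would be a database lookup."""
    return DISK.get(address, 0)


def prefetch(pending_txs, cache, workers=4):
    """Helper threads warm the cache with every account the pending txs touch."""
    addresses = sorted({a for tx in pending_txs for a in (tx["from"], tx["to"])})
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for address, balance in zip(addresses, pool.map(read_from_disk, addresses)):
            cache.setdefault(address, balance)


def execute_serially(pending_txs, cache):
    """Single-threaded execution; returns how many reads missed the cache."""
    misses = 0
    for tx in pending_txs:
        for address in (tx["from"], tx["to"]):
            if address not in cache:
                cache[address] = read_from_disk(address)
                misses += 1
        if cache[tx["from"]] >= tx["value"]:
            cache[tx["from"]] -= tx["value"]
            cache[tx["to"]] += tx["value"]
    return misses


txs = [
    {"from": "alice", "to": "bob", "value": 3},
    {"from": "carol", "to": "alice", "value": 2},
]

cold_cache = {}
cold_misses = execute_serially(txs, cold_cache)   # every first access goes to disk

warm_cache = {}
prefetch(txs, warm_cache)                         # done ahead of execution
warm_misses = execute_serially(txs, warm_cache)   # serial executor never touches disk

print(cold_misses, warm_misses, cold_cache == warm_cache)
```

Both runs end in the same state; the only difference is that with pre-reading the serial executor finds everything it needs already in the cache.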
After fully optimizing the cache and pairing it with sufficiently powerful hardware configurations, opBNB has actually brought the performance of the node execution layer close to the limits of EVM: it can process up to 100 million gas per second. 100 million gas is essentially the performance ceiling of an unmodified EVM (based on experimental test data from a certain prominent public chain).
To summarize, opBNB can process a maximum of 4761 of the simplest transfers per second, handle 1500 to 3000 ERC20 transfers, and process about 500 to 1000 SWAP operations (these data are obtained from transaction data on the block explorer). Based on the current parameter comparisons, the TPS limit of opBNB is 40 times that of Ethereum, more than 2 times that of BNB Chain, and over 6 times that of Optimism.
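These figures follow from dividing gas throughput by the gas cost per transaction. The sketch below uses the 100 million gas per second execution ceiling and the L1 parameters given earlier in the article; 21000 gas is the cost of the simplest transfer, and the per-transaction gas implied for ERC20 transfers and swaps is back-calculated from the TPS ranges quoted above rather than measured.

```python
TRANSFER_GAS = 21_000  # gas cost of the simplest transfer


def tps(gas_per_second, gas_per_tx=TRANSFER_GAS):
    """Transactions per second for a given gas throughput and per-tx gas cost."""
    return gas_per_second / gas_per_tx


opbnb_tps = tps(100_000_000)         # execution ceiling quoted for opBNB
eth_tps = tps(30_000_000 / 12)       # Ethereum: 30M gas every 12 s
bsc_tps = tps(140_000_000 / 3)       # BSC: 140M gas every 3 s

print(f"opBNB {opbnb_tps:.0f}, Ethereum {eth_tps:.0f}, BSC {bsc_tps:.0f}")
print(f"opBNB / Ethereum = {opbnb_tps / eth_tps:.1f}x")
print(f"opBNB / BSC      = {opbnb_tps / bsc_tps:.2f}x")

# Gas per transaction implied by the quoted TPS ranges at 100M gas/s.
for label, (low_tps, high_tps) in {"ERC20": (1500, 3000), "swap": (500, 1000)}.items():
    print(label, round(100_000_000 / high_tps), "to", round(100_000_000 / low_tps), "gas")
```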
Of course, Ethereum Layer2 cannot fully utilize its execution layer because of the severe limitations of the DA layer. Once the DA layer's block time, stability, and the other factors mentioned earlier are taken into account, the actual performance of Ethereum Layer2 falls well below what its execution layer could deliver. For a high-throughput DA layer like BNB Chain, the more than 2x scaling provided by opBNB is very valuable, especially since BNB Chain can support multiple such scaling projects.
It is foreseeable that BNB Chain has already included Layer2 solutions led by opBNB in its roadmap and will continue to incorporate more modular blockchain projects, including introducing ZK proofs into opBNB and pairing it with supporting infrastructure such as Greenfield to provide a highly available DA layer, in an attempt to compete or cooperate with the Ethereum Layer2 ecosystem. With layered scaling now the trend, it remains to be seen whether other public chains will rush to follow BNB Chain in backing their own Layer2 projects, but there is no doubt that a paradigm shift in infrastructure, with modular blockchains as the overarching direction, is already under way.