Scroll Research: Design Challenges and Solutions of zkEVM

Scroll
2022-05-03 16:38:09
Collection
This article discusses the design challenges and feasibility of zkEVM, and presents a detailed plan for building zkEVM from scratch.

Source: Scroll Tech

Compiled by: Biscuit, Chain Catcher

Overview

zk-Rollup is a very cheap and secure Layer 2 scaling solution for Ethereum. However, existing zk-Rollups are limited to specific applications, making it difficult for developers to build general-purpose composable DApps and migrate existing applications to zk-Rollups. We build a fully EVM-compatible zk-Rollup by introducing zkEVM to generate zk proofs for EVM verification, allowing any Ethereum application to easily migrate to this zk-Rollup.

This article discusses the design challenges and feasibility of zkEVM and presents a detailed plan for building zkEVM from scratch.

Background

zk-Rollup is widely recognized as the best scaling solution for Ethereum. It not only has the security of Ethereum Layer 1 but also offers the fastest transaction speeds compared to all other Layer 2 solutions.

In the medium to long term, ZK rollups will win out in all use cases as ZK-SNARK technology improves. --- Vitalik Buterin

The basic idea of zk-Rollup is to aggregate a large number of transactions into a single Rollup block and generate succinct proofs for off-chain blocks. The smart contract on Layer 1 only needs to verify the zk-Rollup's proof and directly update the state without re-executing those transactions. Verifying proofs is much cheaper in gas fees than re-executing computations, and data compression (i.e., retaining only the minimal on-chain data for verification) helps reduce gas costs, saving an order of magnitude in gas fees.

Despite the high security and efficiency of zk-Rollup, it is challenging to build general-purpose DApps, and its applications are still limited to payments and swaps for two main reasons:

First, developing DApps in zk-Rollup requires using a special programming language (i.e., R1CS) to write the logic of smart contracts. The syntax of this programming language is complex, and developers must be proficient in zero-knowledge proofs.

Second, current zk-Rollups do not support composability[1]. This means multiple zk-Rollup applications cannot interact with each other within Layer 2, significantly reducing the composability of DeFi applications.

In short, current zk-Rollups are not developer-friendly and have limited functionality. We aim to address these issues by directly supporting native EVM verification, providing developers with a fast development experience, and supporting composability of applications within Layer 2 so that existing Ethereum applications can easily migrate to zk-Rollup.

Building General-Purpose DApps in zk-Rollup

There are two methods for building general-purpose DApps in zk-Rollup.

  • Build dedicated circuits ("ASIC") for different DApps
  • Build a general "EVM" encoding for smart contract execution

The term "circuit" refers to the program representation used in zero-knowledge proofs. For example, to prove hash(x) = y, one would need to rewrite the hash function using an ASIC circuit. Circuits only support a very limited set of computational expressions (e.g., R1CS only supports addition (add) and multiplication (mul)). Therefore, the process of writing programs using circuit languages is very difficult, as all program logic (including if-else, loops, etc.) must be constructed using add and mul.

The first method requires developers to design dedicated "ASIC" circuits for different DApps, which necessitates using the most primitive form of zero-knowledge proofs. By designing custom circuits, the cost for each DApp can be reduced. However, since circuits are "static," they cannot provide composability for applications and require specialized circuit design knowledge, leading to a poor developer experience[2].

The second method does not require any special design or circuit expertise. This high-level idea based on machine proofs allows any program to run on a CPU, so developers only need to build a general CPU circuit to verify the underlying steps. This CPU circuit is then used to verify the execution of the program. In this scenario, the program refers to the smart contract, and the CPU is the EVM. However, due to high costs, this approach has not been widely adopted in recent years. For instance, even if a developer only wants to add a verification step, they must bear the cost of the entire EVM circuit. If there are thousands of steps in the execution trace, the cost of the EVM circuit will be 1000 times[3].

Recently, there has been a lot of research optimizing zk proofs according to these two methods, including:

(i) Proposing a new zero-knowledge proof-friendly language Poseidon hash, which is 100 times more efficient in circuits than SHA256.

(ii) General verifiable virtual machines, such as TinyRAM.

(iii) An increasing number of general optimization techniques, such as Plookup, and faster cryptographic libraries.

We previously suggested designing "ASIC" circuits for each DApp and communicating through cryptographic verification. However, based on community feedback, we have shifted our priorities to focus on the second method, prioritizing the construction of a general EVM circuit (the so-called "zkEVM"). zkEVM will support the same development experience as Layer 1. We will not leave the complexity of the underlying design to developers but will address efficiency issues through custom EVM circuit design.

Design Challenges of zkEVM

Building zkEVM is challenging; unlike TinyRAM, the design and implementation of zkEVM are more complex for the following reasons:

First, EVM has limited support for elliptic curves. Currently, EVM only supports BN254 pairing and does not directly support cyclic elliptic curves, making recursive proofs difficult. Other specialized protocols are also hard to use under this limitation unless they are EVM-compatible verification algorithms.

Second, EVM has a word size of 256 bits. EVM operates on 256-bit integers (most conventional VMs operate on 32-64 bit integers), while zk proofs work over finite fields. Performing "mismatched field arithmetic" internally in the circuit requires range proofs, which will add approximately 100 constraints to each EVM step, causing the size of the EVM circuit to increase by two orders of magnitude.

Third, EVM has many special opcodes. Unlike traditional VMs, EVM has many special opcodes, such as CALL, as well as error types related to execution context and gas. This presents new challenges for circuit design.

Fourth, EVM is a stack-based virtual machine. SyncVM (zksync) and Cario (starkware) architectures are based on a register model and define specific IR/AIR. A specialized compiler is needed to compile smart contract code into a new zk-friendly IR. This approach is language-compatible rather than natively EVM-compatible, making it more difficult to implement a stack-based model with direct support for native toolchains.

Fifth, the cost of Ethereum's storage layout is too high. Ethereum's storage layout heavily relies on Keccak and MPT[4], both of which are not zk-friendly types and incur high costs. For example, the circuit size for Keccak hash is 1000 times that of Poseidon hash. However, replacing Keccak with another hash would create some compatibility issues with existing Ethereum infrastructure.

Sixth, machine-based proofs have high costs. Even if all the above issues can be properly addressed, an effective way to combine them into a complete EVM circuit still needs to be found. As I mentioned in the previous section, a simple opcode like add could lead to the entire EVM circuit's cost.

Why is zkEVM potentially achievable?

Thanks to the significant progress made by researchers in this area, many efficiency issues have been resolved in the past two years, proving the feasibility of zkEVM! The main technical advancements come from several aspects:

The use of polynomial commitments. In recent years, most zero-knowledge proof protocols have been using R1CS, and PCP queries have been encoded into application-specific trusted setups. This situation often causes circuits to exceed their load and prevents custom optimizations, as the degree of each constraint needs to be 2 (bilinear pairing only allows one multiplication in the exponent). Developers using polynomial commitment schemes can elevate constraints to any degree through universal or transparent setups, greatly enhancing the flexibility of backend choices.

Data table query parameters and custom gadgets. Optimizing data table query parameters was first proposed in Arya and then optimized in Plookup. This can save a lot of bitwise operations for zk-unfriendly programming statements (e.g., AND, XOR, etc.). Custom gadgets allow developers to efficiently perform highly constrained operations. TurboPlonk and UltraPlonk define elegant syntax to make it easier for developers to use query data tables and custom gadgets. This is very helpful for reducing the cost of EVM circuits.

The feasibility of recursive proofs is increasing. In the past, recursive proofs relied on special cyclic elliptic curves (i.e., constructions based on MNT curves), which required significant costs. More technologies are currently changing this dependency without sacrificing efficiency. For example, Halo can use pairing-friendly elliptic curves and special inner product parameters to amortize recursive costs. Aztec shows that it is possible to aggregate proofs directly on existing protocols (query data tables can reduce the cost of non-native field operations, thereby lowering the cost of verification circuits). This method can greatly enhance the scalability of circuit load.

Hardware acceleration is becoming more efficient. We have manufactured GPU and ASIC/FPGA accelerators, and papers on ASIC provers have been accepted by the largest computer conference (ISCA). GPU provers are about 5 to 10 times faster than Filecoin's implementation. This will significantly improve the computational efficiency of provers.

How does zkEVM work, and how do we build it?

In addition to strong intuition and technical improvements, we also need to be clearer about what we need to prove and develop a more specific architecture. We will introduce more technical details and comparisons in subsequent articles. Here, we describe the overall workflow and some key ideas.


Workflow for Developers and Users

Developers can use any EVM-compatible language to run smart contracts and deploy the compiled bytecode on Scroll. Then, users can send transactions to interact with the smart contracts. The experience for users and developers will be exactly the same as Layer 1. However, gas fees will be significantly lower, and transaction orders on Scroll will be instantaneously pre-confirmed (withdrawals can be completed in just a few minutes).

Workflow of zkEVM

Even though the apparent workflows of Layer 1 and Layer 2 do not differ much, their underlying processing processes are completely different: Layer 1 relies on the re-execution of smart contracts; Layer 2 relies on the validity proof of the zkEVM circuit.

Let’s explain in more detail how transactions in Layer 1 and Layer 2 differ.

In Layer 1, the bytecode of the smart contract is stored in Ethereum storage, and transactions will be broadcast in the P2P network. For each transaction, every full node needs to load the corresponding bytecode and execute it on the EVM to reach the same state (the transaction will serve as input data).

The bytecode of Layer 2 smart contracts is also operated in the same way, but the subsequent step is that transactions will be sent off-chain to a centralized zkEVM node. The zkEVM then executes the bytecode and generates a succinct proof to demonstrate that the state has been correctly updated after the application transaction. Finally, the Layer 1 contract will verify the proof and update the state without re-executing the transaction.

Let’s delve into the execution process to see what zkEVM ultimately needs to prove. In native execution, the EVM first loads the bytecode and then sequentially executes the opcodes in the bytecode from scratch, performing the following three sub-steps for each opcode:

(i) Read elements from the stack, memory, or storage

(ii) Perform some calculations on these elements

(iii) Write the results back to the stack, memory, or storage[5]. For example, the add opcode needs to read two elements from the stack, add them, and write the result back to the stack.

Thus, the proof for zkEVM needs to include the following aspects corresponding to the execution process:

  • The bytecode is correctly loaded from storage (so that the virtual machine can correctly execute the opcode loaded from the given address)
  • The opcodes in the bytecode are executed one by one (the bytecode is executed in order, without losing or skipping any opcodes)
  • Each opcode is executed correctly (the three sub-steps in each opcode are correctly executed for reading, writing, and calculating)

Design Highlights of zkEVM

When designing the architecture of zkEVM, we need to address/solve the three issues mentioned above.

First, design a circuit for cryptographic verifiers. This part acts like a "verifiable storage," ensuring the correctness of verification results through the use of cryptographic verifiers[6]. For example, in a Merkle tree, the deployed bytecode will be stored as leaf nodes in the Merkle tree. The verifier can then use succinct proofs to verify the bytecode loaded from a given address (i.e., verifying the Merkle path in the circuit). For Ethereum storage, the circuit needs to be compatible with Merkle Patricia Trie and the Keccak hash function.

Second, design a circuit to associate the bytecode with actual execution. Transferring bytecode to a static circuit introduces a problem: conditional opcodes like jump (corresponding to loops and if-else statements in smart contracts) may jump to any location. The destination of the jump is uncertain until someone runs the bytecode with specific inputs. This is why we need to verify the actual execution trace. The execution trace can be thought of as "unfolded bytecode," containing opcodes arranged in the order of actual execution (i.e., if you jump to another location, the trace will include that target opcode and position).

The prover will directly provide the execution trace as witness data for the circuit. We need to prove that this execution trace is "unfolded" from specific bytecode using specific inputs, with the aim of enforcing consistency in the value of the program counter. To address the uncertainty of the destination, the solution is to have the prover provide all necessary data. Then, consistency can be efficiently checked through lookup parameters (i.e., proving that the opcode with the accurate global counter is included in the "bus").

Third, design circuits for each opcode (proving that the reading, writing, and calculations in each opcode are correct). This is the most critical part, proving that each opcode in the execution trace is correct and consistent. If everything is directly combined, it will incur high costs. The optimization plan here is:

1) We will separate reading, writing, and calculations into two proofs. One proof will place all elements used by the opcodes onto the "bus," while the other proof will demonstrate that the calculations on the elements on the "bus" are executed correctly. This will significantly reduce the cost for each part (when generating the calculation proof, the entire EVM storage does not need to be considered). The former is called "state proof," and the latter is called "EVM proof." Another finding is that lookup declarations can effectively handle "bus mapping."

2) We can design higher-degree customized constraints for each opcode (i.e., we can split an EVM word into multiple data chunks for more efficient processing). We can choose whether to "open" a constraint based on demand through a selector polynomial. This can avoid incurring the cost of the entire EVM circuit for each operation.

This architecture was initially proposed by the Ethereum Foundation and is still in its early stages, actively being developed. We are closely collaborating with the Ethereum Foundation to find the best way to implement this EVM circuit. So far, we have defined the most important features of the EVM circuit and implemented some opcodes (using the UltraPlonk syntax in the Halo2 library). More detailed content will be introduced in subsequent articles. We encourage interested readers to read this document. The development process will be transparent. This will be a fully open-source design that harnesses the collective strength of the entire community. We hope more people will join in and contribute.

What else can zkEVM bring?

zkEVM is far more than just Layer 2 scaling. We can understand it as a direct way to extend Ethereum Layer 1 through Layer 1 validity proofs. This means that existing Layer 1 can be scaled without any special Layer 2.

For example, developers can use zkEVM as a full node, and the proof can be used to directly prove the transitions between existing states. All Layer 1 transactions do not need to migrate anything to Layer 2; you can prove directly! More broadly, you can use zkEVM to generate succinct proofs for the entire Ethereum, just like Mina. The only thing that needs to be added is proof recursion (embedding the verification circuit of blocks into zkEVM)[7].

Conclusion

zkEVM can provide developers and users with the same experience. Without sacrificing security, it is several orders of magnitude cheaper. An architecture has been proposed to build it in a modular way. It leverages recent breakthroughs in zero-knowledge proofs to reduce costs (including custom constraints, lookup parameters, proof recursion, and hardware acceleration). We look forward to seeing more people join the zkEVM community and brainstorm with us!

Notes:

[1]: Starkware announced on September 1, 2021, that it has achieved composability.

[2]: Circuits are fixed and static. For example, when implementing a program as a circuit, you cannot use variable upper limit loops. The upper limit must be fixed to a maximum value. Circuits cannot handle dynamic logic.

[3]: For the sake of reader understanding, we elaborate on the cost of the EVM circuit here. As mentioned earlier, circuits are fixed and static. Therefore, the EVM circuit needs to include all possible logic (this volume is 10,000 times that of a circuit containing only add). This means that even if you only want to prove add, you still need to bear the cost of all logic that may be included in that EVM circuit. In other words, the cost is magnified by 10,000 times. In the execution trace, you need to prove a series of opcodes, and each opcode incurs high costs.

[4]: The EVM itself is not tightly bound to the Merkle-Patricia Tree (MPT). Currently, MPT is only used for storing Ethereum states. It is easy to change (some have proposed using Verkle trees to replace MPT).

[5]: This is a highly simplified abstract concept. Technically, the list of "EVM states" is longer, including the program counter, gas remaining, call stack (all of the above plus the address and static of each call in the stack), a set of logs, and transaction range variables (hot storage slots, refunds, self-destruct). We can also introduce identifiers for different calling environments to directly support composability.
[6]: Due to the large storage volume, we use accumulators for storage. Memory and stack can use editable Plookup (we can effectively implement "RAM" this way).

[7]: Adding a complete recursive proof into the zkEVM circuit is not an easy task. The best way to implement recursion is still to use cyclic elliptic curves (i.e., Pasta curves). We need to introduce some kind of "wrapping" process to make recursion verifiable on Ethereum Layer 1.

Related tags
ChainCatcher reminds readers to view blockchain rationally, enhance risk awareness, and be cautious of various virtual token issuances and speculations. All content on this site is solely market information or related party opinions, and does not constitute any form of investment advice. If you find sensitive information in the content, please click "Report", and we will handle it promptly.
ChainCatcher Building the Web3 world with innovators