Scroll founder Zhang Ye: zkEVM design, optimization, and application

2023-04-20 12:10:49

Collection

Scroll is building a general scaling solution for Ethereum using zkRollup, employing very advanced arithmetic circuits and proof systems, and constructing fast validators with hardware acceleration to prove recursion.

Original Author: Ye Zhang

Original Source: Scroll CN

Compiled by: F.F

In the latest ZKP Mooc course, Ye Zhang, co-founder of Scroll, delivered a talk on the design, optimization, and application of zkEVM. Scroll is building an Ethereum-equivalent ZK-Rollup, compatible at the bytecode level, directly supporting all existing tools.

Below is the transcript of the video.

The talk is divided into four parts. In the first part, Ye Zhang introduced the development background and explained why we need zkEVM and why it has become so popular in the past two years. The second part explained how to build zkEVM from scratch through a complete process, including arithmetic and proof systems. The third part discussed the challenges Scroll faced while building zkEVM through some interesting research questions. Finally, the last part introduced some other applications using zkEVM.

Application

Background and Motivation

Application

Traditional Layer 1 blockchains have some nodes that jointly maintain the network through a P2P network. When they receive user transactions, they execute them within the EVM virtual machine, read the called contracts and storage, and update the global state tree according to the transactions.

Application

The advantage of this architecture lies in decentralization and security, while the drawback is that transaction fees on L1 are expensive, and transaction confirmations are slow.

Application

In the architecture of ZK-Rollup, the L2 network only needs to upload data and proofs of data validity to L1, where the proofs are computed through zero-knowledge proof circuits.

Application

In early ZK-Rollups, the circuits were designed for specific applications, requiring users to send transactions to different provers, and then the ZK-Rollups of different applications would submit their data and proofs to L1. This led to the problem of losing the composability of the original L1 contracts.

Application

What Scroll aims to do is a native zkEVM solution, which is a general-purpose ZK-Rollup. This is not only more user-friendly but also allows developers to gain a development experience similar to that on L1. Of course, the development difficulty behind this is immense, and the current cost of proof generation is also very high.

Application

Fortunately, the efficiency of zero-knowledge proofs has significantly improved over the past two years, which is why zkEVM has become so popular recently. There are at least four reasons that make it feasible: first, the emergence of polynomial commitments, which under the original Groth16 proof system had a very large constraint size, while polynomial commitments can support higher-order constraints, reducing proof size; second, the emergence of lookup tables and custom gates, which can support more flexible designs, making proofs more efficient; third, breakthroughs in hardware acceleration, where GPUs, FPGAs, and ASICs can reduce proof times by 1-2 orders of magnitude; and fourth, under recursive proofs, multiple proofs can be compressed into one, making the proof smaller and easier to verify. Therefore, combining these four factors, the generation efficiency of zero-knowledge proofs is three orders of magnitude higher than it was two years ago, which is also the origin of Scroll.

Application

According to Justin Drake's definition, zkEVM can be divided into three categories. The first category is language-level compatibility, mainly because EVM was not designed for ZK, and there are many opcodes that are not friendly to ZK, leading to a lot of overhead. Therefore, Starkware and zkSync choose to compile Solidity or Yul into ZK-friendly compilers at the language level.

The second category is the bytecode-level compatibility that Scroll is working on, which directly proves whether the EVM bytecode is processed correctly, inheriting Ethereum's execution environment. Some trade-offs that can be made here include using a different state root than EVM, such as using ZK-friendly data structures. Hermez and Consensys are also doing similar things.

The third category is consensus-level compatibility, where the trade-off is that not only does the EVM need to remain unchanged, but also the storage structure and other implementations must be fully compatible with Ethereum, at the cost of longer proof times. Scroll is collaborating with the Ethereum Foundation's PSE team to achieve Ethereum's ZKification.

Application

Building zkEVM from 0 to 1

Application

In the second part, Ye Zhang demonstrated how to build zkEVM from scratch.

Complete Process

First, in the frontend part of ZKP, you need to represent your computation through mathematical arithmetic, the most commonly used being linear R1CS, as well as Plonkish and AIR. After obtaining the constraints through arithmetic, in the backend of ZKP, you need to run the proof algorithm to prove the correctness of the computation. Here, the most commonly used are Polynomial Interactive Oracle Proofs (Polynomial IOP) and Polynomial Commitment Schemes (PCS).

Application

Here, we need to prove zkEVM, and Scroll uses a combination of Plonkish, Plonk IOP, and KZG.

Application

To understand why we use these three solutions, we start with the simplest R1CS. The constraints in R1CS are linear combinations multiplied by linear combinations equal to a linear combination. You can add any linear combination of variables without additional overhead, but the maximum degree in each constraint is 2. Therefore, for operations with higher degrees, more constraints are required.

Application

In Plonkish, you need to fill in all variables in a table, including inputs, outputs, and witness for intermediate variables. On top of this, you can define different constraints. There are three types of constraints that can be used in Plonkish.

Application

The first type of constraint is Custom Gate, where you can define polynomial constraint relationships between different cells, for example, va3 * vb3 * vc3 - vb4 = 0. Compared to R1CS, the degree can be higher because you can define constraints for any variable and can define very different constraints.

Application

The second type of constraint is Permutation, which is used for equality checks. It can be used to check the equivalence of different cells, commonly used in associative circuits between different gates, such as proving that the output of the previous gate equals the input of the next gate.

Application

The last type of constraint is Lookup Table. We can understand the lookup table as a relationship between variables that can be represented as a table. For example, if we want to prove that vc7 is within the range of 0-15, in R1CS, you would first need to decompose this value into 4 bits of binary and then prove that each bit is within the range of 0-1, which would require four constraints. In Plonkish, you can list all possible ranges in the same column and only need to prove that vc7 belongs to that column, which is very efficient for range proofs, and in zkEVM, lookup tables are very useful for proving memory read and write.

Application

To summarize, Plonkish supports custom gates, equality checks, and lookup tables, which can flexibly meet the needs of different circuits. A simple comparison with STARK shows that in STARK, each row is a constraint, and constraints need to represent state transitions between rows, but the flexibility of custom constraints in Plonkish is clearly higher.

Application

The current question is how we choose the frontend in zkEVM. There are four main challenges for zkEVM. The first challenge is that the EVM field is 256 bits, which means efficient range constraints on variables are needed; the second challenge is that there are many ZK-unfriendly opcodes in EVM, so a large-scale constraint is needed to prove these opcodes, such as Keccak-256; the third challenge is the memory read and write issue, where you need effective mappings to prove that what you read is consistent with what was previously written; the fourth challenge is that the execution trace of EVM is dynamically changing, so we need custom gates to adapt to different execution traces. Considering the above, we chose Plonkish.

Application

Next, we look at the complete process of zkEVM. Based on the initial global state tree, when a new transaction comes in, the EVM will read the storage and the bytecode of the called contracts, generating the corresponding execution trace according to the transaction, such as PUSH, PUSH, STORE, CALLVALUE, and then gradually execute to update the global state, obtaining the global state tree after the transaction. The zkEVM takes the initial global state tree, the transaction itself, and the global state tree after the transaction as inputs to prove the correctness of the execution trace according to the EVM specifications.

Application

Delving into the details of the EVM circuit, each step of the execution trace has corresponding circuit constraints. Specifically, each step's circuit constraints include Step Context, Case Switch, and Opcode Specific Witness. Step Context includes the codehash corresponding to the execution trace, remaining gas, and counter; Case Switch includes all opcodes, all error cases, and the corresponding operations for that step; Opcode Specific Witness includes the additional witness required for the opcode, such as operands.

Application

For a simple addition example, it needs to ensure that the control variable sADD for the addition opcode is set to 1, while all other opcode control variables are zero. In Step Context, by setting gas' - gas - 3 = 0, it constrains the consumed gas to equal 3, similarly constraining the counter, and the stack pointer increments by 1 after that step; in Case Switch, it constrains that this step is an addition operation by ensuring the sum of opcode control variables equals 1; in Opcode Specific Witness, it constrains the actual addition of the operands.

Application

Additionally, extra circuit constraints are needed to ensure the correctness of operands read from memory. Here, we first need to construct a lookup table to prove that the operands belong to memory. And verify the correctness of the memory table through the memory circuit (RAM Circuit).

Application

The same method can be applied to ZK-unfriendly hash functions, constructing a lookup table for the hash function, mapping the hash inputs and outputs in the execution trace to the lookup table, and using an additional hash circuit (Hash Circuit) to verify the correctness of the hash lookup table.

Application

Now let's look at the circuit architecture of zkEVM. The core EVM circuit is used to constrain the correctness of each step of the execution trace. In some places where the constraints of the EVM circuit are more challenging, we use lookup tables to map, including Fixed Table, Keccak Table, RAM Table, Bytecode, Transaction, Block Context, and then use separate circuits to constrain these lookup tables, for example, the Keccak circuit is used to constrain the Keccak table.

Application

To summarize, the complete workflow of zkEVM is shown in the figure below.

Application

Proof System

Because directly verifying the above EVM circuits, memory circuits, storage circuits, etc., on L1 incurs huge overhead, Scroll's proof system adopts a two-layer architecture.

The first layer is responsible for directly proving the EVM itself, requiring a lot of computation to generate proofs. Therefore, the first layer proof system requires support for custom gates and lookup tables, is friendly to hardware acceleration, can generate computations in parallel under low peak memory, and has a small verification circuit size for quick verification. Promising optional solutions include Plonky2, Starky, eSTARK, which basically use Plonk in the frontend but may use FRI in the backend, and all meet the four characteristics mentioned above. Another class of optional solutions includes Halo2 developed by Zcash and the KZG version of Halo2.

There are also some new proof systems that are very promising, such as HyperPlonk, which recently removed FFT, and the NOVA proof system, which can achieve smaller recursive proofs. However, they currently only support R1CS in research, and if they can support Plonkish in the future and be applied in practice, they will be very practical and efficient.

Application

The second layer proof system is used to prove the correctness of the first layer proof, needing to be efficiently verifiable in EVM. Ideally, it should also be friendly to hardware acceleration and support transparent or universal setup. Promising optional solutions include Groth16 and Plonkish proof systems with fewer columns. Groth16 remains a representative of extremely high proof efficiency in current research, while Plonkish proof systems can also achieve high proof efficiency with fewer columns.

Application

At Scroll, we use the Halo2-KZG proof system in both layers of the proof system. Because Halo2-KZG can support custom gates and lookup tables, performs well under GPU hardware acceleration, and has a small verification circuit size for quick verification. The difference is that we use Poseidon hash in the first layer proof system to further improve proof efficiency, while the second layer proof system still uses Keccak hash for direct verification on Ethereum. Scroll is also exploring the possibility of a multi-layer proof system to further aggregate the aggregated proofs generated by the second layer proof system.

Application

Under the current implementation, Scroll's first layer proof system EVM circuit has 116 columns, 2496 custom gates, 50 lookup tables, a maximum degree of 9, and requires 2^18 rows for 1M Gas; while the second layer proof system's aggregated circuit has only 23 columns, 1 custom gate, 7 lookup tables, a maximum degree of 5, requiring 2^25 rows to aggregate the EVM circuit, memory circuit, and storage circuit.

Application

Scroll has also conducted extensive research and optimization work on GPU hardware acceleration. For the EVM circuit, the optimized GPU prover only takes 30 seconds, improving efficiency by 9 times compared to the CPU prover; for the aggregated circuit, the optimized GPU prover only takes 149 seconds, improving efficiency by 15 times compared to the CPU. Under the current optimization conditions, the first layer proof system for 1M Gas takes about 2 minutes, while the second layer proof system takes about 3 minutes.

Application

Interesting Research Questions

Application

In the third part, Ye Zhang discussed some interesting research questions Scroll encountered while building zkEVM, from the frontend arithmetic circuits to the implementation of the prover.

Circuit

The first interesting research question in circuits is randomness. Since the EVM field is 256 bits, we need to split it into 32 8-bit fields to perform range proofs more efficiently. We then use the Random Linear Combination (RLC) method to encode the 32 fields into one using random numbers, allowing us to verify the original 256-bit field by verifying just this field. However, the problem is that the generation of random numbers needs to occur after the field splitting to ensure they are not tampered with.

Thus, Scroll and the PSE team proposed a multi-stage prover scheme to ensure that after field splitting, random numbers are used to generate RLC, which is encapsulated in the Challenge API. RLC has many applications in zkEVM, as it can compress the EVM field into one field, encrypt variable-length inputs, or optimize the layout of lookup tables, but there are still many open questions that need to be resolved.

Application

The second interesting research question regarding circuits is circuit layout. Scroll chose Plonkish for the frontend because the execution trace of EVM is dynamically changing, requiring support for different constraints and varying equivalence checks, while the standardized gates of R1CS require a larger circuit size to implement.

However, Scroll currently uses over 2000 custom gates to meet the dynamically changing execution trace and is exploring how to further optimize circuit layout, including splitting opcodes into Micro Opcodes or reusing cells within the same table.

Application

The third interesting research question regarding circuits is dynamic scaling. Because different opcodes have different circuit sizes, to meet the dynamically changing execution trace, each opcode at each step needs to satisfy the maximum circuit size, such as Keccak hash, which incurs additional overhead. If we could make zkEVM dynamically adapt to the changing execution trace, it would save unnecessary overhead.

Application

Prover

Regarding the prover, Scroll has already optimized MSM and NTT extensively for GPU acceleration, but the current bottleneck has shifted to witness generation and data copying. Assuming that MSM and NTT occupy 80% of the proof time, even if hardware acceleration can improve the efficiency of this part by several orders of magnitude, the original 20% of proof time for witness generation and data copying will become the new bottleneck. Another issue with the prover is the need for a large amount of memory, so cheaper and more decentralized hardware solutions need to be explored.

Application

At the same time, Scroll is also exploring hardware acceleration and proof algorithms to improve prover efficiency. Currently, there are two main directions: either switching to smaller fields, such as using a 64-bit Goldilocks field or a 32-bit Mersenne Prime, or sticking to new proof systems based on elliptic curves (EC), such as SuperNova. Of course, there are other possible paths, and friends with ideas are welcome to contact Scroll directly.

Application

Security

When building zkEVM, security is crucial. The zkEVM jointly built by PSE and Scroll has about 34,000 lines of code. From a software engineering perspective, it is impossible for such a complex codebase to be free of vulnerabilities for a long time. Scroll is currently undergoing extensive audits, including by the top auditing firms in the industry, to review the zkEVM codebase.

Application

Other Applications Using zkEVM

Application

The fourth part discusses other applications that use zkEVM.

In the architecture of zkRollup, we verify the validity of n transactions on L2 through smart contracts on L1.

Application

If we directly verify L1 blocks, then L1 nodes do not need to re-execute transactions; they only need to verify the validity of each block proof. This architectural solution is called Enshrine Blockchain. Currently, directly implementing this on Ethereum is very challenging because it requires verifying the entire Ethereum block, which includes verifying a large number of signatures, leading to longer proof times and lower security. However, some other public chains have already implemented recursive proofs using a single proof to verify the entire blockchain, such as Mina.

Application

Since zkEVM can prove state transitions, it can also be utilized by white hats to prove they know certain vulnerabilities in smart contracts and seek bounties from project parties.

Application

The last use case is to use zero-knowledge proofs to prove claims about historical data, serving as an oracle. Currently, Axiom is working on products in this area. At the recent ETHBeijing hackathon, the GasLockR team utilized this feature to prove historical gas expenditures.

Application

Finally, Scroll is building a general-purpose Ethereum scaling solution for zkRollup, using very advanced arithmetic circuits and proof systems, and constructing fast verifiers through hardware acceleration to prove recursion. The Alpha testnet has already been launched and has been running stably for a long time.

Of course, there are still some interesting questions to solve, including protocol design and mechanism design, zero-knowledge engineering, and practical efficiency. Everyone is welcome to join Scroll to build together!

Application