OP Stack + Zero-Knowledge Proofs = The Endgame of L2?

2023-10-09 21:29:41

For future Layer2 developers, the OP Stack will become a universal Layer2 architecture, allowing developers to flexibly choose between optimistic proofs or zero-knowledge proofs based on the security and effectiveness required by their applications when launching their own Layer2.

Written by: Bill Qian, Yongxin Song, Bonan Yuan, Cypher Capital

The Endgame of Layer2

Competition in the Layer2 space is exceptionally fierce today, with optimistic rollups such as Arbitrum, Optimism, and Base leading the charge, while zero-knowledge proof (ZKP) rollups such as Scroll, zkSync, Starknet, Polygon zkEVM, Taiko, and Linea follow closely behind. Although the Layer2 field looks diverse, all of these projects rest on the same engineering principle: off-chain computation with on-chain proof. Whether a rollup uses optimistic fraud proofs or ZKP circuits, the core difference in engineering practice lies in the method of on-chain proof; the other principles are quite similar. Optimism has therefore chosen a distinctive path, modular Layer2, which logically decouples the components of a Layer2. Once the OP Stack has decoupled the Layer2 modules, a seemingly outlandish yet logical idea emerges: ZKP on OP Stack, that is, changing the proof behind the OP Stack's challenger from an optimistic proof to a zero-knowledge proof, which would make the OP Stack a universal Layer2 architecture that supports multiple proof systems.

ZKP on OP Stack

The discussion begins with an RFP from Optimism, https://github.com/ethereum-optimism/ecosystem-contributions/issues/61.

Now, let me introduce the questions and development directions proposed in this RFP.

Intent: To implement a zero-knowledge proof that can prove Optimism's fault proof program (on an instruction set supported by the Golang compiler).

How to achieve it: Implementing zero-knowledge proofs (ZKP) for OP chains is a prerequisite for secure, low-latency communication between L2 and L1. A zero-knowledge proof for a well-supported instruction set architecture (ISA) can prove Optimism's fault proof program and, by extension, any OP Stack blockchain. Beyond proving a standard execution trace of the ISA, the fault proof program introduces some additional requirements.

Specifically, the fault proof program introduces the concept of a "pre-image oracle," which uses special system calls to load external data into the program. Each fault proof virtual machine is responsible for implementing a mechanism through which the hash of certain data is placed at a specific location in memory, executing system calls, and then loading the pre-image of that hash into memory for the program's use. The pre-image oracle is also used to bootstrap the program with initial inputs. For more details, please refer to the "pre-image oracle" section in the fault proof program documentation.
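The mechanism described above can be illustrated with a minimal Python sketch. It is not Optimism's implementation: the class name `PreimageOracle`, its methods, and the use of SHA-256 as the hash are assumptions made for illustration, and the real system's key format and system-call interface are not modeled.

```python
import hashlib


class PreimageOracle:
    """Toy pre-image oracle (hypothetical): maps a hash to the data behind it."""

    def __init__(self):
        self._store = {}

    def put(self, data: bytes) -> bytes:
        """Host side: register data and return its hash, which serves as the key."""
        key = hashlib.sha256(data).digest()
        self._store[key] = data
        return key

    def get(self, key: bytes) -> bytes:
        """Program side: load the pre-image for a hash and check it before use."""
        data = self._store[key]
        # The program does not need to trust the host: it re-hashes what it received.
        if hashlib.sha256(data).digest() != key:
            raise ValueError("pre-image does not match the requested hash")
        return data


oracle = PreimageOracle()
key = oracle.put(b"l1 block header bytes")  # placeholder payload
loaded = oracle.get(key)
```

Because the program only ever asks for data by hash and re-checks the hash on receipt, the host that serves the data cannot substitute different inputs without being detected.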

In short, this proposal aims to leverage the highly modular design of the OP Stack to replace the original optimistic fault proof method with one based on zero-knowledge proofs. More specifically, OP currently compiles Geth to MIPS as Mini Geth, so the zero-knowledge proof for the OP Stack can be understood as ZKMIPS: a zero-knowledge proof built on the MIPS virtual machine.

Why ZKP?

Currently, OP Stack is running quite well, with optimistic proof-based Optimism and Arbitrum receiving excellent community and developer support. Why does OP Stack still want to explore zero-knowledge proofs? I believe there are several reasons:

OP Stack abstracts the modules of layer2 at a high level; introducing ZKP merely adds a different method of fault proof and does not mean OP will abandon optimistic proof. Developers using OP Stack can freely choose different proof methods.

Optimism and Arbitrum, based on optimistic proof, still do not support fault proof; essentially, OP and ARB are two unverifiable single-chain systems.

The seven-day finality window of optimistic proofs is simply too slow. Once ZKP Layer2s gain market share, ZKP proof times, which can be as short as 30 minutes, will be a significant advantage, and end users will choose the more secure ZKP Layer2.

Therefore, it is only a matter of time before Optimism supports ZKP. A bold guess is that in the future, OP Stack will support two fault proof systems: optimistic proof and zero-knowledge proof. OP Stack is a universal Layer2 architecture that continues to iterate. To help everyone understand why OP Stack can achieve the switching of different proof systems, the following sections will break down OP Stack in detail.

Core Modules of OP Stack

(This image is taken from Optimism's GitHub)

For OP Stack, the important modules are as follows:

op-node

op-geth

op-batcher

op-proposer

op-program

Cannon

op-challenger

These modules are all independent programs that communicate through standard HTTP interfaces, meaning that if developers want to modify certain features in OP Stack, they only need to modify specific modules to customize their Layer2. The following sections will introduce each OP Stack module and the overall architecture of OP Stack in detail.

op-node

op-node is the most important module in OP Stack. On one hand, as a Sequencer, op-node contains the consensus client implementation of the blockchain, comparable to Lighthouse and Prysm in Ethereum, which sorts transactions submitted by users; on the other hand, as a rollup driver, op-node is responsible for deriving Layer2 chains from L1 block data.

After collecting and sorting user transactions, the Sequencer generates a batch through op-batcher. Before the batch is submitted to L1, to reduce latency in the Rollup network, the Sequencer can generate Layer2 blocks in advance and propagate them through P2P in the Rollup network. Blocks generated directly in L2 are considered unsafe and need to wait for the batch to be submitted to L1 for derivation to be considered safe; however, under normal circumstances (without block reorganization, fraud, etc.), blocks generated directly in L2 are the same as those derived from L1. Exchanges like Binance only wait for a certain number of Layer2 blocks before considering transactions confirmed, without waiting for the batch to be submitted to L1, indicating that the probability of error is extremely low.

The process of deriving L2 blocks is handled by the driver, which continuously tracks the synchronization process between the L1 head block and the L2 chain, retrieves deposit transactions, L2 transaction data, and corresponding receipts from L1, and generates payload attributes, passing them to the execution engine to compute L2 blocks. L2 blocks completely depend on the blocks of the L1 chain; whenever an L1 block containing L2 batches is generated, the L2 chain extends. Additionally, when L1 blocks undergo reorganization, L2 blocks will also be reorganized.

op-geth

op-geth is the execution client implementation of OP Stack, with slight modifications made to go-ethereum to meet the needs of OP Stack. The consensus client op-node drives the execution client op-geth through the Engine API, allowing op-geth to compute output information and generate L2 blocks using payload attributes.

op-batcher

The batcher is the batch submitter, primarily consisting of two tasks. One is to compress L2 sequencer data into batches; the other is to submit batches to L1 so that the verifier can use the data for verification.

The batcher submits batcher transactions to the DA layer, which contain one or more channel frames. A channel consists of a series of sequencer batches to achieve higher compression rates; specifically, the batcher currently uses zlib for data compression. Since the size of a channel may exceed the limit that a batcher transaction can accommodate, the channel is divided into one or more channel frames, and a batcher transaction can include one or more channel frames (which can come from different channels).

source: Optimism

This design provides the batcher with high flexibility, and in the future, OP Stack will support the batcher to use multiple signers to submit multiple channels in parallel.
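The relationship between batches, channels, and frames can be sketched in a few lines of Python. This is only an illustration: the length-prefixed layout, the function names, and the tiny `MAX_FRAME_BYTES` limit are assumptions, not the OP Stack wire format; the one detail taken from the text is that the channel is compressed with zlib.

```python
import zlib

MAX_FRAME_BYTES = 64  # hypothetical per-frame limit, kept tiny for illustration


def build_channel(batches: list[bytes]) -> bytes:
    """Concatenate length-prefixed sequencer batches and compress them with zlib."""
    raw = b"".join(len(b).to_bytes(4, "big") + b for b in batches)
    return zlib.compress(raw)


def split_into_frames(channel: bytes, max_size: int = MAX_FRAME_BYTES) -> list[bytes]:
    """Cut a channel into frames that each fit within the size limit."""
    return [channel[i:i + max_size] for i in range(0, len(channel), max_size)]


def read_channel(frames: list[bytes]) -> list[bytes]:
    """Verifier side: reassemble the frames, decompress, and recover the batches."""
    raw = zlib.decompress(b"".join(frames))
    batches, pos = [], 0
    while pos < len(raw):
        size = int.from_bytes(raw[pos:pos + 4], "big")
        pos += 4
        batches.append(raw[pos:pos + size])
        pos += size
    return batches


batches = [f"batch-{i}-".encode() * 20 for i in range(5)]
frames = split_into_frames(build_channel(batches))
restored = read_channel(frames)
```

Compressing several batches together as one channel gives the compressor more redundancy to work with, which is why the channel, rather than the individual batch, is the unit of compression.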

op-proposer

op-proposer is responsible for submitting the new state commitment (currently in the form of Output Merkle Root) generated after op-geth executes the L2 block to L1. The Output Root does not take effect immediately but must wait for the dispute period to pass before it can be considered finalized.

The above are the parts of OP Stack that have been implemented; the following content related to Fault Proof modules has not yet been completed and is discussed based on the documentation specifications.

The OP fault proof system consists of three components:

Program: Given the commitment and dispute of Rollup Inputs (L1 Batch tx Data), it verifies the dispute statelessly (reproducing the same computation process using Inputs provided by PreImageOracle).

VM: Given a stateless Program and Inputs, it tracks any instruction (thus is stateful) and proves it on L1.

Interactive Dispute Game: It binary searches the dispute down to a single instruction, using the VM to resolve this base case.

op-program

op-program is a reference implementation of the Program, developed on the basis of op-node and op-geth, serving as a stateless middleware to verify claims about L2 state transitions.

To verify claims of L2 state, the program first applies L1 data to the finalized L2 state, reconstructing the latest L2 state. This process is similar to the work of op-node. The difference is that op-node retrieves data from RPC and applies state changes to disk, while the Program retrieves data from the Pre Image Oracle and applies state changes to memory. The Program streams data from the Oracle and performs state changes until it reaches EOF or an early termination condition. After reconstructing the L2 state, it returns the verification result based on whether the state matches the claim.
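As a rough sketch of this stateless flow, the Python below replays inputs from a stream onto a starting state in memory and compares the result with a claim. The state transition `apply_batch` is a hypothetical stand-in (a running SHA-256 hash), not the real L2 execution, and the early termination condition mentioned above is omitted.

```python
import hashlib
from typing import Iterable


def apply_batch(state: bytes, batch: bytes) -> bytes:
    """Hypothetical state transition: fold the batch into a running hash."""
    return hashlib.sha256(state + batch).digest()


def verify_claim(start_state: bytes, oracle_stream: Iterable[bytes], claimed_state: bytes) -> bool:
    """Replay every input in memory, then compare the result with the claim."""
    state = start_state
    for batch in oracle_stream:  # loop ends when the stream is exhausted (EOF)
        state = apply_batch(state, batch)
    return state == claimed_state


genesis = b"\x00" * 32
inputs = [b"batch-1", b"batch-2", b"batch-3"]

# An honest proposer derives its claim by running the same computation.
honest_claim = genesis
for item in inputs:
    honest_claim = apply_batch(honest_claim, item)

honest_ok = verify_claim(genesis, iter(inputs), honest_claim)
fraud_ok = verify_claim(genesis, iter(inputs), b"\xff" * 32)
```

Nothing is read from or written to disk: the function depends only on its arguments, which is what makes the same computation reproducible inside a fault proof VM.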

Cannon

Cannon is an implementation of the VM, containing two main components:

Onchain MIPS.sol: EVM implementation to verify the execution of a single MIPS instruction.

Offchain mipsevm: Go implementation to produce a proof for any MIPS instruction to verify on-chain.

The on-chain part, MIPS.sol, implements a big-endian 32-bit MIPS instruction set, simulating a minimal subset of the Linux kernel to support Go programs, but does not include concurrency-related system calls.

The off-chain part, mipsevm, simulates the execution of MIPS.sol in Go:

It's Go code

…that runs an EVM

…emulating a MIPS machine

…running compiled Go code

…that runs an EVM

In short, on-chain Cannon uses a MIPS machine emulated in the EVM to run Mini Geth, the MIPS build of Geth (the Go implementation of Ethereum).

op-challenger

op-challenger is responsible for handling processes related to the dispute game.

A challenge first selects a state root after executing a transaction to issue a dispute, after which the transaction is decomposed into multiple instructions, each producing a new state, forming a state sequence S1, S2, … Sn.

To improve efficiency, both parties in the challenge need to take turns executing steps, divided into Attack and Defend categories.

Attack: The previous state in dispute serves as input, expecting the disputed state as output. The DAG needs to have a commitment to the previous state.

Defend: The disputed state serves as input, and the state after the dispute serves as output. The DAG needs to have a commitment to the subsequent state.

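For example, suppose instructions 1-9999 produce the state sequence S1-S10000. First, check the 5000th state. If the two parties disagree on it, the first faulty instruction lies at or before that point, so an attack step is performed and the binary search continues to the left; if they agree on it, the fault lies after that point, so a defend step is performed and the search continues to the right.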

Ultimately, the dispute is narrowed down to the states before and after a single instruction, which is then handed over to the VM for state verification of that single instruction.
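The narrowing step can be sketched as an ordinary binary search over two execution traces. This is a simplified model under stated assumptions: both traces are fully available as Python lists, the function name `find_disputed_step` is hypothetical, and the on-chain game's turn-taking, bonds, and clocks are not modeled.

```python
def find_disputed_step(honest: list, claimed: list) -> tuple[int, int]:
    """Return (i, rounds): the traces agree at state i and differ at state i + 1."""
    lo, hi = 0, len(honest) - 1
    if honest[lo] != claimed[lo] or honest[hi] == claimed[hi]:
        raise ValueError("traces must agree at the start and differ at the end")
    rounds = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if honest[mid] == claimed[mid]:
            lo = mid  # still in agreement here: the fault lies to the right
        else:
            hi = mid  # already diverged here: the fault lies to the left
        rounds += 1
    return lo, rounds


# 10,000 states; the claimed trace goes wrong right after state index 7000.
honest = list(range(10_000))
claimed = honest[:7001] + [s + 1 for s in honest[7001:]]
step, rounds = find_disputed_step(honest, claimed)
```

Each round halves the disputed range, so even a trace with billions of instructions is reduced to a single instruction in a few dozen rounds, and only that one instruction has to be executed on L1.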

Workflow of the Modules

Normal Process (excluding Challenge)

source: Cypher Capital

Users submit transactions, either through L2 RPC on L2 or directly on L1 (bypassing op-batcher, which has stronger censorship resistance and can serve as an emergency escape mechanism).

The RPC server started by op-node receives the transaction, sorts it, and sends it to op-batcher and op-geth.

op-batcher compresses the sequenced transactions into a batch and submits it to the DA layer (L1).

op-geth executes the sequenced transactions and passes the new state to op-proposer.

op-proposer sends the L2 output root as a commitment to the L2 state to L1 for storage, and once the challenge period ends, the state is considered finalized.

The driver in op-node retrieves transaction data and other information from L1, deriving the canonical L2 block. L2 blocks derived from batch tx in finalized L1 blocks are considered finalized, while L2 blocks derived from batch tx in L1 blocks that are confirmed but not finalized are considered safe. To reduce latency, L2 blocks generated directly in L2 can be propagated in advance through P2P and are considered unsafe.
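The three labels can be expressed as a small classification function. The sketch below is a simplification under stated assumptions: the `L2Block` type, its fields, and the function name `safety_label` are hypothetical and do not come from op-node.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class L2Block:
    number: int
    # L1 block number that contains this block's batch, or None if the block
    # has only been propagated over P2P so far.
    l1_inclusion_block: Optional[int]


def safety_label(block: L2Block, l1_finalized_head: int) -> str:
    """Classify an L2 block by how firmly its data is anchored on L1."""
    if block.l1_inclusion_block is None:
        return "unsafe"  # produced by the sequencer, batch not yet on L1
    if block.l1_inclusion_block <= l1_finalized_head:
        return "finalized"  # batch sits in a finalized L1 block
    return "safe"  # batch is on L1, but that L1 block is not finalized yet


l1_finalized_head = 100
labels = [
    safety_label(L2Block(1, l1_inclusion_block=90), l1_finalized_head),
    safety_label(L2Block(2, l1_inclusion_block=105), l1_finalized_head),
    safety_label(L2Block(3, l1_inclusion_block=None), l1_finalized_head),
]
```

The same L2 block moves from unsafe to safe to finalized over time as its batch lands on L1 and the containing L1 block is finalized.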

Challenge Process

source: Optimism

Users initiate an interactive dispute game.

Cannon (VM) runs op-program (written in Go) on the MIPS virtual machine to track the state changes at each step of execution.

op-program reproduces the computation process of the L2 state using the commitment of Rollup Inputs provided by PreImageOracle, records the execution trace, and verifies the dispute statelessly.

op-challenger uses binary search to narrow the dispute down to a single instruction.

Cannon generates proofs for the state changes before and after executing that instruction, which are verified on the smart contract MIPS.sol on L1.

OP Stack + ZKP

Based on the above process introduction, we can easily find that the coupling degree of the Challenge module with other modules is very low, and its impact on the basic transaction process is minimal, only requiring the intervention of the Challenge module in the case of fraudulent behavior (which has not occurred since the OP Mainnet went live in December 2021).

To shorten Optimism's current seven-day exit confirmation time and provide more modular options for OP Stack, Optimism is actively embracing ZKP technology, hoping to bring to OP Stack a ZKP that can prove Optimism's fault proof program and supports a well-known ISA. The proposals from the O(1) Labs and RISC Zero teams have been accepted under this Foundation Mission (RFP).

O(1) Labs Proposal

source: O(1) Labs

As the development team of Mina Protocol, O(1) Labs plans to use Kimchi, which is adopted by Mina Protocol, as the proof system for the MIPS VM, with only minor modifications.

Kimchi is a Halo2-like PLONKish system currently configured with an inner-product-argument style polynomial commitment scheme. It supports verifiable computation using traditional Turing-machine-based instruction sets.

The backend of Kimchi is interchangeable; the current implementation is defined on Pasta curves using an inner-product-argument-based polynomial commitment scheme (Pasta-IPA), which is incompatible with the cryptographic system used in EVM, resulting in high verification costs on EVM. Therefore, O(1) Labs plans to change Pasta-IPA to a KZG commitment scheme using the bn128 curves (bn128-KZG), which can utilize existing EVM precompiles for higher efficiency.

The input that originally went to the fault proof MIPS system is now fed into the bn128-KZG Kimchi system, which produces a ZK proof of the execution path. The pre-image system call continues to use OP Stack's Cannon, and the final proof is sent to the smart contract on L1, which updates the state upon successful verification.

RISC Zero Proposal

The RISC Zero team plans to continue using the currently implemented Groth16 backend based on RISC-V ISA zkVM (augmented with accelerated co-processors for common cryptographic tasks including hashing and ECDSA signature verification) and modify the Ethereum ZK Client based on Reth to further adapt to Optimism, implementing L1-L2 derivation logic in zkVM to prove that the transaction sequence is generated by the Optimism sequencer.

The ZK Client consists of two parts, the zkVM guest program and the host library, analogous to op-program and Cannon in the O(1) Labs proposal. The zkVM guest program computes state transitions, while the host library retrieves the data needed for the computation, coordinates the execution of the zkVM guest program, and generates the ZK proof of the state transition produced by executing the transactions.

| | O(1) Labs | RISC Zero |
|---|---|---|
| Performance | 44B MIPS instruction-steps for an Optimism block | <2B zkVM cycles for an Optimism block |
| Latency | 2 days for an Optimism block without parallelization | 10-20 min for an Optimism block with parallelization |
| Complexity | Kimchi: 35 kloc; change: 5-10 kloc | ZKP framework: 10 kloc of Rust; zkVM: 54 kloc of Rust |
| Robustness | Mina has been secured by Kimchi for 2 years. | Extensive automated testing |
| Security | The kzg-bn128-based, EVM-friendly SNARK system requires a trusted setup. | The zkVM emits STARK-based proofs that require no trusted setup. On-chain verification is based on STARK->SNARK conversion. |
| OP Stack Compatibility | No fundamental change | No change |

Exploring the Possibility of ZKP in OP Stack

Currently, a team called ZKM has implemented a ZKMIPS-based EVM, which translates the EVM into the MIPS instruction set and proves its execution with zero-knowledge proofs. The feedback so far is that it is very slow but usable. https://ethresear.ch/t/zkmips-a-zero-knowledge-zk-vm-based-on-mips-architecture/16156

Considering that Mina and RISC Zero both have relatively mature development experience, we have reason to believe that OP Stack support for ZKP is just a matter of time. At the same time, because ZKP development for OP Stack started relatively late and is not natively supported, its future performance remains unpredictable.

OP Stack, A Universal Architecture for Layer2

OP Stack has been adopted by many well-known teams thanks to its excellent code implementation, permissive open-source license, and modular architecture. The main criticism it faces is that the time to finality of the optimistic rollup technology it employs is too long, and that it is less technologically advanced than ZK rollups. Now, with the help of specialized third-party teams, OP Stack has begun moving toward a ZKP future. Given that OP Stack currently does not support fault proofs, it may skip the fault proof stage and go straight to ZK proofs to achieve faster finality and higher security.

For future Layer2 developers, OP Stack will become a universal Layer2 architecture, allowing developers to flexibly choose between optimistic proof or zero-knowledge proof based on the security and effectiveness required by their applications. It can be anticipated that Layer2 based on optimistic proof will be cheaper, while Layer2 based on zero-knowledge proof will be more secure.

reference: https://blog.oplabs.co/building-a-fault-proof-system/