Ethereum Researcher: Native rollups — superpowers from L1 execution

01-23

Author: Justin Drake, Ethereum Researcher, ethresearch; Compiled by Tao Zhu, Jinse Finance

The credit for this article goes to the broader Ethereum R&D community. Key contributions date back to 2017, with major incremental unlocks over the years. The recent zkVM engineering breakthroughs have sparked a thorough design space exploration. This article is just our best attempt at piecing together a coherent design for a potential grand unification.

Abstract

We propose an elegant and powerful EXECUTE precompile that exposes the native L1 EVM execution engine to the application layer. A native execution rollup, or "native rollup" for short, is a rollup that uses EXECUTE to verify the EVM state transitions of batches of user transactions. Native rollups can be viewed as "programmable execution shards" that wrap the precompile in a derivation function to handle extra-EVM system logic such as sequencing, bridging, forced inclusion, and governance.

Because the EXECUTE precompile is enforced directly by validators, it enjoys (zk)EL client diversity and provides EVM equivalence that is bug-free by construction and forward-compatible with EVM upgrades via L1 hard forks. A form of EVM introspection such as the EXECUTE precompile is necessary for EVM-equivalent rollups that want to fully inherit Ethereum's security. We refer to rollups that fully inherit Ethereum's security as "trustless rollups".

The EXECUTE precompile greatly simplifies the development of EVM-equivalent rollups, because it removes the need for complex infrastructure (e.g. fraud proof games, SNARK circuits, security councils) to emulate and maintain the EVM. With EXECUTE, a minimal native and based rollup can be deployed in a few lines of Solidity code, using a simple derivation function that needs no special handling for sequencing, forced inclusion, or governance.

Most importantly, native rollups can enjoy real-time settlement without worrying about real-time proving, which greatly simplifies synchronous composability.

The article is split into two parts: the first introduces the proposed precompile, and the second discusses native rollups.

Part 1 — The EXECUTE Precompile

Structure

The EXECUTE precompile takes inputs pre_state_root, post_state_root, trace and gas_used. It returns true if and only if:

trace is a well-formed execution trace (e.g. L2 transaction list and corresponding state access proofs)
the stateless execution of trace starts from pre_state_root and ends at post_state_root
the stateless execution of trace consumes exactly gas_used gas
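
The three conditions can be illustrated with a toy model. This is only a sketch under loud assumptions, not the proposed precompile: the state is a flat dict of balances, the "state root" is a SHA-256 hash of the serialized state, the trace carries the whole pre-state as a stand-in for real state access proofs, and every transfer costs a made-up flat GAS_PER_TRANSFER. All names other than the four inputs are hypothetical.

```python
import hashlib
import json

GAS_PER_TRANSFER = 21_000  # assumption: flat toy cost per transfer


def state_root(state: dict) -> str:
    """Toy commitment: hash of the canonically serialized state."""
    encoded = json.dumps(state, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def execute(pre_state_root: str, post_state_root: str, trace: dict, gas_used: int) -> bool:
    """Toy check of the three conditions: well-formed trace, matching roots, exact gas."""
    # Condition 1: the trace must be well-formed (transactions plus touched state).
    if not isinstance(trace, dict):
        return False
    witness, txs = trace.get("witness"), trace.get("txs")
    if not isinstance(witness, dict) or not isinstance(txs, list):
        return False
    # The witness stands in for state access proofs: it must match the pre root.
    if state_root(witness) != pre_state_root:
        return False

    state, gas = dict(witness), 0
    for tx in txs:
        try:
            sender, recipient, amount = tx["from"], tx["to"], tx["amount"]
        except (TypeError, KeyError):
            return False
        if not isinstance(amount, int) or amount < 0:
            return False
        if state.get(sender, 0) < amount:
            return False
        state[sender] = state.get(sender, 0) - amount
        state[recipient] = state.get(recipient, 0) + amount
        gas += GAS_PER_TRANSFER

    # Conditions 2 and 3: the end root and the exact gas must both match.
    return state_root(state) == post_state_root and gas == gas_used
```

Because the check needs nothing beyond its inputs, any party holding the trace can re-run it without keeping the L2 state.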

There is an EIP-1559-like mechanism for metering and pricing the cumulative gas consumed by all EXECUTE calls in an L1 block. Specifically, there is a cumulative gas limit, EXECUTE_CUMULATIVE_GAS_LIMIT, and a cumulative gas target, EXECUTE_CUMULATIVE_GAS_TARGET. (Once the L1 EVM can be statelessly enforced by validators, the cumulative limit and target could be merged with the L1 EIP-1559 mechanism.)

Calling the precompile costs a fixed amount of L1 gas, EXECUTE_GAS_COST, plus gas_used * gas_price, where gas_price (in ETH/gas) is set by the EIP-1559-like mechanism. The full amount is charged upfront, even if the precompile returns false.
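
The pricing rule can be sketched as follows. The text only says the mechanism is "EIP-1559-like", so the update formula, the denominator of 8 (borrowed from EIP-1559), and every numeric constant below are assumptions for illustration. Pricing the fixed EXECUTE_GAS_COST at the L1 base fee is also a simplification.

```python
EXECUTE_GAS_COST = 100_000                  # assumption: fixed L1 gas per call
EXECUTE_CUMULATIVE_GAS_TARGET = 15_000_000  # assumption: illustrative value
EXECUTE_CUMULATIVE_GAS_LIMIT = 30_000_000   # assumption: illustrative value
PRICE_CHANGE_DENOMINATOR = 8                # assumption: borrowed from EIP-1559


def next_gas_price(gas_price: int, cumulative_gas_used: int) -> int:
    """EIP-1559-style update of the EXECUTE gas price (wei per gas)."""
    if not 0 <= cumulative_gas_used <= EXECUTE_CUMULATIVE_GAS_LIMIT:
        raise ValueError("cumulative EXECUTE gas outside [0, limit]")
    delta = cumulative_gas_used - EXECUTE_CUMULATIVE_GAS_TARGET
    change = gas_price * delta // EXECUTE_CUMULATIVE_GAS_TARGET // PRICE_CHANGE_DENOMINATOR
    return max(gas_price + change, 1)  # keep the price strictly positive


def call_cost_wei(gas_used: int, gas_price: int, l1_base_fee: int) -> int:
    """Upfront cost of one EXECUTE call, charged even if it returns false."""
    return EXECUTE_GAS_COST * l1_base_fee + gas_used * gas_price
```

With these assumed constants, a block whose EXECUTE calls consume exactly the target leaves the price unchanged, while a block at the limit raises it by one eighth.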

The trace must point to available Ethereum data from calldata, blobs, state or memory.

Re-execution

If EXECUTE_CUMULATIVE_GAS_LIMIT is small enough, validators can simply re-execute traces to enforce the correctness of EXECUTE calls. An initial deployment of the precompile based on re-execution can serve as a stepping stone, much as proto-danksharding, with its simple re-downloading of blobs, is a stepping stone to full danksharding. Note that simple re-execution imposes no state growth or bandwidth overhead on validators, and any execution overhead can be parallelized across CPU cores.

Validators must hold an explicit copy of the trace to re-execute it, which rules out pointers to blob data that is sampled via DAS rather than downloaded. Note that optimistic native rollups may still publish rollup data in blobs, falling back to calldata only in the fraud proof game. Also note that optimistic native rollups can have a gas limit far exceeding EXECUTE_CUMULATIVE_GAS_LIMIT, since the EXECUTE precompile only needs to be called once, on a small EVM segment, to settle a fraud proof challenge.
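
To make the "called once on a small EVM segment" point concrete, here is a minimal sketch of how a dispute could be narrowed to one segment. The article does not specify a fraud proof game; the bisection below is a common technique assumed for illustration. The function names, the list-of-roots layout, and the `execute` callable (a stand-in for the precompile) are all hypothetical.

```python
from typing import Callable, Sequence

ExecuteFn = Callable[[str, str, bytes, int], bool]  # stand-in for EXECUTE


def first_disputed_segment(asserted: Sequence[str], challenger: Sequence[str]) -> int:
    """Return i such that both parties agree on root i but disagree on root i + 1."""
    if len(asserted) != len(challenger) or len(asserted) < 2:
        raise ValueError("both parties must commit to the same number of roots")
    if asserted[0] != challenger[0] or asserted[-1] == challenger[-1]:
        raise ValueError("nothing to dispute")
    lo, hi = 0, len(asserted) - 1  # invariant: agree at lo, disagree at hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if asserted[mid] == challenger[mid]:
            lo = mid
        else:
            hi = mid
    return lo


def asserter_wins(asserted: Sequence[str], challenger: Sequence[str],
                  segment_traces: Sequence[bytes], segment_gas: Sequence[int],
                  execute: ExecuteFn) -> bool:
    """Settle the challenge with a single EXECUTE call on the disputed segment."""
    i = first_disputed_segment(asserted, challenger)
    return execute(asserted[i], asserted[i + 1], segment_traces[i], segment_gas[i])
```

Only the disputed segment's trace has to fit under EXECUTE_CUMULATIVE_GAS_LIMIT, which is why the rollup's overall gas limit can be much larger.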

For the historical record, in 2017 Vitalik proposed a similar "EVM inside EVM" precompile called EXECTX.

SNARK-based Execution

To unlock a larger EXECUTE_CUMULATIVE_GAS_LIMIT, it is natural to have validators optionally verify SNARK proofs. From here on, we assume one-slot delayed execution, where invalid blocks (or invalid transactions) are treated as no-ops. (For more on delayed execution, see this ethresearch post, this EIP and Francesco's design.) One-slot delayed execution leaves several seconds (an entire slot) for proving. It also avoids incentivizing MEV-driven proof races, which would introduce a centralization vector.

Note that even when EXECUTE is enforced by SNARKs, there are no explicit proof systems or circuits incorporated into consensus. (Note that the EXECUTE precompile does not take any explicit proofs as input.) Instead, each staking operator is free to choose their favorite zkEL verifier client, similar to how they subjectively choose EL clients today. The "Offchain Proofs" section below will explain the benefits of this design choice.

From here on, we assume sophisticated execution proposers in the context of Attester-Proposer Separation (APS) with alternating execution and consensus slots. To incentivize rational execution proposers to generate proofs in a timely manner (within 1 slot), we require attesters to attest to execution block n+1 only if the proofs for execution block n are available. (We suggest bundling block n+1 with the EXECUTE proofs for block n at the p2p layer.) An execution proposer that skips proving may miss its slot, forgoing fees and MEV. We further impose a fixed penalty for missed execution slots, set high enough (e.g. 1 ETH) to always exceed the cost of proving.

Note that in the APS context, consensus block production is not blocked by missed execution slots. However, timely proof generation matters for light clients, so that they can easily read state at the chain tip without stateless re-execution. To ensure timely proof generation for light clients, even in the exceptional case where the next execution proposer misses its slot, we rely on an altruistic-minority prover assumption: a single altruistic prover is sufficient to generate the proofs within 1 slot. To avoid unnecessary redundant proving, most altruistic provers can wait on standby and step in only when no proof arrives within 1 slot, acting as a failsafe with at most 2 slots of latency.
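
The incentive rules above can be written down as three small predicates. This is a sketch only: the dataclass, the field names, and the wei-denominated penalty constant are hypothetical, and the 1 ETH figure is the text's own example rather than a proposed parameter.

```python
from dataclasses import dataclass

MISSED_SLOT_PENALTY_WEI = 10**18  # the text's example penalty of 1 ETH


@dataclass(frozen=True)
class ExecutionBlock:
    number: int
    parent_proofs_available: bool  # hypothetical flag: proofs for block number - 1 were seen


def should_attest(block: ExecutionBlock) -> bool:
    """Attest to execution block n+1 only if the proofs for block n are available."""
    return block.parent_proofs_available


def skipping_is_irrational(proving_cost_wei: int, fees_and_mev_wei: int) -> bool:
    """True when the loss from a missed slot exceeds the cost of proving."""
    loss_if_slot_missed = fees_and_mev_wei + MISSED_SLOT_PENALTY_WEI
    return loss_if_slot_missed > proving_cost_wei


def standby_prover_should_start(slots_without_proof: int, proof_seen: bool) -> bool:
    """Altruistic standby provers step in only after a full slot without a proof."""
    return not proof_seen and slots_without_proof >= 1
```

The fixed penalty matters in the second predicate: even when a slot carries no fees or MEV, skipping the proof still costs more than producing it, as long as proving costs less than the penalty.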

Note that EXECUTE_CUMULATIVE_GAS_LIMIT needs to be set low enough for the altruistic-minority prover assumption to be credible (and so that execution proposers do not need unrealistically sophisticated provers). A conservative policy would be to set EXECUTE_CUMULATIVE_GAS_LIMIT so that single-slot proving is accessible to laptops (e.g. a high-end MacBook Pro). A more pragmatic and aggressive policy would be to target a small cluster of GPUs, and eventually SNARK ASIC provers once they are sufficiently commoditized.

Offchain Proofs

To reiterate, we propose that zkEL EXECUTE proofs not go on-chain and instead be shared off-chain. Not enshrining proofs is an idea first proposed by Vitalik, and it has several advantages:

Diversity: Validators can freely choose zkEL verifier clients (including proof systems and circuits) from teams they trust, similar to how they choose EL clients today. This provides robustness through diversity. zkEL verifier clients (and the underlying zkVMs of some clients) are complex cryptographic software, and a bug in any one client should not take down Ethereum.
Neutrality: A competitive market of zkEL verifier clients allows the consensus layer to avoid picking technology winners. For example, the zkVM market is highly competitive, and picking a winning vendor (such as Risc0, Succinct, or one of the many others) may not be perceived as neutral.
Simplicity: The consensus layer does not need to enshrine a specific SNARK verifier, which greatly simplifies its specification. It only needs to enshrine the format of state access proofs, not the implementation details of any specific proof verifier.
Flexibility: If a bug or an optimization is found, affected validators can update their clients without a hard fork.

Having offchain proofs does introduce some manageable complexities:

Proof Load and P2P Fragmentation: Due to the lack of a single standard proof, multiple proofs need to be generated (at least one per zkEL client). Each customization of a zkEL client (e.g., replacing one RISC-V zkVM with another) requires a different proof. Similarly, each upgrade of a zkEL version requires a different proof. This will lead to an increase in proof load. If each proof type has a separate gossip channel, it will further fragment the p2p network.
Minority zkELs: It is harder to incentivize proof generation for minority zkELs. Rational execution proposers may generate only enough proofs to reach a super-majority of attesters and avoid missing their slot. To counter this, staking operators could be socially encouraged to run multiple zkEL clients in parallel, similar to Vouch operators today. Running a k-of-n setup has the additional benefit of improving security, in particular by hedging against soundness bugs that let an attacker craft proofs for arbitrary EXECUTE calls (a situation that would be unusual for a traditional EL client).
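
A k-of-n setup can be sketched as a simple threshold over the verdicts of the zkEL clients an operator runs. The function name, the client names, and the three-valued verdict encoding are hypothetical. How an operator actually combines clients is not specified in the text.

```python
from typing import Dict, Optional


def accept_block(client_verdicts: Dict[str, Optional[bool]], k: int) -> bool:
    """Accept a block when at least k of the operator's n zkEL clients verified a proof.

    Each verdict is True (proof verified), False (proof rejected),
    or None (no proof received yet for that client).
    """
    n = len(client_verdicts)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    verified = sum(1 for verdict in client_verdicts.values() if verdict is True)
    return verified >= k
```

For example, with k = 2 and three clients, a forged proof that exploits a soundness bug in a single client is not enough for the operator to accept the block. The cost is that proposers must now supply proofs for at least k of those clients.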

Offchain proofs will also reduce the efficiency of real-time settlement L2s:

No Alt DA: Since the trace input to EXECUTE needs to be made available to L1 validators, real-time settled L2s (i.e., L2s that immediately update their canonical state root) must consume L1 DA, i.e., be rollups. Note that optimistic L2s that delay settlement via a fraud proof game do not have this constraint and can be validiums.
State Access Overhead: Since the trace must be statelessly executable, it must include the state trie leaves that were read or written, introducing a small DA overhead compared to typical L2 blocks. Note that optimistic L2s do not have this constraint, as state trie leaves are only needed in fraud proof challenges, where the challenger can recompute them.
No State Diffing: Since proving should be permissionless given the trace, rollup state diffing is not possible. However, stateless access proofs or EVM transaction signatures could be compressed if corresponding specialized proofs were enshrined in consensus.

RISC-V Native Execution

Given the de facto convergence towards RISC-V zkVMs today, there may be an opportunity to natively expose RISC-V state transitions to the EVM (similar to the Arbitrum Stylus environment) while maintaining SNARK-friendliness.

Part 2 - Native Rollups

Naming

We first discuss the naming of native rollups to clear up several potential points of confusion:

Alternative Names: Native rollups were previously referred to as "enshrined" rollups; see, for example, two earlier write-ups. (The term "canonical rollup" was also briefly used by Polynya.) The term "enshrined" was later abandoned in favor of "native" to indicate that existing EVM-equivalent rollups can choose to upgrade to become native.
Based Rollups: Based rollups and native rollups are orthogonal concepts: "based" is about L1 sequencing, while "native" is about L1 execution. A rollup that is both based and native has been whimsically called an "ultra sound rollup".
Execution Sharding: Execution shards (i.e., enshrined copies of the L1 EVM chain) are a different but related concept that predates native rollups by several years. (Execution sharding was previously "Phase 2" of the Ethereum 2.0 roadmap.) Unlike native rollups, execution shards are not programmable, i.e., they have no custom governance, custom sequencing, custom gas tokens, etc. Execution shards are also typically instantiated in a fixed number of instances (e.g., 64 or 1,024 shards). Unfortunately, Martin Köppelmann used the term "native L2" in his 2024 Devcon talk about execution sharding.

Benefits

Native rollups have several benefits, which we detail below:

Simplicity: Most of the complexity of a native rollup VM is encapsulated by the precompile. Today's EVM-equivalent optimistic and zk-rollups have thousands of lines of code for their fraud proof game or SNARK verifier, which could collapse to a single line of code. Native rollups also do not require auxiliary infrastructure such as proving networks, watchtowers, and security councils.
Security: Building a bug-free EVM fraud proof game or SNARK verifier is an extremely difficult engineering task, likely requiring deep formal verification. Today, every optimistic and zk EVM rollup most likely has critical vulnerabilities in its EVM state transition function. To defend against such vulnerabilities, centralized sequencing is often used as a crutch to gate the production of adversarially crafted blocks. The native execution precompile allows permissionless sequencing to be deployed safely. Trustless rollups that fully inherit L1 security also fully inherit L1 asset fungibility.
EVM Equivalence: Today, the only way for rollups to stay in sync with L1 EVM rules is to have governance (typically a security council and/or a governance token) mirror L1 EVM upgrades. (EVM updates still happen through hard forks roughly once per year.) Governance is not only an attack vector; it is, strictly speaking, a departure from the L1 EVM and prevents any rollup from achieving true long-term EVM equivalence. In contrast, native rollups can upgrade in lockstep with L1, without governance.
SNARK Gas Costs: On-chain SNARK verification is expensive, so many zk-rollups settle infrequently to minimize costs. Since SNARKs are not verified on-chain, the EXECUTE precompile could be used to lower the cost of verification. If the EXECUTE proofs for multiple calls in a block are batched using SNARK recursion, EXECUTE_GAS_COST could be set relatively low.
Synchronous Composability: Today, synchronous composability with L1 requires same-slot real-time proving. For zk-rollups, achieving ultra-low-latency proving (e.g., on the order of 100ms) is an especially challenging engineering task. With a one-slot delayed state root, the proving latency underpinning the native execution precompile can be relaxed to a full slot.
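
The simplicity claim can be illustrated with a toy model of a minimal native rollup. The abstract speaks of a few lines of Solidity; the Python class below is only a stand-in written under stated assumptions. The class name, the `advance` method, and the injected `execute` callable (representing the precompile) are hypothetical. Sequencing, bridging, and forced inclusion are deliberately left out.

```python
from typing import Callable

ExecuteFn = Callable[[str, str, bytes, int], bool]  # stand-in for the EXECUTE precompile


class MinimalNativeRollup:
    """Toy model of a rollup contract whose only job is to track the state root."""

    def __init__(self, genesis_state_root: str, execute: ExecuteFn) -> None:
        self.state_root = genesis_state_root
        self._execute = execute

    def advance(self, post_state_root: str, trace: bytes, gas_used: int) -> None:
        # This single check takes the place of a fraud proof game or SNARK verifier.
        if not self._execute(self.state_root, post_state_root, trace, gas_used):
            raise ValueError("EXECUTE returned false: invalid state transition")
        self.state_root = post_state_root
```

Because the validity check is delegated to L1, the rollup itself carries no EVM emulation logic that would need to be upgraded when the L1 EVM changes.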
