Discovering and Fixing a Critical Vulnerability in Polygon zkEVM

Polygon zkEVM is a layer-2 solution designed to enhance Ethereum's scalability through off-chain transaction processing and the use of zero-knowledge proofs. Verichains identified a critical vulnerability in its zkProver component that made the network susceptible to proof forgery attacks, threatening the integrity and security of funds on both layer 1 and layer 2.

Following identification of the vulnerability, Verichains and the Polygon team collaborated on the development of the fix, which was then reviewed and implemented on mainnet in December 2023.

Discovery

The vulnerability was uncovered by Verichains during our security research on zero-knowledge virtual machines (zkVMs). After we reproduced the bug, which lets an attacker generate counterfeit proofs and manipulate state changes within the network, we reported the finding via the Immunefi bug bounty platform. Immunefi, recognizing the severity of the issue, escalated the matter to the Polygon zkEVM team. The teams discussed the nature of the bug, which was acknowledged as critical. Because the exploit relied on malice from the Trusted Aggregator, a centralized module operated by the zkEVM team, the probability of exploitation was low.

Background

Polygon zkEVM is a layer-2 solution known as a zero-knowledge rollup, which is designed to enhance Ethereum's scalability by processing transactions off-chain and validating state transitions with zero-knowledge proofs. It integrates zero-knowledge proofs with a virtual machine engineered to mimic the Ethereum Virtual Machine (EVM), ensuring compatibility with existing Ethereum tooling and smart contracts.

The Vulnerability

The core functionality of Polygon zkEVM is anchored in the zkProver module, responsible for validating transactions with zero-knowledge proofs. The zkProver performs sophisticated mathematical computations to validate transactions by generating proofs of correctness, which are then verified by a smart contract on the underlying layer 1. These proofs are crucial in ensuring transaction validity while providing an execution environment that is more efficient and less expensive. A verified proof leads to a state change in the network; thus, the ability to forge proofs gives the Trusted Aggregator the ability to alter the network state maliciously, if they were so inclined, potentially leading to loss of funds or freezing assets across the network.

The vulnerability stems from weaknesses in the zkProver's complex recursive proof generation process, enabling attackers to manipulate the network's state. We demonstrated this through a critical proof forgery attack that functioned under any conditions. Due to the zero-knowledge properties of the system, the methodology for generating a forged proof remains concealed.

The prover in zkEVM employs eSTARK (an extended version of STARK) as its primary backend protocol. To reduce the gas cost of proof verification on layer 1, zkEVM uses recursive proving techniques. In the final recursion step, a STARK is converted to a SNARK to achieve constant proof size and verification time. However, the STARK and the SNARK operate over incompatible fields. The STARK works with elements of F_p^3, where p = 2⁶⁴ – 2³² + 1, to facilitate efficient arithmetic operations. In contrast, the SNARK works with elements of F_q, where q is tied to a pairing-friendly elliptic curve group and, in our case, is 254 bits long. This incompatibility introduces potential security issues.
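To make the size mismatch concrete, the following sketch prints the two moduli. The value of p comes from the text; the value of q is an assumption, taken to be the scalar field order of the BN254 (alt_bn128) curve, which matches the 254-bit length mentioned above.

```python
# Goldilocks prime used on the STARK side (value stated in the text).
P = 2**64 - 2**32 + 1

# ASSUMPTION: q is the BN254 (alt_bn128) scalar field order.
Q = 21888242871839275222246405745257275088548364400416034343698204186575808495617

print(hex(P))           # 0xffffffff00000001
print(P.bit_length())   # 64
print(Q.bit_length())   # 254

# Three 64-bit limbs need 192 bits, so one F_q element can hold a whole
# F_p^3 element with room to spare.
fits = 3 * P.bit_length() < Q.bit_length()

# The spare room is the catch: F_q contains far more values than there are
# canonical F_p^3 triples, so not every F_q value encodes a valid triple.
more_values = Q > P**3

print(fits, more_values)  # True True
```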

The first weakness involves the computation of Merkle roots (the FRI-based commitments to polynomials). A tree's leaf is the hash of multiple F_q elements. Since an element (x, y, z) ∈ F_p^3 (approximately 192 bits) can be uniquely represented by a single element of F_q (approximately 254 bits), it makes sense to save circuit constraints by converting each element of F_p^3 to an element of F_q before hashing, via the map (x, y, z) ↦ x + 2⁶⁴y + 2¹²⁸z. However, this conversion does not enforce that x, y, and z are each limited to 64 bits, so each of them can potentially be any element of F_q.
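The consequence of the missing range check can be illustrated with a small sketch. It is a plain-integer model of the packing map, not the actual circuit; the function names are hypothetical and q is again assumed to be the BN254 scalar field order. Without a 64-bit bound on each limb, two different triples pack to the same F_q value and therefore to the same leaf hash.

```python
P = 2**64 - 2**32 + 1
# ASSUMPTION: q is the BN254 (alt_bn128) scalar field order.
Q = 21888242871839275222246405745257275088548364400416034343698204186575808495617

def pack(x, y, z):
    """Model of (x, y, z) -> x + 2^64*y + 2^128*z, computed in F_q."""
    return (x + (1 << 64) * y + (1 << 128) * z) % Q

def is_canonical(x, y, z):
    """True if every limb is a reduced element of F_p."""
    return all(0 <= v < P for v in (x, y, z))

honest = (5, 7, 9)
# Borrow one unit from y and push 2^64 into x: the packed sum is unchanged.
forged = (5 + (1 << 64), 6, 9)

same_packing = pack(*honest) == pack(*forged)
# Reduced modulo p, the forged limbs describe a different F_p^3 element.
forged_mod_p = tuple(v % P for v in forged)

print(same_packing)            # True
print(is_canonical(*honest))   # True
print(is_canonical(*forged))   # False
print(forged_mod_p != honest)  # True
```

The map is only injective on canonical triples, which is why the bound on each limb has to be enforced inside the circuit rather than assumed.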

The second weakness arises from an arithmetic gate that performs a multiply-then-add operation on three F_p^3 elements a, b, and c, returning the element a ∗ b + c of F_p^3. Because a ∗ b is already roughly 128 bits long before modular reduction, the third operand c ends up being allocated more space than necessary.
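One plausible way to picture this is a gate that checks a ∗ b + c = k·p + r with range bounds on the quotient k and the remainder r. The sketch below is a simplified model over the base field F_p rather than F_p^3, and it is not the pil-stark circuit: the function names and the two quotient widths (64 and 80 bits) are hypothetical, chosen only to show how a loose bound leaves room for an oversized c.

```python
P = 2**64 - 2**32 + 1

def mul_add_gate(a, b, c, k, r, quotient_bits):
    """Accept iff a*b + c == k*P + r, with r reduced and k within quotient_bits."""
    return (
        a * b + c == k * P + r
        and 0 <= r < P
        and 0 <= k < (1 << quotient_bits)
    )

def witness(a, b, c):
    """Quotient and remainder an honest or dishonest prover would supply."""
    return divmod(a * b + c, P)

a = b = P - 1

# Canonical c: the quotient is at most P - 1, so 64 bits are enough.
c_ok = P - 1
k_ok, r_ok = witness(a, b, c_ok)
tight_accepts_ok = mul_add_gate(a, b, c_ok, k_ok, r_ok, 64)

# Oversized c (hypothetical value, far above 64 bits).
c_big = (1 << 100) + 3
k_big, r_big = witness(a, b, c_big)
tight_accepts_big = mul_add_gate(a, b, c_big, k_big, r_big, 64)
loose_accepts_big = mul_add_gate(a, b, c_big, k_big, r_big, 80)

print(tight_accepts_ok)   # True
print(tight_accepts_big)  # False
print(loose_accepts_big)  # True
```

With the tight bound, the oversized operand is rejected because its quotient no longer fits; with the loose bound, the same operand passes even though it is not a valid field element.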

Taken together, these two weaknesses in the backend of the zkEVM prover are sufficient to compromise the system.

Impact and Implications

The vulnerability carried significant potential repercussions, enabling the Trusted Aggregator to craft valid proofs for any given computation. This breach could have led to unauthorized modifications within the network, ultimately resulting in the loss of funds from the layer-2 network, and potentially affecting layer-1 deposits as well.

Our team's proof of concept (PoC) demonstrated the feasibility of generating fraudulent proofs for the Fork ID 4 iteration of zkEVM on Ethereum mainnet and the Fork ID 5 iteration on testnet. For reference, the latest mainnet iteration for Polygon zkEVM is Fork ID 8.

The procedure involved:

Generation of Invalid Proofs: We generated two counterfeit proofs to validate Fork ID 4 zkEVM on the Ethereum mainnet at block heights 18066976 and 18026062, and one for validating Fork ID 5 zkEVM on the Goerli testnet at block height 9679280.
Execution Process: We compiled and ran purpose-built scripts with specific inputs to produce and validate the fake proofs. This process altered critical state parameters of zkEVM (StateRoot and LocalExitRoot) to predetermined values, thus modifying the network's state.
Observations: Prior to execution, we verified and recorded the network's state roots, which were legitimate. Post-execution, the state roots were altered, indicating successful manipulation.

The proofs effectively reset zkEVM's StateRoot and LocalExitRoot to 0x0 (or any other predetermined value), essentially erasing the network's state, including balances and deposits.

Remediation

Following Verichains' disclosure, the Polygon zkEVM team conducted an exhaustive review to fully grasp the vulnerability's dynamics and implications. The following summarizes the adjustments made to pil-stark, as validated by the Verichains team:

GL value constraints: Added constraints to ensure that all inputs to the recursiveF verifier circuit are below 2^64. These were incorporated into the StarkVerifier Bn128 template.
GL operations: Introduced new templates for additions and subtractions within the BN128 field, named GLSub (and GLCSub) and GLAdd (and GLCAdd). Additionally, multiplication and addition were separated into distinct operations, eliminating the use of GLCMulAdd for query or evaluation verifications.
GL operation tags: New tags were added to the recursiveF verifier circuit for more precise constraint application during GL operations in the BN128 field. A tag, {maxNum}, denotes the maximum possible signal value, set at p – 1, with p = 0xFFFFFFFF00000001.
Testing: Adjustments were made to accommodate tags in tests. Two new tests were devised to probe edge cases in VerifyEvaluations and VerifyQuery, ensuring tags are correctly applied. One test challenges the template with maximum input values (p – 1), while the other modifies the template to maintain tag values without performing subtraction.
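The effect of the remediation items above can be modeled in a few lines. This is a plain-integer sketch and not the Circom templates themselves: the function names are hypothetical stand-ins, MAX_NUM mirrors the {maxNum} tag value p – 1, and q is assumed to be the BN254 scalar field order.

```python
P = 0xFFFFFFFF00000001
MAX_NUM = P - 1  # mirrors the {maxNum} tag value from the text
# ASSUMPTION: q is the BN254 (alt_bn128) scalar field order.
Q = 21888242871839275222246405745257275088548364400416034343698204186575808495617

def assert_goldilocks(v):
    """Stand-in for a range constraint: reject anything above MAX_NUM."""
    if not 0 <= v <= MAX_NUM:
        raise ValueError(f"value {v} is not a reduced Goldilocks element")

def checked_pack(x, y, z):
    """Pack three limbs into one F_q value only after range-checking each limb."""
    for v in (x, y, z):
        assert_goldilocks(v)
    return (x + (y << 64) + (z << 128)) % Q

def gl_mul(a, b):
    """Multiplication as its own step, with checked inputs."""
    assert_goldilocks(a)
    assert_goldilocks(b)
    return (a * b) % P

def gl_add(a, b):
    """Addition as its own step, with checked inputs."""
    assert_goldilocks(a)
    assert_goldilocks(b)
    return (a + b) % P

def rejects(fn, *args):
    """Helper: True if fn(*args) fails its range checks."""
    try:
        fn(*args)
    except ValueError:
        return True
    return False

# Edge case at the maximum input value: (p-1)*(p-1) + (p-1) is 0 mod p.
edge = gl_add(gl_mul(MAX_NUM, MAX_NUM), MAX_NUM)

print(edge)                                      # 0
print(rejects(checked_pack, 5, 7, 9))            # False
print(rejects(checked_pack, 5 + (1 << 64), 6, 9))  # True
print(rejects(gl_add, 1 << 100, 1))              # True
```

Splitting the multiply-add into two checked steps means every operand passes through the same bound, so no input is granted extra room.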

Links to the fix:

We appreciate the Polygon zkEVM team's prompt and effective response to our reported vulnerability, and we extend our sincere thanks for the bug bounty. The engagement showed a clear dedication to security and professionalism towards the cybersecurity community. It is a testament to the value of open communication and constructive knowledge exchange, which enable developers and security professionals to improve the security of blockchain ecosystems.

About Us

Verichains is a leading provider of blockchain security solutions, specializing in cryptanalysis, security audits, and application security solutions. Renowned for investigating and mitigating some of the largest Web3 hacks, such as Ronin and BNB Chain Bridge, we blend groundbreaking research with practical security solutions to deliver comprehensive protection for the blockchain industry.

Verichains’ world-class security and cryptography research team has successfully identified critical vulnerabilities impacting billions of dollars across the industry, uncovering flaws within the core of Multi-Party Computation (MPC) and Zero-Knowledge Proof (ZKP) implementations by major vendors. As a trusted security partner to leading Web3 companies like BNB Chain, Polygon Labs, Wemix, Aptos, Klaytn, Bullish and DWF Labs, Verichains leverages its deep roots in traditional cybersecurity to deliver cutting-edge solutions for a safer, more secure Web3 ecosystem.
