Original title: Possible futures for the Ethereum protocol, part 2: The Surge
Author: Vitalik, founder of Ethereum, Translated by: Deng Tong, Jinse Finance
Special thanks to Justin Drake, Francesco, Hsiao-wei Wang, @antonttc and Georgios Konstantopoulos
At the beginning, there were two scaling strategies in Ethereum's roadmap. One was "sharding": each node only needs to verify and store a small portion of transactions, rather than verifying and storing every transaction in the chain. This is also how any other peer-to-peer network (e.g. BitTorrent) works, so we could certainly make blockchains work the same way. The other was layer 2 protocols: networks that sit on top of Ethereum, allowing them to fully benefit from its security while keeping most data and computation off the main chain. "Layer 2 protocols" meant state channels in 2015, Plasma in 2017, and rollups in 2019. Rollups are more powerful than state channels or Plasma, but they require a lot of on-chain data bandwidth. Fortunately, by 2019, sharding research had solved the problem of verifying "data availability" at scale. As a result, the two paths merged, and we got the rollup-centric roadmap, which is still Ethereum's scaling strategy today.
The Surge, 2023 Roadmap Edition.
The rollup-centric roadmap proposes a simple division of labor: Ethereum L1 focuses on being a robust and decentralized base layer, while L2s take on the task of helping the ecosystem scale. This is a pattern that recurs throughout society: the court system (L1) is not meant to be super fast and efficient; it is meant to protect contracts and property rights, and entrepreneurs (L2) build on top of that solid base layer to take humanity to (metaphorical and literal) Mars.
This year, the rollup-centric roadmap has seen important successes: Ethereum L1 data bandwidth increased significantly with EIP-4844 blobs, and multiple EVM rollups are now at stage 1. A very heterogeneous and pluralistic implementation of sharding, in which each L2 acts as a "shard" with its own internal rules and logic, is now a reality. But as we have seen, taking this path comes with its own unique challenges. So our task now is to complete the rollup-centric roadmap, solving these problems while preserving the robustness and decentralization that make Ethereum L1 unique.
Surge: Key Objectives
L1+L2 100,000+ TPS
Keeping L1 decentralized and robust
At least some L2 fully inherits the core properties of Ethereum (trustlessness, openness, censorship resistance)
Maximum interoperability between L2s. Ethereum should feel like one ecosystem, not 34 different blockchains.
The Scalability Trilemma
The scalability trilemma is an idea proposed in 2017, which posits a tension between three properties of a blockchain: decentralization (more specifically: low cost of running a node), scalability (more specifically: the ability to process a large number of transactions), and security (more specifically: an attacker needing to compromise a large fraction of the nodes in the entire network to make even a single transaction fail).
It is worth noting that the trilemma is not a theorem, and the post introducing it does not come with a mathematical proof. It does give a heuristic mathematical argument: if a decentralization-friendly node (such as a consumer laptop) can verify N transactions per second, and you have a chain that processes k*N transactions per second, then either (i) each transaction is only seen by 1/k of the nodes, which means an attacker only needs to compromise a few nodes to push through a bad transaction, or (ii) your nodes will become powerful and your chain will not be decentralized. The purpose of the post was never to show that breaking the trilemma is impossible; rather, it was to show that breaking the trilemma is hard: it requires somehow thinking outside the box that the argument implies.
Over the years, some high-performance chains have claimed to solve the trilemma without doing anything clever at the fundamental architecture level, usually by applying software engineering tricks to optimize the node. This is always misleading, and running a node in such chains always ends up being far harder than running one in Ethereum. This post explores many of the subtleties of why this is the case (and why L1 client software engineering alone cannot scale Ethereum itself).
However, the combination of data availability sampling and SNARKs does solve the trilemma: it allows a client to verify that some amount of data is available, and that some number of computation steps were executed correctly, while downloading only a small portion of that data and running only a much smaller amount of computation. SNARKs are trustless. Data availability sampling has a nuanced few-of-N trust model, but it preserves the fundamental property of non-scalable chains: even a 51% attack cannot force the network to accept a bad block.
Another approach to solving the trilemma is the Plasma architecture, which uses clever techniques to push the responsibility of watching for data availability onto users in an incentive-compatible way. Back in 2017-2019, when fraud proofs were all we had to scale computation, Plasma was very limited in what it could do securely, but the mainstreaming of SNARKs has made the Plasma architecture far more viable for a wider range of use cases than before.
Further progress on data availability sampling
What problem are we trying to solve?
As of the Dencun upgrade going live on March 13, 2024, the Ethereum blockchain has had 3 "blobs" of approximately 125 kB per 12-second slot, or approximately 375 kB of data availability bandwidth per slot. With transaction data published directly on-chain, an ERC20 transfer takes about 180 bytes, so the maximum TPS of rollups on Ethereum is:
375000 / 12 / 180 = 173.6 TPS
If we add Ethereum's calldata (theoretical maximum: 30 million gas per slot / 16 gas per byte = 1,875,000 bytes per slot), this becomes 607 TPS. Increasing the blob count target to 8-16 would give us 463-926 TPS.
This is a significant improvement over Ethereum L1, but it is not enough. We want more scalability. Our mid-term goal is 16 MB per slot, which, combined with improvements in rollup data compression, would give us about 58,000 TPS.
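To make the arithmetic easy to check, here is a small Python sketch reproducing these figures. It assumes only the numbers stated above: a 12-second slot, ~125 kB blobs, an ~180-byte on-chain ERC20 transfer, the current 3-blob target, the 8-16 blob target range, and the 16 MB mid-term goal.

```python
# Back-of-the-envelope TPS estimates, using the assumptions stated in the text above.
SLOT_SECONDS = 12
BLOB_BYTES = 125_000
ERC20_TRANSFER_BYTES = 180

def max_tps(bytes_per_slot: int) -> float:
    """Upper bound on rollup TPS for a given per-slot data budget."""
    return bytes_per_slot / SLOT_SECONDS / ERC20_TRANSFER_BYTES

print(max_tps(3 * BLOB_BYTES))    # today's 3-blob target:        ~173.6 TPS
print(max_tps(8 * BLOB_BYTES))    # 8-blob target:                ~463 TPS
print(max_tps(16 * BLOB_BYTES))   # 16-blob target:               ~926 TPS
print(max_tps(16_000_000))        # mid-term 16 MB/slot goal:     ~7407 TPS before compression
```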
What is it and how does it work?
PeerDAS is a relatively simple implementation of "one-dimensional sampling". Each blob in Ethereum is a degree-4096 polynomial over a 253-bit prime field. We broadcast "shares" of the polynomial, where each share consists of 16 evaluations at adjacent coordinates taken from a total set of 8192 coordinates. Any 4096 of the 8192 evaluations (with currently proposed parameters: any 64 of the 128 possible samples) can recover the blob.
PeerDAS works by having each client listen on a small number of subnets, where the i-th subnet broadcasts the i-th sample of any blob, and additionally request the blobs it needs on other subnets by asking peers in the global p2p network (who will be listening on different subnets). A more conservative version, SubnetDAS, uses only the subnet mechanism, without the additional layer of asking peers. The current proposal is for nodes participating in proof of stake to use SubnetDAS, and for other nodes (i.e. "clients") to use PeerDAS.
In theory, we can extend 1D sampling quite far: if we increase the blob count maximum to 256 (and thus the target to 128), then we would reach the 16 MB target while data availability sampling costs each node only 16 samples * 128 blobs * 512 bytes per sample per blob = 1 MB of data bandwidth per slot. This is just barely within our tolerance: it is doable, but it means that bandwidth-constrained clients cannot sample. We could optimize this somewhat by reducing the blob count and increasing the blob size, but that would make reconstruction more expensive.
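As a rough illustration of the bandwidth math and of why a handful of samples suffices, here is a sketch using the parameters quoted above (16 samples of 512 bytes across 128 blobs). The security calculation is the generic erasure-coding argument (if less than half of the extended data is available the blob cannot be reconstructed, so each random sample fails to be answered with probability at least 1/2); it is not a precise model of PeerDAS.

```python
# Per-node DAS bandwidth for the parameters quoted above, plus the generic
# "each sample halves the attacker's chance" security argument.
SAMPLES_PER_BLOB = 16
SAMPLE_BYTES = 512
BLOB_COUNT = 128          # target if the blob count max is raised to 256

bandwidth_per_slot = SAMPLES_PER_BLOB * BLOB_COUNT * SAMPLE_BYTES
print(bandwidth_per_slot)                  # 1,048,576 bytes, i.e. ~1 MB per slot

# If an attacker publishes less than 50% of the extended data, the blob cannot be
# recovered, and each uniformly random sample is unavailable with probability >= 1/2.
def p_undetected(num_samples: int) -> float:
    """Chance a node sees all its samples answered even though the data is unrecoverable."""
    return 0.5 ** num_samples

print(p_undetected(SAMPLES_PER_BLOB))      # ~1.5e-5 per blob for a single node
```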
So eventually we want to go a step further and do 2D sampling, which samples randomly not only within blobs but also between blobs. The linearity property of KZG commitments is used to "extend" the set of blobs in a block with a list of new "virtual blobs" that redundantly encode the same information.
2D sampling. Source: a16z
Crucially, computing the extension of the commitments does not require having the blobs, so the scheme is fundamentally friendly to distributed block construction. The node actually building the block only needs to have the blob KZG commitments, and can itself rely on DAS to verify the availability of the blobs. 1D DAS is also inherently friendly to distributed block construction.
What are the connections with existing research?
Original article introducing data availability (2018): https://github.com/ethereum/research/wiki/A-note-on-data-availability-and-erasure-coding
Follow-up paper: https://arxiv.org/abs/1809.09044
DAS’s explainer post, Paradigm: https://www.paradigm.xyz/2022/08/das
2D data availability with KZG commitments: https://ethresear.ch/t/2d-data-availability-with-kate-commitments/8081
PeerDAS on ethresear.ch: https://ethresear.ch/t/peerdas-a-simpler-das-approach-using-battle-tested-p2p-components/16541 and paper: https://eprint.iacr.org/2024/1362
EIP-7594: https://eips.ethereum.org/EIPS/eip-7594
SubnetDAS on ethresear.ch: https://ethresear.ch/t/subnetdas-an-intermediate-das-approach/17169
Nuances of data recoverability in 2D sampling: https://ethresear.ch/t/nuances-of-data-recoverability-in-data-availability-sampling/16256
What else needs to be done and what trade-offs need to be made?
The immediate next step is to finish the implementation and rollout of PeerDAS. From there, it is an incremental effort to keep increasing the blob count on PeerDAS, while carefully watching the network and improving the software to ensure safety. In the meantime, we want more academic work on formalizing PeerDAS and other versions of DAS, and on their interaction with issues such as the safety of the fork choice rule.
Looking further ahead, we need to do more work to figure out the ideal version of 2D DAS and prove its security properties. We also want to eventually migrate from KZG to a quantum-resistant alternative that requires no trusted setup. Currently, we do not know of any candidates that are friendly to distributed block construction. Even the expensive "brute force" technique of using recursive STARKs to generate validity proofs for reconstructing rows and columns is not enough, because while a STARK is technically O(log(n) * log(log(n))) hashes in size (with STIR), in practice a STARK is almost as big as an entire blob.
In the long run, I see the realistic paths as:
Implement the ideal 2D DAS;
Stick with 1D DAS, sacrificing sampling bandwidth efficiency and accepting a lower data cap for simplicity and robustness;
(Hard Pivot) Abandon DA and fully embrace Plasma as the primary layer 2 architecture we focus on.
We can look at these in terms of trade-offs:
Note that this choice remains even if we decide to scale execution directly on L1. This is because if L1 is to handle a large number of TPS, L1 blocks will become very large, and clients will want an efficient way to verify that they are correct, so we would have to use the same technology that powers rollups (ZK-EVM and DAS) on L1.
How does it interact with the rest of the roadmap?
If data compression is implemented (see below), the need for 2D DAS is reduced, or at least delayed, and it is reduced even further if Plasma becomes widely used. DAS also poses a challenge to distributed block construction protocols and mechanisms: while DAS is theoretically friendly to distributed reconstruction, in practice this needs to be combined with inclusion list proposals and the fork choice mechanics surrounding them.
Data Compression
What problem are we trying to solve?
Each transaction in a rollup takes up a lot of on-chain data space: an ERC20 transfer takes about 180 bytes. Even with ideal data availability sampling, this caps the scalability of layer 2 protocols. With 16 MB per slot, we get:
16000000 / 12 / 180 = 7407 TPS
What if in addition to solving the numerator, we could also solve the denominator and make each transaction in the rollup take up fewer bytes on-chain?
What is it and how does it work?
I think the best explanation is this picture from two years ago:
The simplest gain is zero-byte compression: replace each long sequence of zero bytes with two bytes representing the number of zero bytes. Going a step further, we exploit specific properties of transactions:
Signature aggregation - We switch from ECDSA signatures to BLS signatures, which have the property that many signatures can be combined into a single signature that attests to the validity of all the original signatures. This is not considered for L1 because the computational costs of verification are higher (even with aggregation), but in a data-scarce environment like L2, it arguably makes sense. The aggregation features of ERC-4337 provide one path to making this happen.
Replacing addresses with pointers - If an address has been used before, we can replace the 20-byte address with a 4-byte pointer to its location in history. This is necessary to achieve the largest gains, although it takes effort to implement, because it requires (at least a portion of) the blockchain's history to effectively become part of the state.
Custom serialization of transaction values - Most transaction values have only a few significant digits, e.g. 0.25 ETH is represented as 250,000,000,000,000,000 wei. Max-basefees and priority fees work similarly. We can therefore represent most currency values very compactly using a custom decimal floating point format, or even a dictionary of especially common values (a rough sketch of this, together with zero-byte compression, follows after this list).
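As a rough illustration of how much room there is, here is a minimal Python sketch of two of the ideas above: zero-byte run-length compression and a "decimal float" encoding for currency values. The byte layouts are hypothetical and for illustration only; they do not correspond to any particular rollup's wire format.

```python
def compress_zero_bytes(data: bytes) -> bytes:
    """Replace each run of zero bytes with two bytes: a 0x00 marker and the run length."""
    out, i = bytearray(), 0
    while i < len(data):
        if data[i] == 0:
            run = 0
            while i < len(data) and data[i] == 0 and run < 255:
                run += 1
                i += 1
            out += bytes([0, run])
        else:
            out.append(data[i])
            i += 1
    return bytes(out)

def encode_value_compact(wei: int) -> bytes:
    """Encode a wei amount as (exponent, mantissa): 0.25 ETH = 25 * 10**16 -> 2 bytes."""
    exponent = 0
    while wei > 0 and wei % 10 == 0:
        wei //= 10
        exponent += 1
    mantissa = wei.to_bytes(max(1, (wei.bit_length() + 7) // 8), "big")
    return bytes([exponent]) + mantissa

# A typical ERC20 transfer: 4-byte selector + 32-byte padded address + 32-byte amount.
calldata = (
    bytes.fromhex("a9059cbb")
    + bytes(12) + bytes.fromhex("11" * 20)                  # zero-padded recipient address
    + (250_000_000_000_000_000).to_bytes(32, "big")         # 0.25 ETH, mostly leading zeros
)
print(len(calldata), len(compress_zero_bytes(calldata)))    # 68 -> 36 bytes
print(len(encode_value_compact(250_000_000_000_000_000)))   # 2 bytes instead of 32
```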
What are the connections with existing research?
Exploration from sequence.xyz: https://sequence.xyz/blog/compressing-calldata
Calldata optimization contract for L2, from ScopeLift: https://github.com/ScopeLift/l2-optimizoooors
Another strategy - proof-of-validity based rollups (aka ZK-rollups) publish state diffs instead of transactions: https://ethresear.ch/t/rollup-diff-compression-application-level-compression-strategies-to-reduce-the-l2-data-footprint-on-l1/9975
BLS Wallet - BLS aggregation via ERC-4337: https://github.com/getwax/bls-wallet
What else needs to be done and what trade-offs need to be made?
The main work left to do is to put the above solution into practice. The main trade-offs are:
Switching to BLS signatures requires significant effort and reduces compatibility with trusted hardware chips that would improve security. ZK-SNARK wrappers of other signature schemes could be used instead.
Dynamic compression (such as replacing addresses with pointers) complicates client code.
Publishing state differences to the chain instead of transactions reduces auditability and makes many software (such as block explorers) non-functional.
How does it interact with the rest of the roadmap?
The adoption of ERC-4337, and the eventual enshrinement of parts of it in the L2 EVM, can significantly accelerate the deployment of aggregation techniques. Enshrining parts of ERC-4337 on L1 can accelerate its deployment on L2s.
Generalized Plasma
What problem are we trying to solve?
Even with 16 MB blobs and data compression, 58,000 TPS is not necessarily enough to completely take over consumer payments, decentralized social, or other high-bandwidth sectors, and this becomes especially true if we start to take privacy into account, which could reduce scalability by 3-8x. For high-volume, low-value applications, one option today is validium, which keeps data off-chain and has an interesting security model: the operator cannot steal users' funds, but it can disappear and temporarily or permanently freeze all users' funds. But we can do better.
What is it and how does it work?
Plasma is a scaling solution in which an operator publishes blocks off-chain and puts the Merkle roots of those blocks on-chain (unlike rollups, which put the full block on-chain). For each block, the operator sends each user a Merkle branch proving what did or did not happen to that user's assets. Users can withdraw their assets by providing a Merkle branch. Importantly, this branch does not have to be rooted in the latest state, so even if data availability fails, users can still recover their assets by withdrawing the latest state available to them. If a user submits an invalid branch (for example, exiting an asset they have already sent to someone else, or the operator creating an asset out of thin air), an on-chain challenge mechanism can adjudicate to whom the asset rightfully belongs.
Plasma Cash chain diagram. A transaction that spends coin i is put into the i-th position in the tree. In this example, assuming all previous trees are valid, we know that Eve currently owns coin 1, David owns coin 4, and George owns coin 6.
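To make the exit mechanism above concrete, here is a minimal Python sketch of Merkle-branch verification for a coin slot, in the Plasma Cash style where the transaction spending coin i sits at leaf i. This is illustrative only: a real construction also needs the on-chain challenge game, an exit queue, and (in the SNARK-based designs discussed below) validity proofs of each root.

```python
import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Root of a simple binary Merkle tree (leaf count assumed to be a power of two)."""
    layer = [h(leaf) for leaf in leaves]
    while len(layer) > 1:
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]

def merkle_branch(leaves: list[bytes], index: int) -> list[bytes]:
    """Sibling hashes proving that leaves[index] is included under the root."""
    layer, branch = [h(leaf) for leaf in leaves], []
    while len(layer) > 1:
        branch.append(layer[index ^ 1])
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
        index //= 2
    return branch

def verify_branch(leaf: bytes, index: int, branch: list[bytes], root: bytes) -> bool:
    """What an exit contract would check: the coin's leaf is included under the posted root."""
    node = h(leaf)
    for sibling in branch:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

# Coin i's latest transaction sits at leaf i; the operator posts only the root on-chain.
leaves = [f"tx spending coin {i}".encode() for i in range(8)]
root = merkle_root(leaves)
proof = merkle_branch(leaves, 4)
print(verify_branch(leaves[4], 4, proof, root))   # True: coin 4's owner can exit with this branch
```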
Early versions of Plasma could only handle the payments use case and could not be effectively generalized further. However, if we require each root to be verified with a SNARK, Plasma becomes much more powerful. Each challenge game can be greatly simplified, because we take away most of the possible ways for the operator to cheat. New paths also open up, allowing Plasma techniques to be extended to a much more general class of assets. Finally, in the case where the operator does not cheat, users can withdraw their funds instantly, without waiting for a one-week challenge period.
One way (not the only way) to make an EVM Plasma chain: use a ZK-SNARK to construct a parallel UTXO tree that reflects the balance changes made by the EVM, and defines a unique mapping of what is "the same coin" at different points in history. A Plasma construction can then be built on top of that.
An important insight is that a Plasma system does not need to be perfect. Even if you can only protect a subset of assets (for example, just the coins that have not moved in the past week), you have already greatly improved on the status quo for an ultra-scalable EVM, which is a validium.
Another class of constructions is hybrid Plasma/rollup structures, such as Intmax. These constructions put a very small amount of data per user on-chain (e.g. 5 bytes), and by doing so achieve properties somewhere between Plasma and rollups: in the Intmax case, you get a very high level of scalability and privacy, though even in a 16 MB world the theoretical capacity cap is about 16,000,000 / 12 / 5 = 266,667 TPS.
What are the connections with existing research?
Original Plasma paper: https://plasma.io/plasma-deprecated.pdf
Plasma Cash: https://ethresear.ch/t/plasma-cash-plasma-with-much-less-per-user-data-checking/1298
Plasma Cashflow: https://hackmd.io/DgzmJIRjSzCYvl4lUjZXNQ?view#-Exit
Intmax (2023): https://eprint.iacr.org/2023/1082
What else needs to be done and what trade-offs need to be made?
The main remaining task is to bring Plasma systems to production. As mentioned above, "Plasma vs. validium" is not a binary: any validium can improve its security properties at least a little by adding Plasma features to the exit mechanism. The research part is about getting optimal properties (in terms of trust requirements, worst-case L1 gas costs, and DoS vulnerability) for the EVM, as well as alternative application-specific constructions. In addition, Plasma's greater conceptual complexity relative to rollups needs to be addressed directly, both through research and through building better general frameworks.
The main disadvantage of Plasma designs is that they depend more on operators and are harder to make "based", although hybrid Plasma/rollup designs can often avoid this weakness.
How does it interact with the rest of the roadmap?
The more effective Plasma solutions are, the less pressure there is on L1 to have high-performance data availability. Moving activity to L2 also reduces MEV pressure on L1.
Mature L2 proof system
What problem are we trying to solve?
Today, most rollups are not yet actually trustless; there is a security council that can override the behavior of the (optimistic or validity) proof system. In some cases, the proof system is not even live at all, or if it is, it only has an "advisory" function. The furthest along are (i) a few application-specific rollups, such as Fuel, which are trustless, and (ii) as of this writing, Optimism and Arbitrum, two full-EVM rollups that have achieved a partial-trustlessness milestone known as "stage 1". The reason rollups have not gone further is concern about bugs in the code. We need trustless rollups, so we need to tackle this problem head on.
What is it and how does it work?
First, let's review the "stage" system that was initially introduced in this article. There are more detailed requirements, but in summary:
Stage 0: Users must be able to run a node and sync the chain. It is fine if validation is fully trusted/centralized.
Stage 1: There must be a working (trustless) proof system that ensures only valid transactions are accepted. A security council that can override the proof system is allowed, but only with a 75% voting threshold. Additionally, a quorum-blocking portion of the council (i.e. 26% or more) must be outside the main company building the rollup. A less powerful upgrade mechanism (e.g. a DAO) is allowed, but it must have a long enough delay that if a malicious upgrade is approved, users can exit their funds before the upgrade goes live.
Stage 2: There must be a working (trustless) proof system that ensures only valid transactions are accepted. The council is only allowed to intervene in the case of provable bugs in the code, e.g. if two redundant proof systems disagree with each other, or if one proof system accepts two different post-state roots for the same block (or accepts nothing for a sufficiently long period of time, such as a week). An upgrade mechanism is allowed, but it must have a very long delay.
Our goal is to reach stage 2. The main challenge in reaching stage 2 is gaining enough confidence that the proof system is actually trustworthy enough. There are two main ways to do this:
Formal verification: We can use modern mathematical and computational techniques to prove that an (optimistic or validity) proof system only accepts blocks that satisfy the EVM specification. These techniques have been around for decades, but recent advances (e.g. Lean 4) have made them much more practical, and advances in AI-assisted proving may further accelerate this trend.
Multi-provers: Build multiple proof systems, and put funds behind a 2-of-3 (or larger) multisig between those proof systems and a security council (and/or other gadgets with trust assumptions, such as TEEs). If the proof systems agree, the council has no power. If they disagree, the council can only choose one of them; it cannot unilaterally impose its own answer.
A stylized diagram of a multi-prover setup, combining an optimistic proof system, a validity proof system, and a security council.
What are the connections with existing research?
EVM K Semantics (formal verification work since 2017): https://github.com/runtimeverification/evm-semantics
A presentation on the multi-prover idea (2022): https://www.youtube.com/watch?v=6hfVzCWT6YI
Taiko plans to use multi-proofs: https://docs.taiko.xyz/core-concepts/multi-proofs/
What else needs to be done and what trade-offs need to be made?
For formal verification, quite a lot. We need to create a formally verified version of an entire SNARK prover for the EVM. This is an extremely complex project, although one we have already started. There is a trick that significantly simplifies the task: we can make a formally verified SNARK prover for a minimal VM, e.g. RISC-V or Cairo, and then write an implementation of the EVM in that minimal VM (and formally prove its equivalence to some other EVM specification).
There are two main remaining pieces to multi-provers. First, we need enough confidence in at least two different proof systems: both that they are reasonably secure on their own, and that if they break, they break for different and unrelated reasons. Second, we need a very high level of assurance in the underlying logic that merges the proof systems. This is a much smaller piece of code. There are ways to make it very small - simply store the funds in a secure multisig contract whose signers are contracts representing the individual proof systems - but this comes at the cost of high on-chain gas costs. Some balance between efficiency and security needs to be found.
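As a minimal sketch of the adjudication logic described above (Python pseudocode, not based on any deployed contract): funds only move according to a state root that the proof systems agree on, and the security council can break ties between them but cannot impose a root of its own.

```python
from typing import Optional

def resolve_state_root(
    optimistic_root: bytes,
    validity_root: bytes,
    council_choice: Optional[bytes] = None,
) -> Optional[bytes]:
    """2-of-3-style resolution between two proof systems and a security council.

    - If the two proof systems agree, their answer is final; the council has no power.
    - If they disagree, the council may pick one of the two proposed roots, but cannot
      introduce a root of its own.
    """
    if optimistic_root == validity_root:
        return optimistic_root
    if council_choice in (optimistic_root, validity_root):
        return council_choice
    return None  # unresolved: funds stay frozen until the dispute is settled

# Agreement needs no council input; disagreement lets the council tie-break only.
assert resolve_state_root(b"root_A", b"root_A") == b"root_A"
assert resolve_state_root(b"root_A", b"root_B", council_choice=b"root_B") == b"root_B"
assert resolve_state_root(b"root_A", b"root_B", council_choice=b"root_C") is None
```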
How does it interact with the rest of the roadmap?
Moving activities to L2 reduces the MEV pressure on L1.
Cross-L2 interoperability improvements
What problem are we trying to solve?
A big challenge with the L2 ecosystem today is that it is difficult for users to navigate. Furthermore, the simplest approaches often reintroduce trust assumptions: centralized bridges, RPC clients, and so on. If we are serious about the idea that L2s are part of Ethereum, we need to make using the L2 ecosystem feel like using a unified Ethereum ecosystem.
A pathologically bad example of cross-L2 UX (even dangerous: I personally lost $100 due to a wrong chain choice here) - while this is not Polymarket's fault, cross-L2 interoperability should be the responsibility of wallets and the Ethereum standards (ERC) community. In a well-functioning Ethereum ecosystem, sending tokens from L1 to L2, or from one L2 to another, should feel just like sending tokens within the same L1.
What is it and how does it work?
There are many categories of cross-L2 interoperability improvements. In general, the way to come up with them is to note that, in theory, rollup-centric Ethereum is the same thing as L1 execution sharding, and then ask where the current Ethereum L2 ecosystem falls short of that ideal in practice. Here are a few:
Chain-specific addresses: The chain (L1, Optimism, Arbitrum...) should be part of the address. Once this is implemented, the cross-L2 sending flow becomes simply putting the address into the "Send" field, at which point the wallet can figure out in the background how to make the send (including using bridge protocols); see the sketch after this list.
Chain-specific payment requests: It should be easy and standardized to construct a message of the form "send me X tokens of type Y on chain Z". This has two main use cases: (i) payments, whether person-to-person or person-to-merchant, and (ii) dapps requesting funds, e.g. the Polymarket example above.
Cross-chain swaps and gas payments: There should be a standardized open protocol for expressing cross-chain operations, such as "I send 1 ETH on Optimism to whoever sends me 0.9999 ETH on Arbitrum" and "I send 0.0001 ETH on Optimism to whoever includes this transaction on Arbitrum". ERC-7683 is an attempt at the former, and RIP-7755 is an attempt at the latter, though both are more general than just these specific use cases.
Light clients: Users should be able to actually verify the chains they are interacting with, rather than just trusting RPC providers. A16z Crypto's Helios does this for Ethereum itself, but we need to extend this trustlessness to L2s. ERC-3668 (CCIP-read) is one strategy for achieving this.
How a light client updates its view of the Ethereum header chain. Once you have the header chain, you can use Merkle proofs to verify any state object. Once you have the correct L1 state object, you can use Merkle proofs (and possibly signatures, if you want to check pre-confirmations) to verify any state object on L2. Helios already does the former. Extending to the latter is a standardization challenge.
Keystore wallets: Today, if you want to update the keys that control a smart contract wallet, you must do so on all N chains on which that wallet exists. Keystore wallets are a technique that allows the keys to live in one place, and then be read from any L2 that has a copy of the wallet. This means updates only need to happen once. To be efficient, keystore wallets require a standardized way for L2s to read L1 at no cost; two proposals for this are L1SLOAD and REMOTESTATICCALL.
A stylized diagram of how the Keystore wallet works.
A more radical "shared token bridge" idea: Imagine a world where all L2s are validity-proof rollups that commit to Ethereum every slot. Even in this world, "natively" moving assets from one L2 to another requires a withdrawal and a deposit, which costs a significant amount of L1 gas. One way to solve this is to create a shared minimal rollup whose only function is to maintain the balances of which L2 holds how many of which types of tokens, and to allow those balances to be updated collectively by a series of cross-L2 send operations initiated by any of the L2s. This would allow cross-L2 transfers to happen without paying L1 gas for each transfer, and without requiring liquidity-provider-based techniques such as ERC-7683.
Synchronous composability: Allow synchronous calls to happen either between a specific L2 and L1, or between multiple L2s. This could help improve the financial efficiency of DeFi protocols. The former can be done without any cross-L2 coordination; the latter requires shared sequencing. Based rollups are automatically friendly to all of these techniques.
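As a small illustration of the chain-specific address item at the top of this list, here is a sketch of parsing an ERC-3770-style "shortName:address" string so a wallet can figure out routing in the background. The short-name registry here is a hypothetical stand-in; a real wallet would resolve names against a shared, community-maintained chain list.

```python
# Hypothetical registry mapping chain short names to chain IDs; a real wallet would
# use a shared, community-maintained list rather than a hard-coded dict.
CHAIN_REGISTRY = {"eth": 1, "oeth": 10, "arb1": 42161}

def parse_chain_specific_address(value: str) -> tuple[int, str]:
    """Parse an ERC-3770-style 'shortName:address' string into (chain_id, address)."""
    short_name, _, address = value.partition(":")
    if short_name not in CHAIN_REGISTRY:
        raise ValueError(f"unknown chain short name: {short_name}")
    if not (address.startswith("0x") and len(address) == 42):
        raise ValueError(f"malformed address: {address}")
    return CHAIN_REGISTRY[short_name], address

# The wallet can now figure out in the background how to deliver funds to this chain,
# e.g. by picking a bridge route to the resolved chain_id.
chain_id, address = parse_chain_specific_address("arb1:" + "0x" + "ab" * 20)
print(chain_id, address)
```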
What are the connections with existing research?
Chain specific address: ERC-3770: https://eips.ethereum.org/EIPS/eip-3770
ERC-7683: https://eips.ethereum.org/EIPS/eip-7683
RIP-7755: https://github.com/wilsoncusack/RIPs/blob/cross-l2-call-standard/RIPS/rip-7755.md
Keystore rollup design: https://hackmd.io/@haichen/keystore
Helios: https://github.com/a16z/helios
ERC-3668 (sometimes called CCIP-read): https://eips.ethereum.org/EIPS/eip-3668
Justin Drake's proposal for "based (shared) preconfirmations": https://ethresear.ch/t/based-preconfirmations/17353
L1SLOAD (RIP-7728): https://ethereum-magicians.org/t/rip-7728-l1sload-precompile/20388
Remote calls in Optimism: https://github.com/ethereum-optimism/ecosystem-contributions/issues/76
AggLayer, which includes the idea of a shared token bridge: https://github.com/AggLayer
What else needs to be done and what trade-offs need to be made?
Many of the examples above face the standards dilemma of when to standardize and at which layers to standardize. If you standardize too early, you risk entrenching an inferior solution. If you standardize too late, you risk unnecessary fragmentation. In some cases, there is both a short-term solution that is less powerful but easier to implement, and a long-term solution that is "eventually correct" but will take quite a while to get there.
What is unique about this section is that these tasks are not just technical problems: they are also (perhaps primarily!) social problems. They require L2s, wallets, and L1 to cooperate. Our ability to handle this successfully is a test of our ability to stick together as a community.
How does it interact with the rest of the roadmap?
Most of these proposals are "higher-layer" constructions, and so do not have much impact on L1 considerations. One exception is shared sequencing, which has a large impact on MEV.
Scaling execution on L1
What problem are we trying to solve?
If L2 becomes very scalable and successful, but L1 is still only able to process a very small number of transactions, there are a number of risks that could arise for Ethereum:
The economic position of the ETH asset becomes riskier, which in turn affects the long-term security of the network.
Many L2s benefit from close ties to the highly developed financial ecosystem on L1, and if this ecosystem is significantly weakened, the incentive to become an L2 (rather than an independent L1) will be weakened.
It will take a long time for L2 to have exactly the same safety guarantees as L1.
If an L2 fails (e.g. due to a malicious or vanishing operator), users still need to go through L1 to recover their assets. Therefore, L1 needs to be robust enough to actually handle the highly complex and messy process of winding down an L2, at least occasionally.
For these reasons, it is valuable to continue to expand L1 itself and ensure that it can continue to adapt to more and more uses.
What is it and how does it work?
The simplest way to scale is to simply increase the gas limit. However, this risks centralizing L1 and thereby weakening another important property that makes Ethereum L1 so powerful: its credibility as a robust base layer. There is an ongoing debate about how much of a gas limit increase is sustainable, and this will also change depending on which other technologies are implemented to make larger blocks easier to verify (e.g. history expiry, statelessness, L1 EVM validity proofs). Another important thing to keep improving is the efficiency of Ethereum client software, which is far more optimized today than it was five years ago. An effective L1 gas limit increase strategy will involve accelerating these verification technologies.
Another scaling strategy involves identifying specific functions and types of computation that can be made cheaper without compromising the decentralization of the network or its security properties. Examples of this include:
EOF - A new EVM bytecode format that is friendlier to static analysis and allows for faster implementations. Given these efficiencies, EOF bytecode can be given a lower gas cost.
Multi-dimensional gas pricing - Establishing separate base fees and limits for compute, data, and storage could increase the average capacity of Ethereum L1 without increasing its maximum capacity (and thus without creating new security risks); see the sketch after this list.
Lowering gas costs for certain opcodes and precompiles - Historically, we have done several rounds of increasing gas costs for certain underpriced operations to avoid denial of service attacks. Something we have done less of, but could do much more of, is reducing the gas costs of operations that are overpriced. For example, addition is much cheaper than multiplication, but the ADD and MUL opcodes currently cost the same. We could make ADD cheaper, and even simpler opcodes such as PUSH cheaper still. EOF as a whole is also more efficient in this respect.
EVM-MAX and SIMD: EVM-MAX ("Modular Arithmetic Extensions") is a proposal to allow more efficient native big-number modular arithmetic as a separate module of the EVM. Values computed by EVM-MAX calculations can only be accessed by other EVM-MAX opcodes unless deliberately exported; this allows more room to store these values in optimized formats. SIMD ("single instruction, multiple data") is a proposal to allow the same instruction to be executed efficiently over arrays of values. Together, the two could create a powerful coprocessor alongside the EVM that could be used to implement cryptographic operations much more efficiently. This would be particularly useful for privacy protocols and for L2 proof systems, so it would help both L1 and L2 scaling.
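To make multidimensional gas pricing concrete, here is a toy sketch in the spirit of EIP-1559 and EIP-7706: each resource gets its own basefee that adjusts independently based on that resource's usage in the previous block. The resource names, targets, and starting fees are illustrative placeholders, not proposed parameters.

```python
# Toy sketch of per-resource basefee updates in the spirit of EIP-1559 / EIP-7706.
# Targets and starting fees below are illustrative placeholders, not proposed parameters.
BASEFEE_MAX_CHANGE = 8  # denominator: a basefee moves by at most ~1/8 per block

RESOURCE_TARGETS = {"execution_gas": 15_000_000, "blob_data": 393_216, "calldata": 1_000_000}

def update_basefees(basefees: dict[str, int], used: dict[str, int]) -> dict[str, int]:
    """Each resource's basefee rises or falls based on its own usage vs. its own target."""
    new_fees = {}
    for resource, target in RESOURCE_TARGETS.items():
        fee = basefees[resource]
        delta = fee * (used[resource] - target) // target // BASEFEE_MAX_CHANGE
        new_fees[resource] = max(1, fee + delta)
    return new_fees

fees = {"execution_gas": 10_000_000_000, "blob_data": 1, "calldata": 100_000_000}
usage = {"execution_gas": 30_000_000, "blob_data": 0, "calldata": 500_000}
print(update_basefees(fees, usage))
# Execution gas was over its target, so its basefee rises; the underused resources'
# fees fall (down to a floor of 1), each independently of the others.
```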
These improvements will be discussed in more detail in a future article about Splurge.
Finally, the third strategy is native rollups (or "enshrined rollups"): essentially, creating many copies of the EVM that run in parallel, leading to a model equivalent to what rollups can provide, but much more natively integrated into the protocol.
What are the connections with existing research?
Polynya’s Ethereum L1 expansion roadmap: https://polynya.mirror.xyz/epju72rsymfB-JK52_uYI7HuhJ-W_zM735NdP7alkAQ
Multi-dimensional Gas Pricing: https://vitalik.eth.limo/general/2024/05/09/multidim.html
EIP-7706: https://eips.ethereum.org/EIPS/eip-7706
EOF: https://evmobjectformat.org/
EVM-MAX: https://ethereum-magicians.org/t/eip-6601-evm-modular-arithmetic-extensions-evmmax/13168
SIMD: https://eips.ethereum.org/EIPS/eip-616
Native rollup: https://mirror.xyz/ohotties.eth/P1qSCcwj2FZ9cqo3_6kYI4S2chW5K5tmEgogk6io1GE
Interview with Max Resnick on the value of extending L1: https://x.com/BanklessHQ/status/1831319419739361321
Justin Drake on scaling with SNARKs and native rollups: https://www.reddit.com/r/ethereum/comments/1f81ntr/comment/llmfi28/
What else needs to be done and what trade-offs need to be made?
There are three strategies for L1 scaling, which can be performed individually or in parallel:
Improve technology (e.g. client code, stateless clients, history expiration) to make L1 easier to verify, then increase the gas limit
Reduce costs for specific operations and increase average capacity without increasing worst-case risk
Native rollups (i.e. “create N parallel copies of the EVM”, though presumably giving developers a lot of flexibility in the parameters of how the copies are deployed)
It is worth understanding that these are different technologies with different trade-offs. For example, native rollups have many of the same weaknesses as regular rollups in terms of composability: you cannot send a single transaction that synchronously performs operations across many of them, the way you can with contracts on the same L1 (or L2). Raising the gas limit takes away from other benefits that could be achieved by making L1 easier to verify, such as increasing the percentage of users running verifying nodes and increasing the number of solo stakers. Making specific operations cheaper (depending on how it is done) may increase the overall complexity of the EVM.
The big question that any L1 scaling roadmap needs to answer is: what is the ultimate vision for what belongs on L1 and what belongs on L2? Obviously, doing everything on L1 is absurd: the potential use cases run into hundreds of thousands of transactions per second, which would make L1 completely unverifiable (unless we go the native rollup route). But we do need some guiding principle, so that we do not create a situation where we increase the gas limit 10x, severely undermine the decentralization of Ethereum L1, and find that we have only reached a world where instead of 99% of activity being on L2, 90% of activity is on L2, so the result looks pretty much the same, except for an irreversible loss of much of what makes Ethereum L1 special.
A proposed view on the “division of labor” between L1 and L2
How does it interact with the rest of the roadmap?
Getting more users onto L1 means improving not only scaling but other aspects of L1 as well. It means more MEV will remain on L1 (rather than just becoming an L2 problem), so there is a greater urgency to deal with it explicitly. It greatly increases the value of fast slot times on L1. And it also depends heavily on verification of L1 ("the Verge") going smoothly.