This article provides insights into how to simplify blockchain architecture and improve network efficiency and sustainability without sacrificing performance.
Original text: Possible futures of the Ethereum protocol, part 5: The Purge (vitalik.eth)
Author: vitalik.eth
Compiled by: ChoeyGit, 183Aaros, LXDAO
Translator’s Preface
As a follower and student of blockchain technology and the Ethereum ecosystem, I was excited to see the "The Purge" plan proposed by Vitalik. It takes a distinctive approach: while much of the industry pursues functional expansion, it instead improves network efficiency by optimizing and simplifying the system architecture. The plan focuses on containing blockchain data bloat and reducing system complexity, in order to lower the barrier to participation and improve the long-term sustainability of the network.
As blockchain technology develops rapidly, the difficulties of ever-growing network data and an increasingly complex system architecture are becoming apparent. This is especially true today, when widely used Layer 2 solutions provide greater scalability but also add complexity to the system. Against this background, "The Purge" proposes a new direction of thinking.
Can this technical route slim down the protocol without affecting network performance? How should we strike a balance between simplification and functionality? The article below explores these questions in depth.
Overview
This article is about 10,000 words long, in 3 parts, and takes roughly 50 minutes to read.
- History expiry
- State expiry
- Feature cleanup
Content
The Possible Future of the Ethereum Protocol (V): The Purge
Special thanks to Justin Drake, Tim Beiko, Matt Garnett, Piper Merriam, Marius van der Wijden, and Tomasz Stanczak for their feedback and reviews.
Ethereum faces a challenge: by default, the bloat and complexity of blockchain protocols will grow over time. This is mainly reflected in two aspects:
Historical data: every transaction ever made and every account ever created needs to be stored permanently by all clients, and downloaded in full by any new client doing a full sync. This causes client load and sync time to keep growing even if the chain's processing capacity stays the same.
Protocol features: it is much easier to add a new feature than to remove an old one, so code complexity grows over time.
For the long-term sustainable development of Ethereum, we need a strong countervailing force against these two trends: reducing complexity and bloat over time. At the same time, we need to preserve a core property of blockchains: permanence. You can store an NFT, a love letter in transaction calldata, or a smart contract holding a million dollars on chain, hide in a cave for ten years, and come out to find it still there waiting for you to read and interact with. For dapps to feel comfortable fully decentralizing and removing their upgrade keys, they need to be confident that the components they depend on will not be upgraded in ways that break them, and that is especially true of L1 itself.
Finding a balance between maintaining continuity and minimizing or reversing data bloat, complexity, and decay is something we believe is absolutely achievable if we invest research effort into it. Organisms can do it: while most age over time, a lucky few do not. Even social systems can achieve extreme longevity. Ethereum has already succeeded in several cases: proof of work (PoW) is gone, the SELFDESTRUCT opcode has mostly disappeared, and beacon chain nodes now store only the last six months of historical data. Finding this path for Ethereum in a more general way, and moving toward an end state of long-term stability, is the ultimate challenge for its long-term scalability, technical sustainability, and even its security.
The Purge: Key Goals
- Lower storage requirements for clients, by reducing or eliminating the need for each node to permanently store all historical data, and potentially even reducing reliance on storing state data.
- Reduce protocol complexity by eliminating unnecessary features.
In this chapter
- History expiry
- State expiry
- Feature cleanup
History expiry
What problem are we trying to solve?
As of this writing, a fully synced Ethereum node requires about 1.1 TB of disk space for the execution client, plus a few hundred more GB for the consensus client. The vast majority of this is historical data: historical blocks, transactions, and receipts, much of which is many years old. This means that even if the gas limit does not increase at all, node size will still grow by hundreds of GB per year.
What is it and how is it done?
The history storage problem has a key simplifying property: because each block points to the previous block by a hash (among other structures), consensus on the present is sufficient for consensus on history. As long as the network has consensus on the latest block, any historical block, transaction, or piece of state (account balance, nonce, code, storage) can be provided by any single participant together with a Merkle proof, and that proof allows anyone else to verify its correctness. While consensus is an N/2-of-N trust model, history is a 1-of-N trust model.
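To make the 1-of-N model concrete, here is a minimal sketch (in Python, using a plain binary Merkle tree rather than Ethereum's actual trie and SSZ structures) of how a single, untrusted peer can serve a historical item together with a proof that anyone can verify against a root they already agree on:

```python
import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    # Build a simple binary Merkle tree (assumes len(leaves) is a power of two).
    layer = [h(leaf) for leaf in leaves]
    while len(layer) > 1:
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]

def merkle_proof(leaves: list[bytes], index: int) -> list[bytes]:
    # Collect the sibling hashes along the path from the leaf up to the root.
    layer = [h(leaf) for leaf in leaves]
    proof = []
    while len(layer) > 1:
        proof.append(layer[index ^ 1])
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
        index //= 2
    return proof

def verify(root: bytes, leaf: bytes, index: int, proof: list[bytes]) -> bool:
    # Anyone holding only the agreed-upon root can check a claimed leaf.
    acc = h(leaf)
    for sibling in proof:
        acc = h(acc + sibling) if index % 2 == 0 else h(sibling + acc)
        index //= 2
    return acc == root

blocks = [f"block {i}".encode() for i in range(8)]   # stand-in for historical data
root = merkle_root(blocks)                           # everyone agrees on this
proof = merkle_proof(blocks, 5)                      # served by any single peer
assert verify(root, b"block 5", 5, proof)            # verified without trusting the peer
```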
This gives us many options for how to store historical data. A natural choice is to have each node in the network store only a small portion of the data. This is like how torrent networks have operated for decades: while the entire network stores and distributes millions of files, each participant only stores and distributes a small portion of them. Perhaps counterintuitively, this approach does not necessarily reduce the robustness of the data. If, by reducing the cost of running nodes, we are able to achieve a network with 100,000 nodes, where each node randomly stores 10% of the historical data, then each piece of data will be replicated 10,000 times - this has exactly the same replication factor as a network with 10,000 nodes and each node stores all the data.
Today, Ethereum has begun to move away from a model where all nodes permanently store all historical data. Consensus blocks (i.e. the part related to proof-of-stake consensus) are only stored for about 6 months. Blobs are only stored for about 18 days. The EIP-4444 proposal aims to introduce a one-year storage period for historical blocks and receipts. The long-term goal is to establish a coordinated storage period (probably about 18 days) during which each node is responsible for storing all data, and then store older data in a distributed manner through a peer-to-peer network of Ethereum nodes.
Erasure codes can improve data robustness while maintaining the same replication factor. In fact, Blobs already use erasure codes to support data availability sampling. The simplest solution may be to reuse this erasure code technology and put the block data of the execution layer and consensus layer into Blobs as well.
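As a rough illustration of the erasure-coding idea (a toy Reed-Solomon-style code over a small prime field, not the actual KZG-based construction used for blobs), the sketch below treats the k original chunks as evaluations of a degree-(k-1) polynomial and extends them to n chunks, so that any k of the n are enough to recover the data:

```python
# Toy Reed-Solomon-style erasure code. Real blob erasure coding uses the
# BLS12-381 scalar field with KZG commitments; this only illustrates the
# property that any k of the n published chunks recover the original data.
P = 2**61 - 1  # a Mersenne prime, used as a toy field modulus

def interpolate(points, x):
    # Lagrange interpolation: evaluate, at x, the unique degree-(k-1)
    # polynomial passing through the given k points (all arithmetic mod P).
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def extend(chunks, n):
    # Treat the k original chunks as evaluations at x = 0..k-1 and extend
    # them to n evaluations; the extra n-k chunks are pure redundancy.
    pts = list(enumerate(chunks))
    return [interpolate(pts, x) for x in range(n)]

def recover(available, k):
    # Given any k surviving (index, value) pairs, rebuild the original chunks.
    return [interpolate(available[:k], x) for x in range(k)]

data = [12, 34, 56, 78]                   # k = 4 original chunks
coded = extend(data, 8)                   # n = 8 chunks spread across the network
survivors = [(1, coded[1]), (3, coded[3]), (6, coded[6]), (7, coded[7])]
assert recover(survivors, 4) == data      # any 4 of the 8 are enough
```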
What is the connection with current research?
- EIP-4444 Proposal: https://eips.ethereum.org/EIPS/eip-4444
- Torrent networks and EIP-4444 proposal: https://ethresear.ch/t/torrents-and-eip-4444/19788
- Portal Network: https://ethereum.org/en/developers/docs/networking-layer/portal-network/
- Portal Network and EIP-4444 proposal: https://github.com/ethereum/portal-network-specs/issues/308
- Distributed storage and retrieval of SSZ objects for Portal: https://ethresear.ch/t/distributed-storage-and-cryptographically-secured-retrieval-of-ssz-objects-for-portal-network/19575
- How to raise the gas fee limit (Paradigm): https://www.paradigm.xyz/2024/05/how-to-raise-the-gas-limit-2
What else is there to do, what trade-offs need to be made?
The main remaining work involves building and integrating a concrete distributed history storage solution - at least for execution layer history, and eventually also for consensus layer data and blobs. The two simplest options are (i) directly importing an existing torrent library, and (ii) an Ethereum-native solution called the Portal Network. Once either of these is in place, we can enable EIP-4444. EIP-4444 itself does not require a hard fork, but it does require a new network protocol version. It is therefore valuable to enable it on all clients at the same time; otherwise, clients risk failing when they connect to peers expecting to download the full history that those peers cannot actually provide.
The main trade-off is how hard we try to ensure the availability of "ancient" historical data. The simplest solution is to stop storing "ancient" historical data starting tomorrow and rely on existing archive nodes and various centralized service providers to replicate data. This is easy to do, but it will weaken Ethereum's position as a permanent record storage place. The more difficult but safer path is to first build and integrate torrent networks to store historical data in a distributed manner. There are two dimensions to "how hard we try" here:
- How much effort do we need to put in to ensure that the maximum number of nodes actually stores all the data?
- To what extent should we integrate historical data storage into the protocol?
For dimension (1), an extremely rigorous approach would involve proof of custody: actually requiring each proof-of-stake validator to store a certain percentage of the historical data, and regularly checking, cryptographically, that they are doing so. A more moderate approach is to set a voluntary standard under which each client stores some percentage of the history.
For dimension (2), the basic implementation is to directly adopt existing results: Portal already stores ERA files containing the complete history of Ethereum. A more comprehensive implementation would be to actually connect it to the synchronization process, so that if someone wants to sync a node or archive node that stores the full history, they can sync directly from the Portal network even if there are no other archive nodes online.
How does it interact with the rest of the roadmap?
If we want to make it extremely easy to run or start a node, then reducing historical data storage requirements is arguably more important than statelessness: of the 1.1 TB of storage required for a node, state data takes up about 300 GB, while historical data takes up the remaining ~800 GB. The vision of having an Ethereum node running on a smartwatch and being able to be set up in just a few minutes is only possible if both statelessness and EIP-4444 are implemented.
Limiting the storage of historical data makes it feasible for newer Ethereum nodes to only support the latest version of the protocol, which also makes node implementations simpler. For example, since the empty storage slots created during the 2016 DoS attack have all been removed, many of the related lines of code can now be safely removed. Similarly, now that the switch to Proof of Stake (PoS) is "ancient" history, clients can now safely remove all Proof of Work (PoW) related code.
State expiry
What problem does it solve?
Even if we remove the need for clients to store historical data, client storage requirements will continue to grow, by about 50 GB per year, because state data continues to grow : account balances and nonces, contract code, and contract storage. Users only pay a one-time cost to impose a permanent storage burden on current and future Ethereum clients.
State data is harder to "expire" than historical data because the fundamental design of the EVM is built on the assumption that once a state object is created, it will exist forever and can be read by any transaction at any time. If we introduce statelessness, there is a view that this problem may not be so serious: only a certain class of block builders need to actually store state data, and all other nodes (even the generation of inclusion lists!) can operate in a stateless manner. However, another view is that we should not rely too much on statelessness, and we may still need state expiration in the end to ensure the decentralization of Ethereum.
What is it and how is it done?
Today, when you create a new state object (which can happen in one of three ways: (i) sending ETH to a new account, (ii) creating a new account with code, or (iii) setting a previously untouched storage slot), that state object stays in the state forever. What we want instead is for objects to expire automatically over time. To do this, we need to achieve three goals:
- Efficiency: running the expiration process should not require large amounts of extra computation
- User-friendliness: if someone goes into a cave and comes back five years later, they should not lose access to their ETH, ERC20 tokens, NFTs, CDP positions, etc.
- Developer-friendliness: developers should not be forced to switch to a completely foreign mental model. In addition, applications that are fixed today and no longer updated should continue to work normally.
If you don't consider meeting these goals, it's actually easy to solve the problem. For example, you could have each state object store an expiration date counter (which can be extended by destroying ETH, and this extension can be done automatically on each read and write), and set up a process that loops through the state to delete expired state objects. However, this introduces additional computation (and even increases storage requirements), and is obviously not user-friendly. It is also difficult for developers to handle extreme cases where stored values are sometimes reset to zero. If the expiration timer is set at the contract level, this does make it easier for developers technically, but it brings more difficulties in terms of economics: developers need to consider how to "pass on" the ongoing storage costs to their users.
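To see why, here is a toy version of that naive expiry-counter approach (structure, names, and parameters are purely illustrative, not any actual client design): every access refreshes a per-object timer, a sweep pass deletes whatever has lapsed, and anything untouched for too long simply disappears, which is exactly the "user in a cave" problem.

```python
# Toy "expiry counter" state. Note the extra bookkeeping on every access, the
# sweep over state every block, and the fact that untouched objects vanish.
RENEWAL_PERIOD = 100  # blocks an object stays alive after each access (arbitrary)

class State:
    def __init__(self):
        self.objects = {}     # key -> value
        self.expires_at = {}  # key -> block number at which the object lapses

    def write(self, key, value, now):
        self.objects[key] = value
        self.expires_at[key] = now + RENEWAL_PERIOD  # renewal piggybacks on access

    def read(self, key, now):
        if key not in self.objects:
            return None       # may have expired: from the caller's view it is just gone
        self.expires_at[key] = now + RENEWAL_PERIOD
        return self.objects[key]

    def sweep(self, now):
        # Extra work every block: loop over state and drop lapsed objects.
        for key in [k for k, t in self.expires_at.items() if t <= now]:
            del self.objects[key]
            del self.expires_at[key]

state = State()
state.write("alice/balance", 100, now=1)
state.sweep(now=200)                                  # untouched for RENEWAL_PERIOD blocks
assert state.read("alice/balance", now=200) is None   # the "cave" problem
```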
These issues have plagued the Ethereum core development community for many years, producing proposals such as "blockchain rent" and "regenesis". In the end, we took the best parts of these proposals and converged on two categories of "least-bad known solutions":
- Local state expiration proposals
- Address-period-based state expiration proposals
Local state expiration
All local state expiration proposals follow the same principle. We split the state data into multiple data blocks. Everyone permanently stores a "top-level mapping table" that marks which data blocks are empty or non-empty. The data in each data block is only stored if it has been recently accessed. There is also an "activation" mechanism: if a data block is no longer stored, anyone can restore it by providing proof of the data.
The main differences between these proposals are: (i) how "recent" is defined, and (ii) how a "data block" is defined. One specific proposal is EIP-7736, which is based on the "stem-and-leaf" design introduced for Verkle trees (although it is compatible with any form of statelessness, such as binary trees). In this design, adjacent headers, code, and storage slots are stored under the same "stem". The data stored under each stem is at most 256 * 31 = 7,936 bytes. In many cases, the entire header, code, and many key storage slots of an account are stored under the same stem. If the data under a stem has not been read or written within 6 months, the data is no longer stored, and instead only a 32-byte commitment ("stub") is stored. Future transactions that access this data need to "activate" the data and provide a proof that can be checked against the stub.
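A minimal sketch of the stem-level mechanism just described (following the prose above, not the exact EIP-7736 encoding or commitment scheme): once a stem has gone unused for longer than the expiry window, only a 32-byte commitment is kept, and a later transaction must supply data matching that commitment to revive it.

```python
import hashlib

EXPIRY_EPOCHS = 6  # "roughly six months", measured here in abstract epochs

def commit(data: bytes) -> bytes:
    # Stand-in for the real cryptographic commitment to a stem's 256 leaves.
    return hashlib.sha256(data).digest()

class Stem:
    def __init__(self, data: bytes, epoch: int):
        self.data = data          # up to 256 * 31 bytes of leaves under this stem
        self.stub = None          # 32-byte commitment kept after expiry
        self.last_access = epoch

    def touch(self, epoch: int):
        self.last_access = epoch

    def maybe_expire(self, epoch: int):
        if self.data is not None and epoch - self.last_access >= EXPIRY_EPOCHS:
            self.stub = commit(self.data)  # keep only 32 bytes
            self.data = None

    def resurrect(self, provided: bytes, epoch: int):
        # A later transaction supplies the full data, which is accepted only if
        # it matches the retained 32-byte stub; the stem then becomes live again.
        assert self.data is None, "stem is not expired"
        if commit(provided) != self.stub:
            raise ValueError("resurrection data does not match the stub")
        self.data, self.stub = provided, None
        self.last_access = epoch

stem = Stem(b"account header + code + hot storage slots", epoch=0)
stem.maybe_expire(epoch=7)       # untouched for more than EXPIRY_EPOCHS
assert stem.data is None         # only the 32-byte stub remains
stem.resurrect(b"account header + code + hot storage slots", epoch=7)
assert stem.data is not None     # revived with matching data
```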
There are other ways to implement similar ideas. For example, if the account level is not granular enough, we can design a scheme where each 1/2^32 portion of the state tree is managed by a similar stem-and-leaf mechanism.
But this approach makes the incentive mechanism more difficult to implement: an attacker could force clients to store large amounts of state permanently by storing a large amount of data in a single child state tree, and then sending a transaction once a year to "update the state tree". If the update cost is set to be proportional to the size of the state tree (or the update duration is inversely proportional to the size of the tree), then an attacker could harass other users by storing large amounts of data in the same child state tree. We can try to limit both of these problems by having a dynamic granularity based on the size of the child state tree: for example, every consecutive 2^16 = 65536 state objects can be considered a "group". However, these ideas are more complicated; in contrast, the stem-based approach is simple and can align incentives well, because usually all data under a stem is related to the same application or user.
Address-period-based state expiration
What if we want to avoid permanent state growth entirely, and don't even want to keep a 32-byte stub? This is a tricky problem because of activation conflicts: suppose a state object is removed, and a subsequent EVM execution places another state object in the exact same place, and someone who cares about the original state object comes back and tries to restore it, what happens? In the partial state expiration scheme, the "stub" prevents new data from being created. But in the full state expiration scheme, we can't even store the stub.
The address-period-based design is the best known solution to this problem. Instead of using one state tree to store the entire state, we maintain a continuously growing list of state trees, and any state that is read or written is saved into the most recent tree. A new empty state tree is added once per period (for example, one year). Older state trees are completely frozen. Full nodes only need to store the two most recent trees. If a state object has not been touched for two periods and therefore falls into an expired tree, it can still be read or written, but the transaction must provide a Merkle proof for it - once the proof checks out, a copy of the state is saved into the latest tree again.
A key concept that makes all of this user- and developer-friendly is the address period. An address period is a number that forms part of an address. A key rule is that an address with address period N can only be read or written during or after period N (i.e. once the list of state trees has reached length N). If you are saving a new state object (e.g. a new contract or a new ERC20 balance), just make sure to put it into a contract whose address period is N or N-1, and it can be saved immediately without providing proof that the slot was previously empty. Any additions or edits to state in older address periods, on the other hand, require a proof.
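The sketch below is a toy model of these rules (the period length, the proof format, and where the period number sits in the address are all illustrative): writes involving recent address periods go straight into the latest tree, while touching state from older periods requires a proof before a fresh copy lands in the latest tree.

```python
class PeriodedState:
    def __init__(self):
        self.tries = [{}]          # one tree per period; only the last one is writable

    def new_period(self):
        self.tries.append({})      # e.g. once a year; older trees freeze

    @property
    def current_period(self):
        return len(self.tries) - 1

    def write(self, address_period, key, value, proof=None):
        if address_period > self.current_period:
            raise ValueError("a period-N address is only usable from period N onward")
        if address_period >= self.current_period - 1:
            # Recent address period: no proof needed, write into the latest tree.
            self.tries[-1][key] = value
            return
        # Old address period: the caller must prove the prior value (or its
        # absence) against the frozen tree before a fresh copy is written.
        if proof is None:
            raise ValueError("writes touching expired periods require a proof")
        self.tries[-1][key] = value

state = PeriodedState()
state.write(0, "alice/balance", 100)          # period 0 address during period 0: fine
state.new_period(); state.new_period()        # two periods pass; tree 0 is frozen
try:
    state.write(0, "alice/balance", 150)      # old period, no proof -> rejected
except ValueError:
    pass
state.write(0, "alice/balance", 150, proof="merkle-proof-against-frozen-tree")
```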
This design retains most of Ethereum's current properties, requires very little extra computation, and allows applications to be written almost exactly as they are today (though ERC20 contracts would need to be rewritten so that balances of addresses with address period N are stored in a child contract that itself has address period N), and it solves the "user goes into a cave for five years" problem. However, it has one big problem: addresses need to be extended beyond 20 bytes to fit the address period.
Address space extension
One proposal is to introduce a new 32-byte address format that contains a version number, an address period number, and an extended hash value.
In the color-coded example address from the original post: the red part is the version number, the four orange zeros are reserved space that could hold a shard number in the future, the green part is the address period number, and the blue part is the 26-byte hash.
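Below is an illustrative packing of such a 32-byte address. The field widths used here (a 1-byte version, 2 reserved zero bytes, a 3-byte address period, and a 26-byte hash) are assumptions made for this sketch, not a finalized encoding.

```python
import hashlib

def make_address(version: int, period: int, preimage: bytes) -> bytes:
    # Assumed layout: [1-byte version][2 reserved zero bytes][3-byte period][26-byte hash]
    reserved = b"\x00\x00"                              # space for e.g. a future shard number
    hash26 = hashlib.sha256(preimage).digest()[:26]     # the "extended" 26-byte hash
    addr = version.to_bytes(1, "big") + reserved + period.to_bytes(3, "big") + hash26
    assert len(addr) == 32
    return addr

def address_period(addr: bytes) -> int:
    # The period is readable directly from the address, which is what lets the
    # protocol enforce "period-N addresses are only usable from period N onward".
    return int.from_bytes(addr[3:6], "big")

addr = make_address(version=1, period=7, preimage=b"creator || salt || init code hash")
print(addr.hex())
print(address_period(addr))   # -> 7
```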
The key challenge here is backward compatibility. Existing contracts are designed based on 20-byte addresses, and often use compact byte packing techniques that explicitly assume that addresses are exactly 20 bytes long. One idea to solve this problem is to use a conversion map so that old contracts that interact with new addresses see the 20-byte hash of the new address. However, there are many complex issues involved in ensuring the security of this approach.
Address space shrinkage
Another approach does the opposite: immediately ban some sub-range of addresses of size 2^128 (e.g. all addresses starting with 0xffffffff), and then use that range to introduce addresses that contain an address period and a 14-byte hash.
The main cost of this approach is that it introduces security risks for counterfactual addresses : addresses that hold assets or permissions but whose code has not yet been published to the chain. The risk is that someone could create an address claiming to own a piece of (unpublished) code, but there is another valid piece of code that hashes to the same address. Currently, calculating such a collision requires 2^80 hash operations; address space shrinkage would reduce this number to a very achievable 2^56 hash operations.
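The 2^80 and 2^56 figures come from the birthday bound: finding two inputs that map to the same address takes roughly 2^(bits/2) hash evaluations, where bits is the width of the hash portion of the address.

```python
import math

def birthday_collision_work(hash_bytes: int) -> float:
    # By the birthday bound, finding two different preimages that map to the
    # same address takes roughly 2^(bits/2) hash evaluations.
    return 2 ** (hash_bytes * 8 / 2)

print(math.log2(birthday_collision_work(20)))  # 20-byte addresses today      -> 80.0
print(math.log2(birthday_collision_work(14)))  # 14-byte hash after shrinking -> 56.0
```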
The main risk area is counterfactual-address wallets that are not held by a single owner, a relatively rare situation today that may become more common as we enter a multi-L2 world. The only real solution is to accept this risk, identify all the common use cases where it could be a problem, and come up with effective workarounds.
What are the connections with existing research?
Early Proposals
- Blockchain rent: https://github.com/ethereum/EIPs/issues/35
- Activate Regenesis: https://ethresear.ch/t/regenesis-resetting-ethereum-to-reduce-the-burden-of-large-blockchain-and-state/7582
Ethereum state size management theory: https://hackmd.io/@vbuterin/state_size_management
Several possible paths to statelessness and state expiry: https://hackmd.io/@vbuterin/state_expiry_paths
Some state expiry proposals
- EIP-7736: https://eips.ethereum.org/EIPS/eip-7736
Address space extension documents
- Original proposal: https://ethereum-magicians.org/t/increasing-address-size-from-20-to-32-bytes/5485
- Ipsilon review: https://notes.ethereum.org/@ipsilon/address-space-extension-exploration
- Blog post review: https://medium.com/@chaisomsri96/statelessness-series-part2-ase-address-space-extension-60626544b8e6
What would break if we lose address collision resistance: https://ethresear.ch/t/what-would-break-if-we-lose-address-collision-resistance/11356
What else is there to do, and what trade-offs need to be made?
I see four possible paths forward:
Go stateless without introducing state expiration mechanisms . The state data will continue to grow (albeit slowly: it may not exceed 8 TB for decades), but only needs to be held by a relatively specialized group of users: not even proof-of-stake (PoS) validators need to hold state. The only function that requires access to part of the state is to generate inclusion lists , but we can achieve this in a decentralized way: each user is responsible for maintaining the part of the state tree that contains their own accounts. When a user broadcasts a transaction, the proof of the state objects accessed in the verification step is also broadcast (this applies to externally owned accounts EOA and ERC-4337 accounts). Stateless validators can then combine these proofs into a proof of the complete inclusion list.
Implementing a local state expiration mechanism that accepts a significantly reduced but still non-zero permanent state size growth rate. This result is arguably similar to the case of the history expiration proposal involving peer-to-peer networks: each client needs to store a lower but fixed fraction of the history, accepting a reduced but non-zero permanent history storage growth rate.
Implement state expiration together with address space extension. This will be a multi-year process, needed to ensure that the address format conversion approach is feasible and safe, including for existing applications.
Implement state expiration together with address space contraction. This will be a multi-year process, needed to ensure that all the security risks around address collisions are handled, including in cross-chain scenarios.
An important point is that these thorny problems around address space expansion and contraction will eventually need to be solved, regardless of whether or not a state expiration scheme that relies on address format changes is implemented . Currently, generating an address collision requires about 2^80 hash operations, a computational load that is feasible for extremely well-resourced actors: a GPU can perform about 2^27 hash operations per second, so 2^52 operations in a year, so about 2^30 GPUs worldwide can calculate a collision in about 1/4 of a year, and FPGAs and ASICs can speed this up further. In the future, this attack will be open to more and more people. Therefore, the actual cost of implementing full state expiration may not be as high as it seems, because we have to solve this extremely challenging address problem anyway.
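The arithmetic above can be checked directly (the inputs are the rough, order-of-magnitude estimates given in the text):

```python
import math

hashes_per_gpu_per_sec = 2**27          # rough per-GPU hash rate from the text
seconds_per_year = 365 * 24 * 3600
gpus_worldwide = 2**30                  # rough estimate from the text

hashes_per_gpu_per_year = hashes_per_gpu_per_sec * seconds_per_year
print(math.log2(hashes_per_gpu_per_year))   # ~51.9, i.e. about 2^52 per GPU per year

years_to_collision = 2**80 / (hashes_per_gpu_per_year * gpus_worldwide)
print(years_to_collision)                   # ~0.27, i.e. roughly a quarter of a year
```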
How does it interact with the rest of the roadmap?
Implementing state expiration may make transitions between state tree formats easier, because no in-place transition procedure would be needed: you could simply start creating new trees in the new format, and later do a hard fork to convert the old trees. So while state expiration is complex, it does bring benefits that simplify other parts of the roadmap.
Feature cleanup
What problem are we trying to solve?
One of the key prerequisites for security, accessibility, and trusted neutrality is simplicity. If a protocol is elegant and simple, it is less likely to have vulnerabilities. This also increases the likelihood that new developers will be able to come in and work on any part of it. At the same time, this simplicity makes the protocol more likely to be fair and more resistant to special interests. Unfortunately, protocols, like any social system, tend to become more complex over time by default. If we don’t want Ethereum to fall into a black hole of increasing complexity, we need to do one of two things: (i) stop making changes and let the protocol solidify , or (ii) be able to actually remove some features and reduce complexity . A middle path is also possible: make fewer changes to the protocol while gradually reducing some complexity over time. This section discusses how to reduce or remove complexity.
What is it and how is it done ?
There is no single large solution that will reduce protocol complexity; the nature of the problem is that there are many small fixes.
An example that is almost complete and can serve as a reference for handling other similar situations is the removal of the SELFDESTRUCT opcode. The SELFDESTRUCT opcode was the only opcode that could modify an unlimited number of storage slots in a single block, requiring clients to implement more complexity to avoid DoS attacks. The original purpose of this opcode was to implement voluntary state cleanup so that the state data size could be reduced over time. But in practice few people ended up using it. In the Dencun hard fork, the opcode was weakened to only allow self-destruction of accounts created in the same transaction. This solved the DoS problem and significantly simplified the client code. In the future, it may be reasonable to remove this opcode completely.
Some of the key opportunities for protocol simplification that have been identified include the following. First, some examples outside the EVM; these changes are relatively non-invasive, and therefore easier to reach consensus on and implement in the short term.
- RLP to SSZ conversion: originally, Ethereum objects were serialized using an encoding called RLP. RLP is untyped and needlessly complex. Today, the beacon chain uses SSZ, which is significantly better in many ways, supporting not only serialization but also hashing. The hope is to eventually abandon RLP entirely and move all data types to SSZ structures, which in turn would make upgrades much easier. The relevant EIPs are EIP-6465, EIP-6466, and EIP-6493 (linked under "SSZ-related EIPs" below); a minimal RLP encoding sketch follows this list.
- Removing legacy transaction types : There are currently too many transaction types, many of which can be removed. A more modest alternative to removing them completely is to introduce an account abstraction feature that allows smart accounts to optionally include code for handling and validating legacy transactions.
- Log transformation : Logs create bloom filters and other logic that add complexity to the protocol, but are too slow for clients to actually use. We can remove these features and instead invest in developing alternatives, such as out-of-protocol decentralized log reading tools using modern techniques like SNARKs.
- The Beacon Chain eventually removes the Sync Committee mechanism : The Sync Committee mechanism was originally introduced to enable light client verification in Ethereum. However, it adds significant complexity to the protocol. Eventually, we will be able to verify the Ethereum consensus layer directly using SNARKs, which will no longer require a specialized light client verification protocol. The consensus change may allow us to remove the Sync Committee earlier by creating a more "native" light client protocol that involves verifying the signatures of a random subset of Ethereum consensus validators.
- Data format unification : Currently, execution state is stored in Merkle Patricia trees, consensus state is stored in SSZ trees, and blobs are committed using KZG commitments. In the future, it would make sense to have unified formats for block data and state data respectively. These formats would meet all important requirements: (i) simple proofs for stateless clients, (ii) serialization and erasure coding of data, and (iii) standardized data structures.
- Removing the Beacon Chain committees: this mechanism was originally introduced to support a particular version of sharding. Instead, we ended up doing sharding through L2s and blobs. As a result, committees are unnecessary, and work is underway to remove them.
- Removing mixed endianness: the EVM is big-endian, while the consensus layer is little-endian. It may make sense to unify everything on one or the other (probably big-endian, since the EVM is harder to change).
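To make the RLP point above concrete, here is a minimal RLP encoder covering the common cases. Note that every value is just a byte string or a nested list of byte strings: the encoding carries no type information, so typed containers, hashing, and proofs all have to be layered on separately, which is what SSZ provides natively.

```python
def rlp_encode(item) -> bytes:
    # Minimal RLP encoder for byte strings and (nested) lists of byte strings.
    # The encoding itself says nothing about what the fields mean: it is untyped.
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item
        return _with_length(item, 0x80)
    if isinstance(item, list):
        payload = b"".join(rlp_encode(x) for x in item)
        return _with_length(payload, 0xC0)
    raise TypeError("RLP only knows byte strings and lists")

def _with_length(payload: bytes, offset: int) -> bytes:
    if len(payload) <= 55:
        return bytes([offset + len(payload)]) + payload
    length_bytes = len(payload).to_bytes((len(payload).bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes + payload

# A "transaction-like" structure is just an anonymous list of byte strings:
encoded = rlp_encode([b"\x09", b"\x07" * 20, (10**18).to_bytes(8, "big")])
print(encoded.hex())
```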
Now, let's look at some examples inside the EVM:
- Simplification of the gas mechanism: the current gas rules are not well optimized for limiting the resources needed to verify a block. Key examples include (i) storage read/write costs, which are meant to limit the number of reads and writes in a block but are currently rather arbitrary, and (ii) the memory expansion rules, which make it hard to estimate the maximum memory consumption of the EVM (a sketch of the current memory cost rule follows this list). Proposed fixes include the stateless gas cost changes, which unify all storage-related costs into one simple formula, and this proposal for memory pricing.
- Removal of precompiled contracts: many of Ethereum's precompiles are both needlessly complex and relatively unused, and they account for a large share of near-miss consensus failures while being used by almost no applications. There are two ways to deal with this: (i) simply remove the precompile, or (ii) replace it with EVM code that implements the same logic (inevitably at a higher gas cost). This draft EIP proposes doing this for the identity precompile as a first step; after that, RIPEMD-160, MODEXP, and BLAKE may become candidates for removal.
- Gas observability removal: make it so that EVM execution can no longer see how much gas it has left. This would break a few applications (most notably sponsored transactions), but would make future upgrades easier (for example, for more advanced versions of multi-dimensional gas). The EOF specification already makes gas unobservable, but for this to simplify the protocol, EOF would need to become mandatory.
- Optimization for static analysis : Current EVM code is difficult to statically analyze, especially because jumps can be dynamic. This also makes it harder to optimize EVM implementations (precompile EVM code into other languages). We can solve this problem by removing dynamic jumps (or making them more expensive, e.g. making the gas cost linear in the total number of JUMPDESTs in the contract). EOF already achieves this, but to gain the benefits of protocol simplification from it, EOF needs to be enforced.
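As a concrete example of the memory rule mentioned above, the EVM's current memory expansion cost is the quadratic formula sketched below. Because the cost grows only quadratically per call frame and each nested call gets fresh memory, bounding the worst-case memory a block can force a client to allocate requires reasoning about gas spent across many frames rather than reading off a simple per-resource limit.

```python
def memory_cost(words: int) -> int:
    # Total gas charged for having touched `words` 32-byte words of memory
    # (the EVM's current quadratic memory rule: 3*a + floor(a^2 / 512)).
    return 3 * words + words * words // 512

def expansion_cost(old_words: int, new_words: int) -> int:
    # Growing memory only charges the difference between the two totals.
    return memory_cost(new_words) - memory_cost(old_words)

# 1 MiB of memory in a single call frame:
words = (1024 * 1024) // 32
print(expansion_cost(0, words))   # about 2.2 million gas for this one frame...
# ...but each nested call gets fresh memory, so the worst-case total memory in a
# block depends on how gas is split across many frames, not on one simple cap.
```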
What are the connections with existing research?
- Next phase of the Purge: https://notes.ethereum.org/I_AIhySJTTCYau_adoy2TA
- Contract destruction: https://hackmd.io/@vbuterin/selfdestruct
- SSZ-related EIPs:
- https://eips.ethereum.org/EIPS/eip-6493
- https://eips.ethereum.org/EIPS/eip-6466
- https://eips.ethereum.org/EIPS/eip-6465
- Stateless gas cost changes: https://eips.ethereum.org/EIPS/eip-4762
- Current memory pricing: https://notes.ethereum.org/ljPtSqBgR2KNssu0YuRwXw
- Removal of precompiled contracts: https://notes.ethereum.org/IWtX22YMQde1K_fZ9psxIg
- Bloom filter removal: https://eips.ethereum.org/EIPS/eip-7668
- Method for off-chain security log retrieval using incremental verifiable computation (i.e. recursive STARKs): https://notes.ethereum.org/XZuqy8ZnT3KeG1PkZpeFXw
What else is there to do, and what trade-offs need to be made ?
The main trade-off in making these kinds of feature simplifications is between (i) how much and how fast we simplify and (ii) backward compatibility. Ethereum's value as a blockchain comes from being a platform where developers can deploy an application and be confident that it will still work many years later. But that ideal can also be taken too far; to paraphrase William Jennings Bryan, we should not "crucify Ethereum on a cross of backward compatibility". If there are only two applications in all of Ethereum that use a given feature, one of which has had no users for years while the other is almost entirely unused and secures a total of $57 of value, then we should just remove the feature and, if necessary, pay the affected users $57 out of pocket.
The broader social problem is how to establish a standardized process for making non-urgent, backwards-incompatible changes. One approach is to study and extend existing precedents, such as the process for SELFDESTRUCT (contract destruction). This process is roughly as follows:
- Step 1: Start a discussion about removing feature X
- Step 2: Perform an analysis to determine how removing X would affect the application, and based on the analysis, choose to: (i) abandon the idea, (ii) proceed as planned, or (iii) find a “least disruptive” modification to remove X and move forward
- Step 3: Propose a formal EIP to deprecate X. Ensure that popular upper-layer infrastructure (e.g. programming languages, wallets) follows this change and stops using the feature
- Step 4: Finally, actually remove X
Ideally, steps 1 through 4 run as a multi-year pipeline, with clear information about which step each item is currently in. At that point, there is a trade-off between how forceful and fast the feature removal pipeline is, versus being more conservative and putting more resources into other areas of protocol development; but we are still far from the Pareto frontier.
EOF
The EVM Object Format (EOF) is a proposed set of major changes to the EVM. EOF introduces many changes, such as disabling gas observability, disabling code observability (i.e. no CODECOPY), and allowing only static jumps. Its goal is to make it easier to upgrade the EVM with stronger guarantees, while preserving backward compatibility (because the pre-EOF EVM will continue to exist).
The advantage of this approach is that it provides a natural path for adding new EVM features and encourages migration to a stricter EVM with stronger guarantees. The disadvantage is that it will significantly increase protocol complexity unless we can find a way to eventually deprecate and remove the old EVM. An important question is: what role does EOF play in the EVM simplification proposal, especially when the overall goal is to reduce EVM complexity?
How does it interact with the rest of the roadmap?
Many of the "optimization" proposals in the roadmap are also opportunities to simplify older features. To repeat some of the examples above:
- Switching to single-slot finality gives us the opportunity to remove committees, restructure the economic model, and make other proof-of-stake related simplifications.
- Fully implementing account abstraction allows us to remove a lot of existing transaction processing logic by moving it into a piece of "default account EVM code" that all EOAs can be replaced by.
- If we migrate the Ethereum state to a binary hash tree, this can be reconciled with the new version of SSZ so that all Ethereum data structures are hashed in the same way.
Converting large parts of the protocol into contract code
A more radical strategy for simplifying Ethereum is to keep the protocol itself unchanged, but move large parts of it from protocol features into contract code.
The most extreme version of this approach is to have Ethereum L1 "technically" exist only as a beacon chain, while introducing a minimal virtual machine (such as RISC-V, Cairo, or some more simplified virtual machine dedicated to the proof system) so that anyone can create their own rollup. The EVM will then be transformed into the first of these rollups. Interestingly, this result is exactly the same as the execution environment proposal from 2019-20, but SNARKs make it much more likely to actually implement this solution.