Special thanks to Justin Drake, Hsiao-wei Wang, @antonttc, and Francesco for their feedback and review.
Originally, "The Merge" referred to the most important event in the history of the Ethereum protocol since its launch: the long-awaited and hard-won transition from Proof-of-Work to Proof-of-Stake. Today, Ethereum has been a stably running Proof-of-Stake system for nearly two years, and Proof-of-Stake has performed remarkably well in terms of stability, performance, and avoiding centralization risks. However, there are still some important areas in which Proof-of-Stake needs to improve.
My 2023 roadmap splits this into two parts: improving technical features, such as stability, performance, and accessibility for smaller validators, and economic changes to address centralization risks. The former inherited the title "The Merge", while the latter became part of "The Scourge".
This article will focus on the "Merge" part: what areas of the Proof-of-Stake technical design can be improved, and what paths can be taken to achieve these improvements?
This is not an exhaustive list of ways to improve Proof-of-Stake; rather, it is a list of ideas that are actively being considered.
Single-Slot Finality and Staking Democratization
What problem are we trying to solve?
Currently, it takes 2-3 epochs (about 15 minutes) to finalize a block, and 32 ETH is required to become a staker. This was initially a compromise made to balance three objectives:
· Maximize the number of validators that can participate in staking (which directly implies minimizing the minimum ETH required for staking)
· Minimize the time to finality
· Minimize the overhead for running a node
These three objectives are in tension with each other: to achieve economic finality (i.e., an attacker needs to destroy a large amount of ETH to revert finalized blocks), each validator needs to sign two messages every time finality is achieved. So if you have many validators, either it takes a long time to process all the signatures, or you need very powerful nodes to handle all the signatures concurrently.
Note that all of this is predicated on a key Ethereum objective: ensuring that even a successful attack imposes a high cost on the attacker. This is what the term "economic finality" means. If we didn't have this objective, we could solve the problem by randomly selecting a committee (as Algorand does) to finalize each slot. But the problem with this approach is that if an attacker does control 51% of validators, they can attack (revert finalized blocks, censor, or delay finality) at very low cost: only the portion of their nodes that sat in the committee could be detected as participating in the attack and penalized, whether through slashing or a minority soft fork. This means the attacker can attack the chain over and over again. So if we want economic finality, the naive committee-based approach doesn't work, and at first glance it does seem like we need full validator participation.
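To make the contrast concrete, here is a back-of-the-envelope sketch; the stake total, committee size, and slashing fraction are illustrative assumptions, not protocol values:

```python
# Illustrative comparison: economic finality with full participation vs. a
# small randomly selected committee. All numbers are assumptions for the sketch.
total_staked_eth = 33_000_000     # assumed total ETH staked
committee_frac = 0.001            # committee holds 0.1% of the stake (assumption)
attacker_share = 0.51             # attacker controls 51% of validators

# Full-participation finality: reverting a finalized block burns ~1/3 of all stake.
full_attack_cost = total_staked_eth / 3

# Committee-only finality: only the attacker's seats in one committee are
# slashable, so the attack can be repeated cheaply.
committee_attack_cost = total_staked_eth * committee_frac * attacker_share

print(f"full participation: ~{full_attack_cost:,.0f} ETH at risk per attack")
print(f"small committee:    ~{committee_attack_cost:,.0f} ETH at risk per attack")
```

Under these toy numbers the committee attacker risks thousands of ETH rather than millions, which is why the naive committee approach loses economic finality.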
Ideally, we want to retain economic finality while improving the status quo in two ways:
· Finalize blocks within a single slot (ideally, maintaining or even reducing the current 12-second length), rather than 15 minutes
· Allow validators to stake with 1 ETH (down from 32 ETH)
The first objective has two sub-objectives, both of which can be seen as "making Ethereum's properties more consistent with the properties of (more centralized) performance-focused L1 chains".
First, it ensures that all Ethereum users can benefit from the higher level of security provided by the finality mechanism. Currently, most users cannot enjoy this security because they are unwilling to wait 15 minutes; with single-slot finality, users can almost immediately see transactions finalized after confirmation. Second, if users and applications don't have to worry about the possibility of chain rollbacks (except for the relatively rare case of inactivity leaks), it simplifies the protocol and surrounding infrastructure.
The second objective is out of a desire to support solo stakers. Repeated polls have shown that the main factor preventing more people from solo staking is the 32 ETH minimum requirement. Lowering the minimum to 1 ETH would solve this problem, to the point where other issues become the main limiting factor for solo staking.
There is a challenge: the goals of faster finality and more democratized staking conflict with the objective of minimizing overhead. In fact, this is the very reason we didn't adopt single-slot finality from the start. However, recent research has proposed some possible ways to address this problem.
What is it and how does it work?
Single-slot finality involves using a consensus algorithm that finalizes blocks within a single slot. This is not an inherently difficult target to achieve: many algorithms (such as Tendermint Consensus) have already achieved this with optimal properties. A unique Ethereum desideratum that Tendermint doesn't support is inactivity leaks, which allow the chain to continue running and eventually recover even if more than 1/3 of the validators are offline. Fortunately, this desire has been met: there are already proposals to modify Tendermint-style consensus to accommodate inactivity leaks.
The most challenging part is figuring out how to make single-slot finality work at very high validator counts without leading to extremely high node operator overhead. There are several leading solutions for this:
Option 1: Brute force - work on achieving better signature aggregation protocols, possibly using ZK-SNARKs, which would actually allow us to handle signatures from millions of validators per slot.
Horn, one of the proposed designs for better aggregation protocols.
Option 2: Orbit committees - a new mechanism that allows randomly selected medium-sized committees to be responsible for finalizing the chain, but in a way that preserves the attack cost property we're looking for.
One way to think about Orbit SSF is that it opens up a tradeoff space, ranging from x=0 (Algorand-style committees, no economic finality) to x=1 (today's Ethereum), with intermediate points where Ethereum still has enough economic finality to be extremely secure, but we gain the efficiency advantage of only needing a medium-sized random sample of validators to participate in each slot.
Orbit leverages the pre-existing heterogeneity in validator deposit sizes to obtain as much economic finality as possible, while still giving smaller validators a corresponding role. Additionally, Orbit uses slow committee rotation to ensure a high degree of overlap between adjacent quorums, ensuring its economic finality still applies across committee rotations.
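As a toy illustration of the slow-rotation idea (committee size and per-slot rotation rate are made-up parameters): swapping only a small fraction of seats each slot keeps adjacent quorums almost identical, which is what lets economic finality carry across the rotation.

```python
# Toy model: slow committee rotation keeps adjacent quorums heavily overlapping.
# Committee size and per-slot rotation rate are illustrative assumptions.
committee_size = 1000
rotate_per_slot = 10              # only 1% of seats replaced each slot

current_committee = set(range(committee_size))
next_committee = set(range(rotate_per_slot, committee_size + rotate_per_slot))

overlap = len(current_committee & next_committee) / committee_size
print(f"overlap between adjacent committees: {overlap:.0%}")  # -> 99%
```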
Option 3: Two-tier staking - a mechanism where stakers are divided into two tiers, one with a higher deposit requirement and one with a lower deposit requirement. Only the higher-deposit tier will directly participate in providing economic finality. There are various proposals on what the rights and responsibilities of the lower-deposit tier should be (e.g., see the Rainbow staking post), common ideas include:
· The right to delegate their stake to higher-tier stakers
· A randomly sampled lower-tier staker being required to attest to, and being needed to finalize, each block
· The right to create inclusion lists
How does this relate to existing research?
· Paths to achieving single-slot finality (2022): https://notes.ethereum.org/@vbuterin/single_slot_finality
· Specific proposals for an Ethereum single-slot finality protocol (2023): https://eprint.iacr.org/2023/280
· Orbit SSF: https://ethresear.ch/t/orbit-ssf-solo-staking-friendly-validator-set-management-for-ssf/19928
· Further analysis of the Orbit style mechanism: https://notes.ethereum.org/@anderselowsson/Vorbit_SSF
· Horn, Signature Aggregation Protocol (2022): https://ethresear.ch/t/horn-collecting-signatures-for-faster-finality/14219
· Signature Merging for Large-Scale Consensus (2023): https://ethresear.ch/t/signature-merging-for-large-scale-consensus/17386?u=asn
· Signature Aggregation Protocol proposed by Khovratovich et al.: https://hackmd.io/@7dpNYqjKQGeYC7wMlPxHtQ/BykM3ggu0#/
· STARK-based Signature Aggregation (2022): https://hackmd.io/@vbuterin/stark_aggregation
· Rainbow staking: https://ethresear.ch/t/unbundling-staking-towards-rainbow-staking/18683
What is left to do? What tradeoffs need to be considered?
There are four main viable paths (we can also adopt a hybrid path):
· Maintain the status quo
· Orbit SSF
· Brute-force SSF
· SSF with two-layer staking
(1) means doing nothing and leaving staking as is, but this leaves Ethereum's security experience and staking centralization properties worse than they could otherwise be.
(2) avoids "high tech" and solves the problem by cleverly rethinking protocol assumptions: we relax the "economic finality" requirement, so that we still require attacks to be expensive, but accept that the attack cost may be ~10x lower than today (e.g., an attack cost of $2.5 billion instead of $25 billion). It is widely held that Ethereum's economic finality today far exceeds what it needs, and that its main security risks lie elsewhere, so this is arguably an acceptable sacrifice.
The main work is to verify that the Orbit mechanism is secure and has the properties we want, then fully formalize and implement it. Additionally, EIP-7251 (increasing the maximum effective balance) allows voluntary validator balance merging, which will immediately reduce chain validation overhead and serve as an effective initial stage for the rollout of Orbit.
(3) avoids the clever rethinking and instead brute-forces the problem with high tech. Doing so requires collecting a very large number of signatures (1 million+) in a very short time (5-10 seconds).
(4) avoids both the clever rethinking and the high tech, but it does create a two-tier staking system that still carries centralization risks. The risks depend largely on the specific rights granted to the lower tier. For example:
· If lower-tier stakers must delegate their attestation rights to higher-tier stakers, delegation may centralize, and we end up with two highly concentrated staking tiers.
· If a random sample of the lower tier is required to approve each block, then an attacker can spend a tiny amount of ETH to block finality.
· If lower-tier stakers can only create inclusion lists, then the attestation tier may remain centralized, at which point a 51% attack on the attestation tier can censor the inclusion lists themselves.
Multiple strategies can be combined, for example:
· (1 + 2): add Orbit without implementing single-slot finality.
· (1 + 3): use brute-force techniques to reduce the minimum deposit size without implementing single-slot finality. The amount of aggregation needed is 64x less than in the pure (3) case, so the problem becomes easier.
· (2 + 3): implement Orbit SSF with conservative parameters (e.g., a 128k-validator committee instead of 8k or 32k), and use brute-force techniques to make it super-efficient.
· (1 + 4): add Rainbow staking without implementing single-slot finality.
How does it interact with other parts of the roadmap?
In addition to its other benefits, single-slot finality reduces the risk of certain types of multi-block MEV attacks. Furthermore, in a single-slot finality world, proposer-builder separation and other in-protocol block production pipelines would need to be designed differently.
The weakness of brute-force strategies is that they make it more difficult to shorten slot times.
Single Secret Leader Election
What problem are we trying to solve?
Today, which validator will propose the next block is known in advance. This creates a security vulnerability: an attacker can monitor the network, identify which validators correspond to which IP addresses, and launch a DoS attack on the validator just before they are about to propose a block.
What is it? How does it work?
The best way to solve the DoS problem is to hide which validator will produce the next block, at least until the block is actually produced. Note that if we remove the "single" requirement, this is easy: one solution is to let anyone create the next block, but require the randao reveal to be less than 2^256/N. On average, only one validator will be able to meet this requirement, but sometimes there will be two or more, and sometimes none. Combining the "secret" requirement with the "single" requirement has long been the hard problem.
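The threshold-lottery idea above can be sketched as follows; this is a toy model, not the actual randao mechanism, and N and the hash choice are illustrative:

```python
import hashlib
import os

# Toy "non-single" leader lottery: anyone whose reveal hashes below 2**256 / N
# may propose. On average one of N validators qualifies, but sometimes zero
# or several do -- which is exactly why the "single" property is hard.
N = 1000
THRESHOLD = 2**256 // N

def eligible(reveal: bytes) -> bool:
    """A validator may propose iff the hash of its reveal is below the threshold."""
    return int.from_bytes(hashlib.sha256(reveal).digest(), "big") < THRESHOLD

winners = sum(eligible(os.urandom(32)) for _ in range(N))
print(f"eligible proposers this slot: {winners}")  # usually 1, sometimes 0 or 2+
```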
The Single Secret Leader Election protocol solves this problem by using some cryptographic techniques to create a "blinded" validator ID for each validator, and then allowing many proposers to have a chance to rearrange and re-blind the blinded ID pool (similar to how a mixing network works). In each epoch, a random blinded ID is selected. Only the owner of that blinded ID can generate a valid proof to propose a block, but no one knows which validator the blinded ID corresponds to.
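A heavily simplified sketch of the blind/re-blind idea follows. This is a toy, not Whisk: the group, parameter sizes, and function names are all illustrative, and it omits the shuffle proofs and zero-knowledge machinery that make the real protocol secure.

```python
import random

# Toy ElGamal-style "tracker": a blinded validator ID that anyone can
# re-randomize, but only the owner of the secret key can recognize.
P = 2**127 - 1          # a Mersenne prime; toy group, NOT production parameters
G = 3

def blind(pk: int, r: int) -> tuple[int, int]:
    # Tracker for pk = G^sk is (G^r, pk^r); the link to pk is hidden by r.
    return (pow(G, r, P), pow(pk, r, P))

def reblind(tracker: tuple[int, int], s: int) -> tuple[int, int]:
    # A shuffler re-randomizes by exponentiating both components; it needs
    # no knowledge of who owns the tracker.
    a, b = tracker
    return (pow(a, s, P), pow(b, s, P))

def owns(tracker: tuple[int, int], sk: int) -> bool:
    # Only the secret-key holder can check (and, in the real protocol, prove)
    # ownership: b should equal a^sk.
    a, b = tracker
    return pow(a, sk, P) == b

sk = random.randrange(2, P - 1)
tracker = blind(pow(G, sk, P), random.randrange(2, P - 1))
tracker = reblind(tracker, random.randrange(2, P - 1))
print(owns(tracker, sk))  # -> True: the owner still recognizes the re-blinded ID
```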
What are some existing research links?
· Dan Boneh's paper (2020): https://eprint.iacr.org/2020/025.pdf
· Whisk (a specific Ethereum proposal, 2022): https://ethresear.ch/t/whisk-a-practical-shuffle-based-ssle-protocol-for-ethereum/11763
· Single Secret Leader Election tag on ethresear.ch: https://ethresear.ch/tag/single-secret-leader-election
· Simplified SSLE using ring signatures: https://ethresear.ch/t/simplified-ssle/12315
What is left to do? What tradeoffs need to be considered?
Essentially, what's left is to find and implement a simple enough protocol so that we can easily deploy it on the mainnet. We value Ethereum being a relatively simple protocol, and we don't want complexity to increase further. We've seen SSLE implementations add hundreds of lines of specification code and introduce new assumptions in complex cryptography. Finding an efficient post-quantum SSLE implementation is also an open problem.
It may ultimately turn out that the "marginal additional complexity" of SSLE only becomes low enough when we introduce a general-purpose zero-knowledge proof mechanism on the L1 of the Ethereum protocol for other reasons (e.g., state tree, ZK-EVM).
Another option is to simply not bother with SSLE and use out-of-protocol mitigation measures (e.g., at the p2p layer) to address the DoS problem.
How does it interact with other parts of the roadmap?
If we add an attester-proposer separation (APS) mechanism, such as execution tickets, then execution blocks (i.e., blocks containing Ethereum transactions) would not need SSLE, as we could rely on dedicated block builders. However, for consensus blocks (i.e., blocks containing protocol messages, such as attestations, possibly parts of inclusion lists, etc.), we would still benefit from SSLE.
Faster Transaction Confirmation
What problem are we trying to solve?
Further reducing Ethereum's transaction confirmation time from 12 seconds to 4 seconds would be valuable. This would significantly improve the user experience on both L1 and based rollups, while making DeFi protocols more efficient. It would also make it easier for L2s to decentralize, as it would allow a large class of L2 applications to run on based rollups, reducing the need for L2s to build their own committee-based decentralized sequencing.
What is it? How does it work?
There are roughly two technical approaches here:
· Reducing the slot time, e.g., to 8 or 4 seconds. This does not necessarily mean 4-second finality: finality inherently requires three rounds of communication, but we could make each round of communication a block, giving at least a preliminary confirmation after 4 seconds.
· Allowing proposers to publish pre-confirmations mid-slot. In the extreme case, a proposer could include transactions it sees in its block in real time and immediately publish a pre-confirmation message for each ("My first transaction is 0x1234...", "My second transaction is 0x5678..."). The case of a proposer publishing two conflicting confirmations can be handled in two ways: (i) slashing the proposer, or (ii) using attesters to vote on which one came earlier.
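The slashing condition in option (i) reduces to detecting two conflicting commitments for the same slot and index. A minimal sketch follows; the message format and class names are invented for illustration, and real messages would of course be signed:

```python
# Sketch: detect a proposer equivocating on a pre-confirmation, i.e. committing
# to two different transactions at the same (slot, index). Names and message
# format are illustrative, and real messages would carry signatures.
from typing import Dict, Tuple

class PreconfLog:
    def __init__(self) -> None:
        self.seen: Dict[Tuple[int, int], str] = {}  # (slot, index) -> tx hash

    def record(self, slot: int, index: int, tx_hash: str) -> bool:
        """Record a pre-confirmation; return False if it conflicts with an
        earlier commitment (a slashable equivocation)."""
        key = (slot, index)
        if key in self.seen and self.seen[key] != tx_hash:
            return False
        self.seen[key] = tx_hash
        return True

log = PreconfLog()
log.record(100, 0, "0x1234")           # "my first transaction is 0x1234..."
log.record(100, 1, "0x5678")           # "my second transaction is 0x5678..."
print(log.record(100, 0, "0xdead"))    # -> False: conflicts with the first preconf
```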
What are the links to existing research?
· Based on pre-confirmations: https://ethresear.ch/t/based-preconfirmations/17353
· Protocol Enforced Proposer Commitments (PEPC): https://ethresear.ch/t/unbundling-pbs-towards-protocol-enforced-proposer-commitments-pepc/13879
· Staggered periods on parallel chains (2018 idea for low latency): https://ethresear.ch/t/staggered-periods/1793
What remains to be done, and what are the trade-offs?
The practicality of reducing slot times is still unclear. Even today, stakers in many parts of the world struggle to obtain attestations quickly enough. Attempting 4-second slot times risks centralizing the validator set and making it impractical to be a validator outside a few privileged regions due to latency.
The weakness of the proposer pre-confirmation approach is that it can greatly improve the average-case inclusion time, but not the worst-case inclusion time: if the current proposer is functioning well, your transaction will be pre-confirmed in 0.5 seconds instead of being included after (on average) 6 seconds, but if the current proposer is offline or performing poorly, you still have to wait up to a full 12 seconds for the next slot to start and bring a new proposer.
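The average-vs-worst-case gap can be made concrete with a small expected-value calculation; the proposer-liveness probability is an illustrative assumption:

```python
# Expected confirmation time with pre-confirmations vs. plain 12s slots.
# p_live is an assumed probability that the current proposer is responsive.
slot_seconds = 12.0
p_live = 0.99
preconf_latency = 0.5        # fast path from the text
fallback = slot_seconds      # wait out the slot for the next proposer

avg_without = slot_seconds / 2   # ~6s: transaction arrives uniformly within a slot
avg_with = p_live * preconf_latency + (1 - p_live) * fallback

print(f"average with preconfs: {avg_with:.2f}s (vs ~{avg_without:.0f}s), "
      f"worst case still ~{fallback:.0f}s")
```

The average improves by roughly 10x, but the worst case stays pinned at the slot length.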
Additionally, there is an open question of how to incentivize pre-confirmations. Proposers have an incentive to maximize their optionality for as long as possible. If validators sign off on the timeliness of pre-confirmations, then transaction senders could pay a premium for immediate pre-confirmation, but this would add an additional burden on validators and may make it harder for them to continue to act as a neutral "dumb pipe".
On the other hand, if we do not try this and keep the final finality time at 12 seconds (or longer), the ecosystem will put more emphasis on pre-confirmation mechanisms developed at Layer 2, and cross-Layer 2 interactions will take longer.
How does it interact with other parts of the roadmap?
Proposer-based pre-confirmations realistically depend on an attester-proposer separation (APS) mechanism, such as execution tickets. Otherwise, the pressure to provide real-time pre-confirmations may be too centralizing for regular validators.
Other Research Areas
51% Attack Recovery
It is commonly assumed that if a 51% attack occurs (including non-cryptographically provable attacks, such as censorship), the community will come together to implement a minority soft fork, ensuring the good guys win and the bad guys are slashed or become inactive. However, this degree of reliance on the social layer can be argued to be unhealthy. We can try to reduce the dependence on the social layer, making the recovery process as automated as possible.
Full automation is impossible, because if it were possible, it would amount to a >50% fault-tolerant consensus algorithm, and we already know the (very strict) mathematically provable limits of that class of algorithms. But we can achieve partial automation: for example, a client could automatically refuse to accept a chain as finalized, or even as the head of the fork choice, if that chain censors transactions the client has seen for long enough.
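The partial-automation idea could look something like the following client-side rule; the timeout value and data structures are illustrative assumptions, not a proposal:

```python
# Sketch: a client refuses to treat a chain as finalized if that chain has
# excluded a locally-seen transaction for too long. Timeout is an assumption.
CENSORSHIP_TIMEOUT = 4 * 32 * 12   # ~4 epochs in seconds (illustrative)

def accept_as_finalized(now: float, first_seen: dict, included: set) -> bool:
    """Return False if any transaction this client has seen has been kept
    out of the chain for longer than the timeout."""
    for tx_hash, seen_at in first_seen.items():
        if tx_hash not in included and now - seen_at > CENSORSHIP_TIMEOUT:
            return False
    return True

first_seen = {"0xaaa": 0.0, "0xbbb": 100.0}
print(accept_as_finalized(2000.0, first_seen, included={"0xbbb"}))  # -> False
```

Since the rule depends on each client's local view of the mempool, different clients may disagree near the timeout boundary, which is one reason full automation is out of reach.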
Increasing Quorum Threshold
Today, a block is finalized if 67% of stakers support it. Some argue this is too aggressive. In Ethereum's entire history, there has only been one (very brief) finality failure. If this percentage were increased to 80%, the number of additional non-finality periods would be relatively low, but Ethereum would gain security: in particular, many more contentious situations would lead to a temporary halt of finality. That seems much healthier than the "wrong side" winning immediately, whether the wrong side is an attacker or a buggy client.
This also answers the question of "what is the point of solo stakers". Today, most staked ETH is already in pools, and it seems very unlikely that solo stakers will ever reach 51% of staked ETH. However, getting solo stakers to a quorum-blocking minority seems achievable if we try, especially if the quorum is 80% (so a blocking minority needs only 21%). As long as solo stakers do not participate in a 51% attack (whether a finality reversal or censorship), such an attack would not get a "clean win", and solo stakers would be motivated to help organize a minority soft fork.
Quantum Resistance
Metaculus currently predicts, albeit with wide error bars, that quantum computers will likely start breaking cryptography at some point in the 2030s.
Quantum computing experts, such as Scott Aaronson, have also started to consider more seriously the possibility of quantum computers actually working in the medium term. This has implications for the entire Ethereum roadmap: it means that every part of the Ethereum protocol that currently relies on elliptic curve cryptography needs some kind of hash-based or other quantum-resistant alternative. This means in particular that we cannot assume we will be able to rely on the excellent properties of BLS aggregation to handle signatures from large validator sets forever. This justifies conservatism in the performance assumptions of proof-of-stake designs, and is also a reason to more proactively develop quantum-resistant alternatives.