The Sui blockchain recently suffered a temporary halt in block production. After the roughly two-and-a-half-hour outage, the Sui team published a report on the incident. A halt on a chain marketed for high performance inevitably recalls Solana's outages of previous years. The two chains differ greatly in programming language and architecture, but both are touted as high-performance public chains, and both have been criticized for not being decentralized enough.
Why did a segment of congestion control code trigger the collapse of all validators?
The report states that on November 21, 2024, from about 1:15 a.m. to 3:45 a.m. Pacific Time, the Sui mainnet was completely down. Every validator entered a crash loop, leaving the network unable to process any transactions. The incident is a reminder that chains built for high performance still need to pay close attention to stability.
According to the official statement, the outage was caused by an "assert!" in the Sui network's congestion control code that crashed the validators. Specifically, the crash occurs when both of the following conditions hold at the same time:
- Congestion control is enabled in TotalGasBudgetWithCap mode.
- A transaction arrives that takes a mutable shared object as input and contains no MoveCall instruction.
When such a transaction enters the network, all validators crash at the same time and the network comes to a standstill.
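To make the two conditions concrete, here is a minimal Rust sketch. Everything in it is hypothetical: the `Transaction` and `Command` types, the function names, and above all the invariant inside the `assert!` are stand-ins, since the report as summarized here does not say what the real assertion checked. It only illustrates why an `assert!` on an unexpected transaction shape aborts the process, and how returning an error value is one possible defensive alternative.

```rust
// Hypothetical types for illustration; these are NOT Sui's real data structures.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Command {
    MoveCall,
    TransferObjects,
}

struct Transaction {
    commands: Vec<Command>,
    // Number of mutable shared objects the transaction takes as input.
    mutable_shared_inputs: usize,
}

/// The two conditions from the incident report, written as a predicate.
fn matches_crash_conditions(total_gas_budget_with_cap: bool, tx: &Transaction) -> bool {
    let has_move_call = tx.commands.contains(&Command::MoveCall);
    total_gas_budget_with_cap && tx.mutable_shared_inputs > 0 && !has_move_call
}

/// Assumed invariant (a stand-in, not the real check): a transaction that
/// touches a mutable shared object contains at least one MoveCall.
/// `assert!` panics when the invariant does not hold.
fn move_call_count_asserting(tx: &Transaction) -> usize {
    let n = tx.commands.iter().filter(|c| **c == Command::MoveCall).count();
    assert!(
        tx.mutable_shared_inputs == 0 || n > 0,
        "unexpected transaction shape"
    );
    n
}

/// Defensive alternative: report the unexpected shape as an error value
/// so the caller can reject or special-case the transaction.
fn move_call_count_checked(tx: &Transaction) -> Result<usize, String> {
    let n = tx.commands.iter().filter(|c| **c == Command::MoveCall).count();
    if tx.mutable_shared_inputs > 0 && n == 0 {
        return Err("mutable shared input without MoveCall".to_string());
    }
    Ok(n)
}
```

If every validator runs the same asserting code path on the same transaction, every validator hits the same panic, which is consistent with the simultaneous crash described in the report.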
What is congestion control?
Sui's object-oriented architecture lets a large number of transactions be processed in parallel, which is how it achieves high performance. However, when multiple transactions write to the same shared object, they must still execute in sequence, so the processing speed of such transactions is limited. To avoid congestion caused by shared objects, Sui introduced a congestion control mechanism that limits the transaction rate on any single shared object. (Author's note: in an earlier offline reading group with XueDAO, the Sui Foundation described the logic as packaging causally related transactions and executing them together.)
Sui recently upgraded its congestion control system, introducing the TotalGasBudgetWithCap mode to assess transaction complexity more accurately. The bug behind this incident was in the code for that mode. The Sui team said that once the problem was found, it acted immediately and shipped a code fix (PR #20365) in mainnet v1.37.4 and testnet v1.38.1. The validator community responded very quickly, and the network recovered just 15 minutes after the fix was released.
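The idea of limiting the rate on a single shared object can be sketched as a per-object cost budget. This is a simplified illustration under stated assumptions, not Sui's implementation: `CongestionTracker`, `try_schedule`, the `u32` object IDs, and the notion of a fixed per-object limit per scheduling round are all hypothetical names and choices made for the example.

```rust
use std::collections::HashMap;

// Hypothetical object identifier; real object IDs are not small integers.
type ObjectId = u32;

/// Tracks how much execution cost has been scheduled against each
/// shared object in the current round (assumed model, for illustration).
struct CongestionTracker {
    per_object_limit: u64,
    used: HashMap<ObjectId, u64>,
}

impl CongestionTracker {
    fn new(per_object_limit: u64) -> Self {
        CongestionTracker { per_object_limit, used: HashMap::new() }
    }

    /// Returns true if the transaction is scheduled, false if it is deferred
    /// because one of its shared objects would exceed the per-object limit.
    fn try_schedule(&mut self, shared_objects: &[ObjectId], cost: u64) -> bool {
        let fits = shared_objects.iter().all(|id| {
            let used = self.used.get(id).copied().unwrap_or(0);
            used.saturating_add(cost) <= self.per_object_limit
        });
        if fits {
            // Charge the cost to every shared object the transaction touches.
            for id in shared_objects {
                let entry = self.used.entry(*id).or_insert(0);
                *entry = entry.saturating_add(cost);
            }
        }
        fits
    }
}
```

In this sketch, a transaction that touches no shared objects is always scheduled, which mirrors the point that only writes to the same shared object need to be serialized. Two transactions on different shared objects do not compete for the same budget.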
Typus Protocol: Sui's block production halt is completely different from Solana's
Sui's halt inevitably invites comparison with Solana, or even with TON's outage this year. Kyrie, the CGO of the Sui DeFi protocol Typus, shared the team's view in a post on Twitter, stating plainly that this incident is completely different from Solana's halts. Solana's problem, he said, was network congestion leading to system collapse; solving it requires large-scale architectural improvements, so it is hard to fix fundamentally in the short term. Sui's problem this time was a specific technical bug that does not affect the system's basic architecture.
Kyrie said the outage was caused by an overflow in the calculation of transaction costs. Put simply, it is like a display that does not have enough digits: when the number grows too large, it rolls over to zero and starts counting again. In that situation the system gets stuck in an endless loop, which ultimately brings the entire network to a standstill.
When the computed value exceeded the storable range, the original code produced an incorrect result, causing the system to repeat the calculation endlessly. PR #20365 sets a correct limit on the calculation to prevent this. He also pointed out that the key to this incident is that the bug was in the program logic for calculating transaction costs, not in Sui's consensus mechanism or system architecture design, which explains why the fix could be so quick and direct.
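Kyrie's "not enough digits" analogy corresponds to integer overflow. The Rust sketch below illustrates only that analogy, using an 8-bit counter so the rollover is easy to see. It is not the code from PR #20365, and the function names and the `cap` parameter are hypothetical. It contrasts a sum that silently wraps around with two safer options from the Rust standard library: a saturating sum clamped to a cap, and a checked sum that reports overflow.

```rust
/// Sums costs in an 8-bit counter that rolls over past 255,
/// like a display with too few digits.
fn wrapping_total(costs: &[u8]) -> u8 {
    costs.iter().fold(0u8, |acc, c| acc.wrapping_add(*c))
}

/// Sums costs without rolling over, then clamps the result to `cap`
/// (a hypothetical upper limit chosen by the caller).
fn capped_total(costs: &[u8], cap: u8) -> u8 {
    costs
        .iter()
        .fold(0u8, |acc, c| acc.saturating_add(*c))
        .min(cap)
}

/// Sums costs and returns None as soon as the sum would overflow,
/// so the caller has to handle the case explicitly.
fn checked_total(costs: &[u8]) -> Option<u8> {
    costs.iter().try_fold(0u8, |acc, c| acc.checked_add(*c))
}
```

For the inputs 200 and 100, the true sum of 300 does not fit in eight bits: `wrapping_total` yields 44, `capped_total` with a cap of 250 yields 250, and `checked_total` yields `None`. Only the first result is silently wrong.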
Franklin Templeton and Sui announce partnership
Just before publication, news arrived that the day after the block production halt, the Sui Foundation announced a partnership with Franklin Templeton. In the statement, Franklin Templeton mentioned three protocols and infrastructure projects: Deepbook, Karrier One, and ika. Given Franklin Templeton's existing blockchain activity, we can perhaps expect the object-oriented, security-focused Sui blockchain to be combined with real-world assets (RWA).