Author: Shinobi
Segregated Witness (whose BIP authors are Pieter Wuille, Eric Lombrozo, and Johnson Lau) and Taproot (whose BIP authors are Pieter Wuille, Jonas Nick, Tim Ruffing, and Anthony Towns) are the two biggest changes to the Bitcoin protocol to date.
The former fundamentally changed the structure of Bitcoin transactions (and thus the Bitcoin block) to address the inherent limitations of the previous transaction structure. The latter restructured aspects of Bitcoin's scripting language (the way complex scripts are constructed and verified) and introduced a new scheme for generating cryptographic signatures.
Both are significant changes; by comparison, adding an opcode like CHECKLOCKTIMEVERIFY (CLTV), which merely prevents the recipient from spending a payment before a specific time, is a minor matter. (Translator's note: CLTV is a script-level absolute time lock, added to the Bitcoin protocol in the 2015 BIP65 soft fork.)
These changes are intended to address the most fundamental shortcomings and limitations of Bitcoin as a system. As a foundational layer for maintaining global consensus on the overall state of Bitcoin (i.e., all unspent coins), the Bitcoin protocol is an invaluable and ingenious innovation. However, it is far from sufficient to enable everyone to directly transact with these coins.
In the years since Segregated Witness and Taproot activation, many of the shortcomings they addressed have been forgotten. The reasons and philosophies behind their design choices have also gradually become distorted through word of mouth over time.
Both of these changes are solutions to major problems in the Bitcoin protocol, but they also lay the foundation for solving other problems and making other optimizations in the future.
With many newcomers joining the Bitcoin network, it is worthwhile to review these two changes and explain the context of their design choices.
Segregated Witness (BIP 141) 1
When a Bitcoin transaction spends some coins, it references those coins using the identifier (TXID) of the transaction that created them and the index of those coins in that transaction's list of outputs. This ensures that each transaction output can be uniquely identified, so that nodes can verify with certainty that it has not been spent before.
Before the Segregated Witness upgrade, the structure of Bitcoin transactions was as follows:
[Version] [Inputs] [Outputs] [Locktime]
The transaction identifier is a hash of this data. The problem is that the "script signature" (scriptSig) of each input, which carries the data proving that the spend is valid (signatures, hash preimages, etc.), is part of the input and therefore part of the hashed data. It is possible to slightly alter the script instructions in a scriptSig, or even the cryptographic signature itself, without making the transaction invalid.
Such "malleation" changes the TXID. This causes significant problems for pre-signed transactions.
Lightning Network, Ark, Spark, BitVM, and Discreet Log Contracts (DLCs): all of these scaling tools rely on pre-signed transactions. They require participants to first construct a funding transaction without signing it, then pre-sign all the transactions that guarantee proper contract execution and the safety of funds, and only then sign and confirm the funding transaction. All of these systems also use multisignature scripts to ensure funds cannot be double-spent (this is important, and we will discuss it later).
If a funding transaction is malleated and its TXID changes before it is confirmed in a block, all the pre-signed transactions protecting the Layer 2 funds become invalid. If anyone can change the TXID of your funding transaction while it propagates, all the tools mentioned above are useless.
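To make the malleability problem concrete, here is a minimal Python sketch of how a pre-SegWit TXID is derived. The previous TXID, the output script, and the two scriptSig byte strings are placeholder values chosen for illustration (they are not real signatures); the two scriptSigs push the same two bytes onto the stack using different push encodings, yet the resulting TXIDs differ.

```python
import hashlib
import struct


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin uses for transaction identifiers."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def compact_size(n: int) -> bytes:
    """Length prefix; this sketch only supports the one-byte form."""
    if n >= 0xFD:
        raise ValueError("sketch only handles lengths below 0xfd")
    return bytes([n])


def serialize_legacy(version, inputs, outputs, locktime) -> bytes:
    """Serialize [Version] [Inputs] [Outputs] [Locktime]."""
    raw = struct.pack("<i", version)
    raw += compact_size(len(inputs))
    for prev_txid, vout, script_sig, sequence in inputs:
        raw += prev_txid + struct.pack("<I", vout)
        raw += compact_size(len(script_sig)) + script_sig  # scriptSig is hashed too
        raw += struct.pack("<I", sequence)
    raw += compact_size(len(outputs))
    for value, script_pubkey in outputs:
        raw += struct.pack("<q", value)
        raw += compact_size(len(script_pubkey)) + script_pubkey
    raw += struct.pack("<I", locktime)
    return raw


def txid(raw: bytes) -> str:
    """TXIDs are conventionally displayed in reversed byte order."""
    return sha256d(raw)[::-1].hex()


PREV_TXID = bytes.fromhex("11" * 32)   # placeholder outpoint (assumption)
OUTPUTS = [(50_000, bytes([0x51]))]    # placeholder output script (assumption)

# Two encodings that push the same two bytes (0xAA 0xBB) onto the stack:
script_sig_a = bytes([0x02, 0xAA, 0xBB])        # direct 2-byte push
script_sig_b = bytes([0x4C, 0x02, 0xAA, 0xBB])  # OP_PUSHDATA1 form

txid_a = txid(serialize_legacy(1, [(PREV_TXID, 0, script_sig_a, 0xFFFFFFFF)], OUTPUTS, 0))
txid_b = txid(serialize_legacy(1, [(PREV_TXID, 0, script_sig_b, 0xFFFFFFFF)], OUTPUTS, 0))

print(txid_a)
print(txid_b)
```

Any transaction that was pre-signed to spend output 0 of `txid_a` would reference a transaction that no longer exists if `txid_b` is the version that gets confirmed.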
Segregated Witness uses an undefined opcode as a placeholder in place of the data previously put in the input's scriptSig, and moves all of that data to a new field in the transaction called the "witness". The new transaction structure is as follows:
[Version] [Marker/Flag] [Inputs] [Outputs] [Witness] [Locktime]
This placeholder in the input allows older nodes (those not enforcing SegWit rules) to treat anything associated with it as valid by default, while newer nodes (those enforcing SegWit rules) apply the appropriate verification logic. The traditional TXID no longer changes when the witness data changes. This solves the problem for pre-signed transactions and opens the door to the various scaling solutions being developed today.
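The effect of moving signature data into the witness can be sketched in Python as follows. The serializer is a simplified illustration of the two layouts (with and without the marker, flag, and witness fields); the transaction contents, including the witness items, are placeholder bytes rather than real signatures or keys.

```python
import hashlib
import struct


def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def compact_size(n: int) -> bytes:
    if n >= 0xFD:
        raise ValueError("sketch only handles lengths below 0xfd")
    return bytes([n])


def serialize(tx: dict, with_witness: bool) -> bytes:
    raw = struct.pack("<i", tx["version"])
    if with_witness:
        raw += b"\x00\x01"  # marker and flag bytes
    raw += compact_size(len(tx["inputs"]))
    for inp in tx["inputs"]:
        raw += inp["prev_txid"] + struct.pack("<I", inp["vout"])
        raw += compact_size(len(inp["script_sig"])) + inp["script_sig"]
        raw += struct.pack("<I", inp["sequence"])
    raw += compact_size(len(tx["outputs"]))
    for value, script_pubkey in tx["outputs"]:
        raw += struct.pack("<q", value)
        raw += compact_size(len(script_pubkey)) + script_pubkey
    if with_witness:
        for inp in tx["inputs"]:  # one witness stack per input
            raw += compact_size(len(inp["witness"]))
            for item in inp["witness"]:
                raw += compact_size(len(item)) + item
    raw += struct.pack("<I", tx["locktime"])
    return raw


def txid(tx: dict) -> str:
    """Hash of the serialization WITHOUT marker, flag, and witness."""
    return sha256d(serialize(tx, with_witness=False))[::-1].hex()


def wtxid(tx: dict) -> str:
    """Hash of the full serialization, witness included."""
    return sha256d(serialize(tx, with_witness=True))[::-1].hex()


def make_tx(witness_items: list) -> dict:
    """Build a one-input, one-output transaction from placeholder values."""
    return {
        "version": 2,
        "inputs": [{
            "prev_txid": bytes.fromhex("22" * 32),  # placeholder (assumption)
            "vout": 0,
            "script_sig": b"",                      # empty for native SegWit spends
            "sequence": 0xFFFFFFFF,
            "witness": witness_items,
        }],
        "outputs": [(40_000, b"\x00\x14" + bytes(20))],  # placeholder output script
        "locktime": 0,
    }


tx_a = make_tx([b"\xaa" * 64, b"\x02" * 33])  # stand-in signature and public key
tx_b = make_tx([b"\xbb" * 64, b"\x02" * 33])  # same spend, different "signature"

print(txid(tx_a) == txid(tx_b))    # witness changes do not affect the TXID
print(wtxid(tx_a) == wtxid(tx_b))  # but they do affect the WTXID
```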
However, the transaction Merkle tree in the block header only commits to the traditional TXID of each transaction in the block, which raises a new problem: the block makes no commitment to the witness data. Therefore, we need a witness commitment, as well as "witness transaction IDs" (WTXIDs). Just like the Merkle tree built from regular TXIDs, the WTXIDs of each transaction form a Merkle tree, and the commitment is placed in the witness of the coinbase transaction. (Translator's note: This is incorrect; the commitment to the Merkle root of the WTXIDs is placed in the scriptPubKey of an output of the coinbase transaction.)
The only difference is that the root of this tree is hashed together with a reserved value, and this reserved value is placed in the witness of the coinbase transaction. This allows the value to be used in the future to commit to other new data fields under new consensus rules. Before this witness tree commitment (the idea came from Luke Dashjr) was invented, it was thought that Segregated Witness would require a hard fork, because the transaction structure would change and a dedicated witness commitment would be needed in the block header.
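As a rough sketch of the commitment described in BIP 141, the Python below builds a Merkle root over WTXIDs, hashes it with the reserved value, and wraps the result in the coinbase output script prefix defined by the BIP (`0x6a24aa21a9ed`). The two non-coinbase WTXIDs are made-up placeholder hashes; the coinbase's own WTXID is taken to be 32 zero bytes, as the BIP specifies.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(hashes: list) -> bytes:
    """Bitcoin-style Merkle root: duplicate the last hash on odd-sized levels."""
    level = list(hashes)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# WTXIDs in block order; the coinbase entry is all zeros per BIP 141.
wtxids = [
    bytes(32),
    sha256d(b"placeholder transaction 1"),  # made-up value (assumption)
    sha256d(b"placeholder transaction 2"),  # made-up value (assumption)
]

witness_reserved_value = bytes(32)  # carried in the coinbase input's witness
witness_root = merkle_root(wtxids)
commitment = sha256d(witness_root + witness_reserved_value)

# OP_RETURN, push 36 bytes, 4-byte commitment header, 32-byte commitment.
commitment_script_pubkey = bytes.fromhex("6a24aa21a9ed") + commitment

print(commitment_script_pubkey.hex())
```

Because the commitment lives in a coinbase output rather than in the block header, older nodes see only an ordinary unspendable output, which is what allowed the change to ship as a soft fork.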
This placeholder design also makes any upgrade to the scripting system possible, because all new data is ignored, and never verified, by nodes that do not enforce the new rules. This allows us to design a new scripting system that bypasses all the limitations of the old one. This flexible upgrade path also made it possible to integrate Schnorr signatures, and would allow us to incorporate quantum-resistant signature schemes if necessary (their public keys, and likewise their signatures, are generally larger than the 520-byte limit on a single data element in the old scripting system).
Segregated Witness solves the fundamental problem of transaction malleability, allowing scalable layer-2 protocols to operate freely and bring Bitcoin to more users; at the same time, it lays the foundation for any scripting system improvements needed to support and improve these layer-2 protocols.
Schnorr Signature 2
The Schnorr signature scheme was invented by Claus Schnorr in 1991, and he immediately obtained a patent for it. In fact, it was precisely because Schnorr's signature was patented that the ECDSA signature scheme was invented. Schnorr's patent expired in February 2010, just over a year after the launch of the Bitcoin network.
Without that patent, Satoshi Nakamoto (and others in the world) might have used Schnorr signatures in the first place.
Compared to ECDSA, Schnorr signatures have several key advantages:
- Schnorr signatures are provably secure. The mathematical evidence proving that Schnorr signatures are unforgeable/unbreakable is stronger and requires fewer assumptions (compared to ECDSA). Giving the cryptography at the heart of Bitcoin stronger security guarantees is clearly a huge benefit.
- Schnorr signatures are inherently non-malleable: with ECDSA, a third party can replace a signature with a different valid one without invalidating the transaction, whereas with Schnorr signatures this is impossible.
- Schnorr signatures are linear, allowing for simple and efficient constructions for key aggregation, distributed key generation, and distributed signature generation. This allows users to directly "add together" individual Schnorr public keys and then, as a group, create a signature for the aggregated public key.
Schnorr signatures are more secure than ECDSA, cannot be malleated by third parties, and open the door to all kinds of efficient and flexible cryptographic schemes that improve on multisignature scripts.
In the earlier discussion of transaction malleability, I mentioned that all off-chain protocols using pre-signed transactions rely on multisignature scripts to protect user funds. While this is the security mechanism for shared control of funds, it implicitly limits the scalability these protocols can achieve. Traditional multisignature scripts do not scale. Transaction size itself is limited, and there are also limits on the size of witness data for version 0 (Segregated Witness) scripts. If a multisignature address can only accommodate so many participants, then only that many participants can share control of the funds (which caps scalability).
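The linearity that makes keys and signatures "add together" can be illustrated with a short Python sketch (Python 3.8+ for `pow(x, -1, p)`). This is a deliberately naive two-party aggregation with toy keys and fixed nonces; it is not BIP 340 (no x-only keys or tagged hashes) and not MuSig (plain key addition is vulnerable to rogue-key attacks, which is exactly what MuSig's key coefficients address). The helper names are illustrative only.

```python
import hashlib

# secp256k1 parameters
FIELD_P = 2**256 - 2**32 - 977
ORDER_N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % FIELD_P == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], -1, FIELD_P) % FIELD_P
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % FIELD_P, -1, FIELD_P) % FIELD_P
    x = (lam * lam - a[0] - b[0]) % FIELD_P
    y = (lam * (a[0] - x) - a[1]) % FIELD_P
    return (x, y)


def point_mul(k, pt):
    """Double-and-add scalar multiplication (not constant time)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result


def challenge(nonce_point, pubkey, msg: bytes) -> int:
    """Simplified challenge hash e = H(R.x || P.x || m) mod n."""
    data = nonce_point[0].to_bytes(32, "big") + pubkey[0].to_bytes(32, "big") + msg
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ORDER_N


def verify(pubkey, msg: bytes, nonce_point, s: int) -> bool:
    """Schnorr check: s*G == R + e*P."""
    e = challenge(nonce_point, pubkey, msg)
    return point_mul(s, G) == point_add(nonce_point, point_mul(e, pubkey))


# Toy secrets for illustration only; real nonces must be random and never reused.
x1, x2 = 0x1111, 0x2222
k1, k2 = 0x3333, 0x4444
msg = b"spend the shared output"

agg_pubkey = point_add(point_mul(x1, G), point_mul(x2, G))  # P = P1 + P2
agg_nonce = point_add(point_mul(k1, G), point_mul(k2, G))   # R = R1 + R2
e = challenge(agg_nonce, agg_pubkey, msg)

s1 = (k1 + e * x1) % ORDER_N  # partial signature from signer 1
s2 = (k2 + e * x2) % ORDER_N  # partial signature from signer 2
s = (s1 + s2) % ORDER_N       # partial signatures simply add up

print(verify(agg_pubkey, msg, agg_nonce, s))
```

On chain, a verifier sees one public key and one signature, no matter how many signers cooperated to produce them.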
Schnorr-based signature schemes relax this restriction by aggregating multiple public keys into a single collective public key: we no longer need to construct scripts that explicitly include each member's public key. Before the Segregated Witness upgrade, a multi-signature address could only have 15 participants; after Segregated Witness, the expanded limit can accommodate 20 participants.
However, when using Schnorr-based signature schemes such as "MuSig" 5 and "FROST" 6 , these limits disappear entirely (at least at the level of consensus rules). A multisignature arrangement can include as many participants as desired, as long as the group can coordinate the signing process and signing is not disrupted by members refusing to sign or dropping out.
(Translator's note: MuSig is a multi-signature scheme based on Schnorr, in which everyone who participates in the aggregation of public keys must participate in signing; while FROST is a threshold signature scheme, in which a signature can be generated as long as the number of public keys participating in the signing reaches a pre-set threshold.)
These same properties allow us to aggregate keys in this way, and also enable efficient adapter signatures: in these schemes, we can create signatures that are temporarily invalid but become effective once a piece of secret information is exposed. These properties also make some zero-knowledge proof-based schemes possible, where the signer can sign messages they haven't seen.
Taproot 3 4
"Taproot" is an upgraded version of an older concept called the "Merkelized Abstract Syntax Tree (MAST)" 7. MAST itself is simply an extension of the "Pay-to-Script-Hash (P2SH)" script 8. P2SH was developed to address two major problems:
- When using large, custom locking scripts (if such scripts are placed directly in the transaction output), the resulting unspent outputs will be larger (compared to using small locking scripts), requiring more space to store the UTXO set (i.e., the latest state of the Bitcoin network).
- When using large custom locking scripts, senders pay higher fees because the transactions they initiate are larger, which discourages people from paying for potentially more secure custom scripts.
The idea behind P2SH is to include only a hash of the script in the transaction's output, rather than the entire script. When spending the output, the owner of the output (the recipient of the previous transaction) provides the complete script in the input, where it can be checked against the hash. This solves the problem of unspent outputs taking up storage space, and shifts the cost of using large scripts onto the users who choose them (rather than onto the people who pay them).
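The commit-then-reveal idea can be sketched in a few lines of Python. Note the assumption: P2SH itself uses a 20-byte HASH160 (RIPEMD-160 of SHA-256), but RIPEMD-160 is not available in every `hashlib` build, so this sketch uses the SHA-256 form that the SegWit pay-to-witness-script-hash output uses (`OP_0` followed by a 32-byte hash). The scripts are placeholder bytes, not meaningful spending conditions.

```python
import hashlib


def script_hash_output(script: bytes) -> bytes:
    """Output script committing to `script`: OP_0, push 32 bytes, SHA-256(script)."""
    return b"\x00\x20" + hashlib.sha256(script).digest()


def reveals_committed_script(output_script: bytes, revealed_script: bytes) -> bool:
    """At spend time, the revealed script must hash to the committed value."""
    return output_script == script_hash_output(revealed_script)


small_script = bytes([0x51])        # placeholder one-byte script (assumption)
large_script = bytes([0x51]) * 400  # placeholder 400-byte script (assumption)

small_output = script_hash_output(small_script)
large_output = script_hash_output(large_script)

# The output stored in the UTXO set is the same size either way.
print(len(small_output), len(large_output))
print(reveals_committed_script(large_output, large_script))
```

The sender and the UTXO set only ever deal with the fixed-size commitment; the full script, whatever its size, is paid for by whoever eventually spends the output.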
However, this leaves a problem. Customized scripts may contain multiple spending methods, but when spending, the spender still needs to reveal the entire script, including branches irrelevant to verifying the validity of the spending operation. Therefore, the space efficiency of customized scripts decreases with script size, incurring costs for the spender that exceed what is necessary.
The idea behind MAST is that we can break a multi-branch script down into individual spending conditions and then construct a Merkle tree from them. Each spending condition is hashed, and the root of the Merkle tree serves as the user's address. When spending, the user only needs to provide the spending condition they are using, a Merkle proof that this condition exists in the tree, and the data needed to satisfy the condition.
This Merkle tree structure not only solves all the problems that P2SH attempts to address, but also optimizes the cost for users (while enhancing their privacy!).
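A minimal Python sketch of this structure is shown below. It is a simplified illustration rather than the exact BIP 114 or BIP 341 construction: leaves and branches are hashed with plain SHA-256 and a one-byte domain prefix (an assumption of this sketch), and each pair of child hashes is sorted before hashing so that a proof needs no left/right flags. The three "scripts" are placeholder byte strings.

```python
import hashlib


def leaf_hash(script: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + script).digest()


def branch_hash(a: bytes, b: bytes) -> bytes:
    """Hash two children in sorted order so proofs need no direction bits."""
    left, right = sorted((a, b))
    return hashlib.sha256(b"\x01" + left + right).digest()


def root_and_proof(scripts: list, index: int):
    """Return the Merkle root and the sibling hashes proving scripts[index]."""
    level = [leaf_hash(s) for s in scripts]
    proof = []
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                next_level.append(branch_hash(level[i], level[i + 1]))
                if index == i:
                    proof.append(level[i + 1])
                elif index == i + 1:
                    proof.append(level[i])
            else:
                next_level.append(level[i])  # odd node is carried up unchanged
        index //= 2
        level = next_level
    return level[0], proof


def verify_proof(root: bytes, script: bytes, proof: list) -> bool:
    """Recompute the path from the revealed script up to the root."""
    node = leaf_hash(script)
    for sibling in proof:
        node = branch_hash(node, sibling)
    return node == root


# Placeholder spending conditions (assumptions, not real scripts).
conditions = [b"alice and bob sign", b"alice after timeout", b"bob with preimage"]

root, proof = root_and_proof(conditions, 1)
print(verify_proof(root, conditions[1], proof))
print(len(proof))  # number of sibling hashes revealed alongside the script
```

Only the condition being used and its sibling hashes are revealed; the other branches stay hidden behind their hashes, which is where both the cost saving and the privacy gain come from.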
Taproot adopts this concept and integrates it in a more privacy-preserving way by leveraging the linearity of Schnorr signatures. In most contracts that people want, there is an optimistic outcome: all users agree on how to divide the funds. In this case, they can simply sign a transaction together. Taproot uses the MAST root to "tweak" a Schnorr public key, producing a new public key; by "tweaking" the private key with the same MAST root, the private key corresponding to this new public key can be obtained.
In this way, users can either spend their output directly with the tweaked key, leaving no trace of the MAST tree, or reveal the original public key, the MAST root, and one spending condition stored in the tree. Furthermore, if you do not want the tweaked key to be directly usable at all, you can use a special NUMS ("nothing up my sleeve") point as the original public key, which is provably unspendable (no one knows the private key behind it), leaving the scripts in the MAST as the only valid spending paths.
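The tweak can be written as Q = P + t*G with t = H(P || root), so the matching private key is simply d + t. The Python sketch below (Python 3.8+) follows the `TapTweak` tagged hash and the even-Y normalization from BIP 340/341, but the internal private key and the tree root are toy placeholder values, and the curve arithmetic is a plain, non-constant-time illustration.

```python
import hashlib

# secp256k1 parameters
FIELD_P = 2**256 - 2**32 - 977
ORDER_N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % FIELD_P == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], -1, FIELD_P) % FIELD_P
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % FIELD_P, -1, FIELD_P) % FIELD_P
    x = (lam * lam - a[0] - b[0]) % FIELD_P
    y = (lam * (a[0] - x) - a[1]) % FIELD_P
    return (x, y)


def point_mul(k, pt):
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result


def tagged_hash(tag: str, msg: bytes) -> bytes:
    """BIP 340 tagged hash: SHA256(SHA256(tag) || SHA256(tag) || msg)."""
    tag_digest = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_digest + tag_digest + msg).digest()


internal_secret = 0x1234567890ABCDEF                             # toy private key (assumption)
tree_root = hashlib.sha256(b"placeholder script tree").digest()  # made-up root (assumption)

# Taproot keys are x-only, so normalize to the point with an even Y coordinate.
internal_point = point_mul(internal_secret, G)
if internal_point[1] % 2:
    internal_secret = ORDER_N - internal_secret
    internal_point = (internal_point[0], FIELD_P - internal_point[1])

tweak = int.from_bytes(
    tagged_hash("TapTweak", internal_point[0].to_bytes(32, "big") + tree_root), "big")
if tweak >= ORDER_N:
    raise ValueError("tweak out of range; choose a different internal key")

output_point = point_add(internal_point, point_mul(tweak, G))  # Q = P + t*G
tweaked_secret = (internal_secret + tweak) % ORDER_N           # q = d + t

print(point_mul(tweaked_secret, G) == output_point)
```

Anyone who knows the internal key and the root can recompute Q, which is how a script-path spend proves that the tree was committed to; a key-path spend reveals neither.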
Building on the design choices of Segregated Witness, Taproot also introduced "tapscript", a new scripting system. Compared to the previous scripting systems, its main change is dropping OP_CHECKMULTISIG and OP_CHECKMULTISIGVERIFY (the opcodes originally used for multisignature verification) and replacing them with OP_CHECKSIGADD, which verifies multiple signatures more efficiently. Alongside Schnorr key aggregation, this provides multisignature functionality similar to that of the older scripting systems.
In addition, tapscript changed OP_CHECKSIG and OP_CHECKSIGVERIFY to verify only Schnorr signatures, and added OP_SUCCESS opcodes to take over the upgrade role previously played by OP_NOP (a reserved no-op opcode in older scripts). OP_SUCCESS is designed to allow cleaner and safer opcode upgrades than OP_NOP.
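The counting behaviour of OP_CHECKSIGADD can be modelled with a small Python sketch. It imitates a script of the form `<pk1> OP_CHECKSIG <pk2> OP_CHECKSIGADD ... <k> OP_NUMEQUAL`: an empty signature leaves the counter unchanged, a valid one increments it, and a non-empty invalid one fails the whole script. The function name and the `toy_verify` callback are hypothetical stand-ins; no real signature verification is performed here.

```python
from typing import Callable, Sequence


def checksigadd_threshold(
    pubkeys: Sequence[bytes],
    signatures: Sequence[bytes],
    threshold: int,
    verify: Callable[[bytes, bytes], bool],
) -> bool:
    """Model a k-of-n tapscript built from OP_CHECKSIG / OP_CHECKSIGADD."""
    if len(pubkeys) != len(signatures):
        return False  # one signature slot (possibly empty) per public key
    counter = 0
    for pubkey, signature in zip(pubkeys, signatures):
        if signature == b"":
            continue          # empty signature: counter stays as it is
        if not verify(pubkey, signature):
            return False      # non-empty invalid signature aborts the script
        counter += 1          # valid signature: counter + 1
    return counter == threshold  # final <k> OP_NUMEQUAL


def toy_verify(pubkey: bytes, signature: bytes) -> bool:
    """Hypothetical stand-in for Schnorr verification (illustration only)."""
    return signature == b"signed-by-" + pubkey


keys = [b"alice", b"bob", b"carol"]

two_of_three = [b"signed-by-alice", b"", b"signed-by-carol"]
print(checksigadd_threshold(keys, two_of_three, 2, toy_verify))

bad_signature = [b"signed-by-alice", b"garbage", b"signed-by-carol"]
print(checksigadd_threshold(keys, bad_signature, 2, toy_verify))
```

Each signature is checked against exactly one key, so verification cost grows linearly with the number of keys, and the signatures can be batch-verified.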
Witness data size limits
So far, two aspects remain undiscussed: the "block weight" limit added in the Segregated Witness upgrade, and the witness size limit that was lifted in the Taproot upgrade.
Both decisions sparked controversy among a small group of very active, long-standing users in the ecosystem. I won't discuss the block size increase itself: it was part of the block weight limit, a compromise at the time with users who were pushing for a hard fork to raise the block size limit, and was considered safe by network participants at the time. The witness discount mechanism, however, is important.
Bitcoin transaction fees are priced by the amount of data a transaction contains, regardless of the value it transfers. The fee is determined entirely by the number and size (in bytes) of the inputs and outputs (and the witnesses). As I mentioned earlier, prior to the Segregated Witness upgrade, the scriptSig (signatures and other data) was included in the transaction input. This is a large chunk of data that is present in inputs but not in outputs.
This means that, within a single transaction, an input is significantly more expensive than an output. This creates a long-term incentive for users to spend larger-denomination outputs and generate change outputs, rather than spending a large number of accumulated small-denomination outputs. In other words, it is a long-term economic incentive that encourages users to keep growing the UTXO set, and storing the UTXO set is a prerequisite for every full node to verify transactions and blocks.
The witness discount is meant to correct this price difference, narrowing it rather than widening it. At least in theory, this is extremely important for giving economically rational users who simply want to use the Bitcoin protocol an incentive to manage their UTXOs responsibly.
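The discount can be made concrete with the weight formula from BIP 141, in which each non-witness byte counts as four weight units and each witness byte as one. The Python sketch below applies it to approximate, typical sizes for single-signature inputs and outputs; the byte counts are illustrative assumptions (real sizes vary by a byte or two with signature encoding).

```python
import math


def weight(non_witness_bytes: int, witness_bytes: int) -> int:
    """BIP 141 weight: non-witness bytes count 4x, witness bytes count 1x."""
    return 4 * non_witness_bytes + witness_bytes


def vsize(non_witness_bytes: int, witness_bytes: int) -> int:
    """Virtual size in vbytes, the unit fee rates are usually quoted in."""
    return math.ceil(weight(non_witness_bytes, witness_bytes) / 4)


# Approximate sizes in bytes (assumptions for illustration):
LEGACY_INPUT = (148, 0)   # outpoint + scriptSig with signature and pubkey + sequence
SEGWIT_INPUT = (41, 108)  # outpoint + empty scriptSig + sequence, plus witness data
LEGACY_OUTPUT = (34, 0)   # value + pay-to-pubkey-hash script
SEGWIT_OUTPUT = (31, 0)   # value + version 0 witness program for a key hash

for name, sizes in [("legacy input", LEGACY_INPUT), ("segwit input", SEGWIT_INPUT),
                    ("legacy output", LEGACY_OUTPUT), ("segwit output", SEGWIT_OUTPUT)]:
    print(f"{name}: {weight(*sizes)} weight units, {vsize(*sizes)} vbytes")

# How many times more expensive an input is than an output, before and after.
legacy_ratio = vsize(*LEGACY_INPUT) / vsize(*LEGACY_OUTPUT)
segwit_ratio = vsize(*SEGWIT_INPUT) / vsize(*SEGWIT_OUTPUT)
print(round(legacy_ratio, 2), round(segwit_ratio, 2))
```

Under these assumed sizes, spending an output still costs more than creating one, but the gap is much smaller once the signature data is discounted.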
The Taproot upgrade removed the data size limit for the witness field of a transaction. After the Segregated Witness upgrade, this limit was 10,000 bytes. This was done because Taproot's design already mitigated the possibility of constructing transactions that were difficult to verify, and attempting to add such a size limit to tapscript would have introduced significant complexity into "Miniscript." The problem this limit was meant to prevent did not affect Taproot, but it would have added complexity to a tool intended to make custom scripts more secure and user-friendly (so the developers decided to remove it).
(Translator's note: Miniscript is a programming language that allows for the structured combination of fragments of Bitcoin scripts, making the entire locking script easier to analyze and more secure.)
The big picture
Both of these changes removed significant obstacles to expanding Bitcoin and enabling more people to use it in a self-custodial manner; however, to achieve this, they inevitably required substantial changes to the fundamental parts of the protocol.
I hope that readers unfamiliar with these design choices and the philosophy behind them can now understand the concerns involved and reflect further on the design approach. Bitcoin is a breathtaking creation, without a doubt, but on its own it cannot deliver its benefits to a significant proportion of the world's population.
Segregated Witness and Taproot laid two cornerstones that were absolutely necessary to address Bitcoin's scalability shortcomings. Without these two proposals (or alternative protocol changes that addressed the same issues), all the growing scalability protocols and systems we have today would be nothing but a pipe dream.
Ark, Spark, BitVM, DLCs: none of them would have been possible.
This is the big picture. Bitcoin isn't perfect today, but it's starting to have a great opportunity to scale to a large enough population to have a real impact on the world and provide a genuine alternative for those trying to exit. This is all thanks to these two protocol upgrades: they removed those fundamental barriers.
(End)
Footnotes
1. https://github.com/bitcoin/bips/blob/master/bip-0141.mediawiki
2. https://github.com/bitcoin/bips/blob/master/bip-0340.mediawiki
3. https://github.com/bitcoin/bips/blob/master/bip-0341.mediawiki
4. https://github.com/bitcoin/bips/blob/master/bip-0342.mediawiki
5. https://github.com/bitcoin/bips/blob/master/bip-0327.mediawiki
6. https://github.com/siv2r/bip-frost-signing
7. https://github.com/bitcoin/bips/blob/master/bip-0114.mediawiki
8. https://github.com/bitcoin/bips/blob/master/bip-0016.mediawiki

