Towards a k-of-n multi-signature lightning node

This article is machine translated

Author: ZmnSCPxj

Source: https://delvingbitcoin.org/t/towards-ak-of-n-lightning-network-node/2395

Introduction

Recently, prompted by the publication of the paper "Nested MuSig2", we began to wonder: can we build a multi-signature self-custody wallet that integrates the Lightning Network?

Specifically, what we hope to achieve is that, within "Simple Taproot Channels" (PR995), we can use "nested MuSig2" to create seamless multi-signature Lightning Network nodes.

But first, consider this: what exactly does an end user mean when they say, "I want a k-of-n multi-signature wallet"?

I believe the core needs of end users are as follows:

Regarding "k-of-n multi-signature," my wish is that if I own N devices, my funds are safe as long as fewer than K devices are compromised. And even if I irrevocably lose some devices, I can still spend my funds as long as fewer than (N - K + 1) devices are lost.
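The two thresholds in that wish can be written as a pair of checks. This is only an illustrative sketch; the function names are made up for this example and do not come from any wallet software.

```python
def funds_are_safe(k: int, compromised: int) -> bool:
    """Safety: a thief holding fewer than k devices cannot produce a signature."""
    return compromised < k


def can_still_spend(n: int, k: int, lost: int) -> bool:
    """Liveness: spending needs k working devices, i.e. lost < n - k + 1."""
    return n - lost >= k


# Example: a 2-of-3 setup tolerates one compromised device and one lost device.
print(funds_are_safe(k=2, compromised=1))   # True
print(funds_are_safe(k=2, compromised=2))   # False
print(can_still_spend(n=3, k=2, lost=1))    # True
print(can_still_spend(n=3, k=2, lost=2))    # False
```

Note that `n - lost >= k` is the same condition as `lost < n - k + 1`, which is the form used in the text.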

How can we translate this aspiration into a "Lightning Network wallet" (in a real-world scenario)?

An on-chain wallet can express such k-of-n requirements directly in script, which is quite simple, but a Lightning Network wallet cannot. Lightning channels bring additional complexities, and handling them requires changes to the Lightning BOLT protocol, not only for the k-of-n Lightning wallet itself but also for its channel counterparties (which may or may not be k-of-n multi-signature themselves).

This means that an end user who wants to use a k-of-n multi-signature Lightning wallet must either:

  • wait for the modified Lightning BOLT protocol to gain widespread adoption, or
  • accept service from gateway nodes that implemented the modified Lightning BOLT protocol "early", at the cost of privacy and of reduced income from forwarding fees.

A primary use case for k-of-n multi-signature Lightning nodes is to let HODLers with substantial funds use their savings to provide liquidity to the Lightning Network, while keeping those funds safe and resilient at all times.

(Translator's note: "HODLer" is a culturally significant term derived from "holder". It will be translated as "holder" throughout the following text.)

However, until the modified Lightning BOLT protocol is widely adopted, such large k-of-n multi-signature nodes can only connect to gateways, which bridge between the modified protocol and the part of the network that has not yet upgraded. Understandably, such gateway nodes would charge for this, though how they charge may vary. A gateway node that charges nothing would quickly find its available liquidity drained by these large holders. Gateways do not have that much liquidity themselves unless they are also large holders; and if they are, then (by our assumption) they would want a multi-signature setup too, which in turn requires a gateway that supports the modified protocol.

Therefore, in order to support the application scenario we are talking about, the protocol must be modified "early" to drive the adoption of the modified protocol, so that those who hold large amounts of cryptocurrency can use their savings to provide liquidity.

Minimal trust in channel counterparties

Some might argue: "Lightning is actually more secure, because the channel counterparty must also be compromised, on top of the victim losing their key, before funds can be stolen."

This idea is very wrong.

The correct way to think is: "Your channel counterparty is in the best position to attack you, so your channel counterparty is the one you should worry about most."

Specifically, imagine this scenario: you are a large holder looking to provide liquidity to the Lightning Network (in exchange for routing fees), so you want to let anyone open a channel with you. More channels opened to you means more opportunities to earn routing fees!

If you do not allow strangers on the internet to open channels with your large routing node, and only allow a small number of trustworthy entities to do so, then I bet that, one way or another, these carefully selected "trustworthy" counterparties will take back the routing fees you earn in the form of liquidity rent; or else you will find that your liquidity is barely used.

The truth is, you have to allow strangers on the internet, like YaiJBOjA (who is definitely not me, just so you know!), to open channels with you. This also means you are allowing potential thieves to become your channel counterparties. You are allowing others to target you.

Therefore, you must assume that only your local signer(s) are trustworthy, which removes the need to screen channel counterparties. Your security must be strong enough to stop thieves even when one of them is your direct counterparty, and you should not rely on channel counterparties for security, as they are the ones best positioned to steal from you.

Even if a counterparty opens a channel to you using only their own funds, they can still send their entire channel balance (through you) to another node they control, and then steal the funds in the channel (once they have moved their own balance out, the funds left in the channel are obviously yours). If they can compromise enough of your signers, or exploit a weakness in your multi-signature scheme, they can steal your funds.

Therefore, your "Morton's fork" looks like this:

  • If you don't take the security of your signers seriously, strangers on the internet can steal your funds.
  • If you don't let strangers on the internet connect to you, then your liquidity will not be used effectively (so you earn nothing, and you have put your money in an internet-connected hot wallet without any benefit to yourself or anyone else).

(Translator's note: "Morton's fork" refers to a dilemma in which every choice leads to the same unpleasant outcome.)

Therefore, a better security posture is needed than "using a single root signer in all my Lightning Network channels".

Multi-signature is an additional dimension

You might say, "But my signer runs in a 'Trusted Execution Environment (TEE)'; that execution environment runs in a 'Secure Element (SE)'; and that Secure Element is built on a 'Physically Unclonable Function (PUF)'!!!"

First of all, a PUF is merely a secure dice-rolling procedure that generates a private key embedded in the chip. If you have access to the SE's debug interface (which every chip manufacturer inevitably includes, because chip manufacturing is never a perfect process and there will always be defects), the PUF offers no protection whatsoever. The same debug interface that detects defects can also be used to extract intermediate results of computations that use the embedded private key; after all, those intermediate computation circuits also need to be checked to verify that they were manufactured correctly. In practice, you cannot compute everything with a single circuit, so you need "flip-flops" to store intermediate results. Without them you would need a huge, and therefore expensive and slow, circuit; storing intermediate results in flip-flops is what lets the same general-purpose circuit, such as a multiplier or adder, be reused in different parts of the computation. These flip-flops are also wired into the debug interface (called a "scan chain"; look it up yourself) that is used at the manufacturer's facility to check manufactured chips for errors.

The flip-flops that store intermediate results cannot be located inside the PUF: you want these flip-flops, and the circuitry that writes to them, to behave exactly as designed, otherwise the chip would be unreliable; whereas the PUF is precisely the area where you deliberately raise the failure rate so that nobody can practically clone it!

Therefore, these intermediate calculations can be used to reduce the number of bits you have to guess when extracting the embedded private key.

Furthermore, note that the manufacturer has access to the aforementioned "manufacturer debug interface/scan chain"; otherwise it would not be called a manufacturer debug interface. Do you really believe the manufacturer is unable to read the private key embedded in the PUF you so revere? You are better off using a Trezor signer without a "secure element", because the private key embedded there is one you can actually generate yourself by flipping a coin or rolling dice. You can use coins, dice, and other hardware you genuinely control to perform operations you truly know and understand.

A "trusted execution environment" only "keeps out your little sister"; it does not "keep out authoritarian governments".

(Source: I have designed LCD (liquid crystal display) driver ASICs. Before that, I reverse engineered ASICs; we used to literally grind away each metal layer and photograph it under a microscope to work out the circuitry. Did you know that major Taiwanese chip manufacturers only accept digital logic designs that have been simulated on one of the top three Verilog simulators (Mentor, Synopsys, and never mind the third one; if you really want to know, we used to work for Synopsys)? If you do not provide simulation results from one of them, they will not even consider your manufacturing request. They completely distrust Icarus Verilog; if you mention that name, they will not even reply.)

Therefore, possessing the hardware is always equivalent to possessing the private key, whether or not there is a "trusted execution environment", a "secure element", or a "physically unclonable function". None of them changes the fact that "possession is nine-tenths of the law". Once the hardware leaves your hands, so does the private key.

(Translator's note: "possession is nine-tenths of the law" refers to the high likelihood that, in an ownership dispute, the current possessor wins the lawsuit. The author uses the saying to stress the importance of possessing the hardware (the signer): since there is practically no mechanism that absolutely prevents an adversary from extracting the private key once they hold your signer, the most important thing is to keep it in a safe place where an enemy cannot get it.)

Moreover, in any case: even if you have a "perfect" electronic device, such as a simple Trezor signer assembled from off-the-shelf parts (parts that are everywhere, and so common and boring that nobody would bother to tamper with your supply chain for these old microcontrollers), using a multi-signature scheme is still better for you, because a thief would need to steal several such devices to steal your funds.

Multi-signature adds an extra dimension of security on top of your devices, whether they are "secure" devices based on a "trusted execution environment" (where the vendor promises not to steal your private key) or real devices that are entirely under your control.

Multi-signature, as stated in the introduction of this article, means "to put my funds at risk, a thief must compromise k devices", provided we can actually achieve that.

Lesser alternatives to multi-signature

Short of making extensive changes to the Lightning Network, there are other ways to obtain some of the benefits of multi-signature.

For example, each Lightning channel has a dedicated pair of public keys, one at each end.

The Lightning Network protocol itself does not specify a particular scheme for deriving public keys for each channel from the "root" key (or from the node ID).

Therefore, it's possible to prepare N signers, and whenever a channel opens, select one of them to specify the public key used in that channel. Because the public key you use in the channel is independent of your node ID, you can freely choose any signer for each channel.

This is fully compatible with the current Lightning Network, requiring no protocol changes and not even waiting for PR995 to be merged.

With this approach, if you have N signers and one of them is compromised, only about 1/N of your channels are at risk (assuming channels are spread evenly across signers). This is still an improvement over having a single signer, where one compromise puts every channel at risk.

Here is an example of this approach: rotating-signer-provider.

This approach is readily deployable and practical because it at least diversifies risk. It's similar to using multiple nodes, except that it allows you to form a single large node with multiple independent signers managing funds. Generally, large nodes have advantages in routing efficiency and are favored by many nodes on the network, thus increasing your routing fee revenue.
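The per-channel signer selection described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the class and method names are hypothetical, the round-robin policy is just one possible way to pick a signer, and nothing here is taken from the rotating-signer-provider project.

```python
from itertools import cycle


class RotatingSignerPool:
    """Assigns each newly opened channel to one of N independent signers."""

    def __init__(self, signer_names):
        self._rotation = cycle(list(signer_names))
        self.channel_signer = {}  # channel id -> signer that owns its keys

    def open_channel(self, channel_id):
        # Hypothetical policy: plain round-robin over the signers.
        signer = next(self._rotation)
        self.channel_signer[channel_id] = signer
        return signer

    def exposed_channels(self, compromised_signer):
        # Only channels whose key came from the compromised signer are at risk.
        return [c for c, s in self.channel_signer.items() if s == compromised_signer]


pool = RotatingSignerPool(["signer-a", "signer-b", "signer-c"])
for i in range(9):
    pool.open_channel(f"chan-{i}")

exposed = pool.exposed_channels("signer-b")
print(len(exposed), "of", len(pool.channel_signer), "channels exposed")  # 3 of 9
```

With three signers and nine channels, compromising one signer exposes three channels, which is the 1/N exposure described above. Note that this spreads risk but is not k-of-n: a single compromised signer still loses its own channels.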

BOLT Change: Revocation Key

To achieve true k-of-n multi-signature Lightning nodes, the one absolutely necessary change to the BOLTs is to remove the requirement to use a shachain to generate the revocation key of each commitment transaction for the remote side.

Why is this necessary?

Recall what the actual on-chain contract looks like for the funds you control in your own commitment transaction:

  • Either of:
    • All of:
      • The commitment transaction has been deeply confirmed (a delay that is usually up to two weeks).
      • Your signature
    • All of:
      • Counterparty's signature
      • Revocation key

The second branch is the key point here.

Your commitment transaction is the only way you can recover your money. For example, a thief could open a channel to you with their own money, send their channel balance to another node they control, and then shut down the node they used for the theft; at that point they have nothing left to lose (except the 1% channel reserve, but that can be considered the cost of the theft). You must then use your commitment transaction; otherwise your funds are not safe: they become unspendable.
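The two spending branches listed above can be modeled as a simple predicate. This is a sketch of the article's simplified description, not of the actual Bitcoin script; the dataclass and function names are hypothetical, and the default delay of 2016 blocks (about two weeks) is an assumption chosen to match the "up to two weeks" figure.

```python
from dataclasses import dataclass


@dataclass
class SpendAttempt:
    confirmations: int          # depth of the commitment transaction on-chain
    has_local_sig: bool         # signature from the owner of the output
    has_counterparty_sig: bool  # signature from the channel counterparty
    has_revocation_key: bool    # revocation key for this commitment transaction


def can_spend_local_output(attempt: SpendAttempt, to_self_delay: int = 2016) -> bool:
    # Branch 1: the owner spends, but only after the delay has passed.
    delayed_branch = attempt.confirmations >= to_self_delay and attempt.has_local_sig
    # Branch 2: whoever holds the counterparty signature and the revocation key
    # spends immediately, with no delay.
    revocation_branch = attempt.has_counterparty_sig and attempt.has_revocation_key
    return delayed_branch or revocation_branch


# The owner must wait out the delay...
print(can_spend_local_output(SpendAttempt(10, True, False, False)))    # False
print(can_spend_local_output(SpendAttempt(2016, True, False, False)))  # True
# ...but a counterparty who obtains the revocation key can sweep at once.
print(can_spend_local_output(SpendAttempt(1, False, True, True)))      # True
```

The last case is the reason the rest of this section focuses on who holds the revocation key: it is the only secret standing between the counterparty and the funds in the latest commitment transaction.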

What is "revocation"?

Many of the terms used for Lightning channels can be confusing:

  • Penalty refers to the retaliation available to you when your counterparty unilaterally closes a channel using an old state. It requires that your counterparty has broadcast, and had confirmed, a commitment transaction that they had already revoked.
  • Unilateral close refers to broadcasting a state and thereby asserting that it is the latest state of the channel. If you had previously revoked that state (meaning it is not actually the latest state, so your assertion is false), you may be penalized.
  • Revoking a state means you agree that the state is obsolete. It gives the counterparty enough information to penalize you later if you ever use the revoked state to unilaterally close the channel.

So, when does each of these happen?

  • Revocation:
    • This occurs whenever the state changes, that is, whenever an HTLC is added to or removed from the channel.
    • For example, your current state is numbered N. You and your counterparty both agree to move to N + 1. The process is "hand over hand":
      • Your counterparty signs state N + 1.
        • At this point, both state N and state N + 1 are valid, and either can be used to unilaterally close the channel without incurring a penalty.
        • In the "hand over hand" analogy, you are now holding the rope with both hands, one above the other, which is how you climb a rope and also how you advance a channel.
      • You revoke the old state N.
        • This irreversibly locks you into state N + 1, because it is now the only valid state.
        • In the "hand over hand" analogy, revoking is like letting go with the lower hand.
    • At any given time there are at most two unrevoked states; most of the time there is only one.
  • Unilateral close:
    • This happens when you decide to complete the signature on a commitment transaction (adding your own signature to the counterparty's signature, which you received during the normal hand-over-hand process) and broadcast it.
    • No hard cryptographic mechanism prevents you from using an old state's transaction (and its signatures); there is only an economic incentive: if you use an old commitment transaction (one you have already revoked), you will be penalized and lose all funds in the channel.
  • Penalty:
    • This happens when you notice your counterparty closing the channel with an old (revoked) state.

The problem lies in the "revocation" operation, not in the "unilateral close" or the "penalty".

Revocation requires handing something called a "revocation key" to your counterparty. The counterparty can then use their own signature together with this revocation key to penalize a unilateral close that uses the old state.
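The hand-over-hand update can be modeled as a tiny state machine. This is only a bookkeeping sketch with hypothetical names; it tracks which of your own states are unrevoked and ignores signatures, HTLCs, and the wire protocol entirely.

```python
class LocalChannelView:
    """Tracks which of my own commitment states are still unrevoked."""

    def __init__(self):
        self.unrevoked = [0]   # state numbers I may still close with
        self.revoked = set()   # state numbers I have given revocation keys for

    def receive_counterparty_signature(self):
        # The counterparty signs state N + 1: both hands are now on the rope.
        if len(self.unrevoked) == 2:
            raise RuntimeError("must revoke the old state before advancing again")
        self.unrevoked.append(self.unrevoked[-1] + 1)

    def revoke_oldest(self):
        # Hand over the revocation key for state N: let go with the lower hand.
        if len(self.unrevoked) != 2:
            raise RuntimeError("nothing to revoke")
        old = self.unrevoked.pop(0)
        self.revoked.add(old)
        return old

    def unilateral_close(self, state):
        if state in self.revoked:
            return "penalized"   # the counterparty holds the revocation key
        if state in self.unrevoked:
            return "ok"
        raise ValueError("unknown state")


view = LocalChannelView()
view.receive_counterparty_signature()
print(view.unrevoked)             # [0, 1]  (two valid states, briefly)
view.revoke_oldest()
print(view.unrevoked)             # [1]     (locked into the new state)
print(view.unilateral_close(0))   # penalized
print(view.unilateral_close(1))   # ok
```

The invariant enforced by the two `RuntimeError` checks is the one stated in the list above: at most two unrevoked states at any time, and usually just one.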

The problem of K-of-N revocation keys

Now, think about it:

  • Suppose you have a k-of-n multi-signature setup.
  • How do you generate the revocation key for each commitment transaction?
    • How many devices would your counterparty need to compromise to steal the latest revocation key?

The big question behind this is: how exactly are revocation keys generated?

The answer is: see the BOLT specification, which specifies that a shachain should be used.

The problem is that, as the name suggests, a shachain is an iterated ("chain") application of SHA-2 ("sha").

It is impossible to create a k-of-n SHA-2 function. SHA-2 is not linear at all, so this holds even when k = n.

Even if it were possible to create a k-of-n SHA-2 function through some mysterious cryptographic magic (involving virtual circuits that encode the bits 0 and 1 as "homomorphic commitments"), holding an intermediate result of the shachain (that is, the output of one SHA-2 step in the iteration) would be just as bad, because an intermediate result of the shachain iteration is itself what becomes a revocation key.

Such virtual circuits are evaluated iteratively, which lets the participants in the virtual circuit learn intermediate results; these virtual circuit schemes protect only the root input, not the intermediate results.

The problem is that the intermediate results of the shachain are the revocation keys! That the virtual circuit uses homomorphic encryption of 0s and 1s to protect the root of the revocation key chain is irrelevant, because the intermediate results of the iteration must be revealed; these results form the sequence of revocation keys, and one key in that sequence is the revocation key for the latest state.

To protect these intermediate results, you would need to unroll the entire shachain iteration and implement it as one huge virtual circuit. Even worse, every time you have to compute an intermediate result from the shachain, you have to unroll a different virtual circuit; and the number of such computations, according to the BOLT specification, is at least 2 (2 to the power of 1) and can be as many as 2 to the power of 48.

Therefore, a virtual circuit scheme involving multi-party computation may not be feasible.

(Warning: I am not a cryptographer. Please consult a cryptographer who truly understands homomorphic cryptography and multi-party computation regarding the above. Even so, I would not use multi-party computation in Shachain because I believe it would not work, and my cryptographic expertise is insufficient to find a feasible solution.)

Let's review again:

Regarding "k-of-n multi-signature", my wish is that if I own N devices, my funds are safe as long as fewer than K devices are compromised.

If the root value of the shachain iteration is known to all signers, then compromising any single one of them lets the channel counterparty steal all the channel funds outright, which breaks the expectation above.

If the root value of the shachain iteration is not known to all signers, but is instead distributed among k signers, then which device runs the repeated shachain iteration and hands the revocation key to the counterparty? As soon as the one device holding the root (or even an intermediate state) is compromised, the attacker effectively holds the latest revocation key, which allows the theft of funds.

To reiterate, the whole point of "k-of-n" is that a thief must compromise at least k devices. If even one device holds the revocation key, even temporarily, compromising that one device allows theft. Therefore, in good conscience, we cannot call this "k-of-n multi-signature", because it is not what end users expect when they say "k-of-n".

Therefore, shachain must be removed from the BOLT specification.

BOLT Modification Proposal

I propose adding a pair of feature bits called no_more_shachains, which can be either a globalfeatures or a localfeatures feature:

  • Odd bit: I will not validate the counterparty's shachain, but I will still provide the counterparty with valid shachain revocation keys.
    • This provides backward compatibility for nodes that still expect traditional shachains and have not upgraded.
    • This means I will store every revocation key individually, instead of using the shachain structure to compress storage down to O(1).
    • This allows nodes that do not use shachains to open channels with me, and also lets me bridge between them and nodes that require shachains.
  • Even bit: I will not provide valid shachain revocation keys, nor will I validate the shachain revocation keys provided by the counterparty.

The globalfeatures bit allows k-of-n multi-signature nodes to identify other nodes with which they can safely open channels. Note that "creating a BOLT8 connection" does not mean "creating a channel"; a k-of-n node can establish a BOLT8 connection, download the gossip graph, and then search globalfeatures for no_more_shachains.

A k-of-n Lightning node would then set the even bit, while a gateway node would set the odd bit. Eventually, we want all nodes to set at least the odd bit, and to use the even bit among ourselves, even for 1-of-1 Lightning nodes.

This design respects the "it's OK to be odd" philosophy: traditional nodes can still interoperate with nodes that set the odd bit but not the even bit, and nodes that set the odd bit can also interoperate with nodes that require the even bit. k-of-n nodes require the even bit, but can interoperate with nodes that set the odd bit.
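The interoperability rules above can be sketched as a compatibility check. The bit numbers below are hypothetical placeholders (the proposal does not assign any), and the function is an illustration of the rules as described here rather than of any implementation's feature negotiation.

```python
# Hypothetical bit numbers, for illustration only.
NO_MORE_SHACHAINS_EVEN = 1000   # "required": I neither provide nor validate shachains
NO_MORE_SHACHAINS_ODD = 1001    # "optional": I do not validate, but still provide


def requires_no_shachain(features: set) -> bool:
    return NO_MORE_SHACHAINS_EVEN in features


def understands_no_shachain(features: set) -> bool:
    return NO_MORE_SHACHAINS_EVEN in features or NO_MORE_SHACHAINS_ODD in features


def can_open_channel(a: set, b: set) -> bool:
    # A node that sets the even bit needs a peer that will not validate shachains.
    if requires_no_shachain(a) and not understands_no_shachain(b):
        return False
    if requires_no_shachain(b) and not understands_no_shachain(a):
        return False
    return True


legacy = set()                          # traditional node, no new bits
gateway = {NO_MORE_SHACHAINS_ODD}       # bridges both worlds
k_of_n = {NO_MORE_SHACHAINS_EVEN}       # multi-signature node

print(can_open_channel(k_of_n, legacy))    # False
print(can_open_channel(k_of_n, gateway))   # True
print(can_open_channel(gateway, legacy))   # True
print(can_open_channel(k_of_n, k_of_n))    # True
```

The output matches the roles described above: the gateway is the only kind of node that can hold channels with both traditional nodes and k-of-n nodes.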

Note that even setting only the odd bit will triple the storage needed for channels with long histories, although channel splicing allows you to prune this data.

The reason storage triples is:

  • With anchor commitment transactions, the only kinds of state change are adding and removing HTLCs.
    • Therefore, each HTLC triggers two state changes.
  • Each historical HTLC takes approximately 32 bytes for its hash.
  • Each state change requires an additional revocation key.
    • With shachain these need only a constant amount of storage, but if we abandon it, the storage needed grows linearly.
    • Each revocation key is approximately 32 bytes of data (initially a public key, but after revocation we can replace the public key with the private key).
  • Therefore, an HTLC requires 32 bytes to store its hash, 32 bytes to store the revocation key of the commitment transaction that adds the HTLC, and another 32 bytes to store the revocation key of the commitment transaction that removes it.
    • Previously, with shachain, we only needed the 32 bytes for the hash; the storage for the channel history therefore grows to 3 times its previous size.
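The arithmetic behind the factor of three can be checked with a few lines. This sketch uses only the approximate sizes given in the list above and ignores the small constant amount of storage that a shachain itself needs; the function name is made up for the example.

```python
HASH_BYTES = 32              # payment hash stored per historical HTLC
REVOCATION_KEY_BYTES = 32    # one revocation key per state change
STATE_CHANGES_PER_HTLC = 2   # one to add the HTLC, one to remove it


def history_bytes(htlc_count: int, with_shachain: bool) -> int:
    per_htlc = HASH_BYTES
    if not with_shachain:
        # Without shachain, every state change needs its own stored key.
        per_htlc += STATE_CHANGES_PER_HTLC * REVOCATION_KEY_BYTES
    return htlc_count * per_htlc


htlcs = 1_000_000
before = history_bytes(htlcs, with_shachain=True)
after = history_bytes(htlcs, with_shachain=False)
print(before, after, after // before)   # 32000000 96000000 3
```

For a channel that has carried a million HTLCs, this is roughly 32 MB with shachain versus roughly 96 MB without it.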

Not a problem: MuSig2 nonce exchange

MuSig2 is a two-round protocol:

  • In the first round, the signers exchange their nonce contributions to R.
  • In the second round, they exchange their partial signatures s.

Specifically, in Taproot channels, "exchanging partial signatures s" actually means only that one party sends its partial signature s to the other, who stores it on disk. If the other party later decides to unilaterally close the channel with this state, it generates its own s, adds the two together to produce the final signature (R, s), and then broadcasts the signature and transaction to the blockchain.

Nested MuSig2 is largely the same two-round protocol: the nested signers complete each round internally, aggregate their results, and then exchange the combined result at the level above them.

Because there are two rounds, PR995 specifies that the nonce contribution to R for the next signing session is sent along with the partial signature s of the current one. This reduces the number of communication round trips, which matters to our colleagues in Australia, given the latency involved.
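The "add the two partial signatures together" step can be illustrated with a toy two-party Schnorr signature. This is emphatically not MuSig2: it works in a tiny, insecure group (the order-1019 subgroup modulo 2039) instead of secp256k1, it aggregates keys by plain multiplication without MuSig2's key-aggregation coefficients, and it uses one nonce per signer instead of MuSig2's two. It only shows how R is combined in the first round and s in the second.

```python
import hashlib
import secrets

# Toy Schnorr group: G generates the subgroup of prime order Q in Z_P^*.
P, Q, G = 2039, 1019, 4


def challenge(r_point: int, pubkey: int, message: bytes) -> int:
    data = f"{r_point}|{pubkey}|".encode() + message
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def partial_sign(secret_key: int, secret_nonce: int, e: int) -> int:
    return (secret_nonce + e * secret_key) % Q


def verify(pubkey: int, message: bytes, r_point: int, s: int) -> bool:
    e = challenge(r_point, pubkey, message)
    return pow(G, s, P) == (r_point * pow(pubkey, e, P)) % P


# Two signers, for example the two ends of a channel (keys fixed for the demo).
x1, x2 = 123, 456
agg_pubkey = (pow(G, x1, P) * pow(G, x2, P)) % P   # naive aggregation, NOT MuSig2's

# Round 1: exchange nonce contributions to R.
r1 = secrets.randbelow(Q - 1) + 1
r2 = secrets.randbelow(Q - 1) + 1
r_point = (pow(G, r1, P) * pow(G, r2, P)) % P

# Round 2: partial signatures s.
message = b"commitment transaction"
e = challenge(r_point, agg_pubkey, message)
s1 = partial_sign(x1, r1, e)   # sent to the other party, who stores it
s2 = partial_sign(x2, r2, e)   # generated only when closing unilaterally
s = (s1 + s2) % Q              # final signature is (R, s)

print(verify(agg_pubkey, message, r_point, s))   # True
```

The second signer can complete the signature at any later time, as long as it still knows its own secret nonce r2 for this R. That dependency on remembered nonces is what the next section examines.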

k-of-n MuSig2?

While MuSig2 (and "nested MuSig2") is designed as an n-of-n signing protocol, I would point out that the "k-of-n" FROST protocol is really a verifiable Shamir secret sharing scheme that reuses much of the MuSig2 signing protocol in the actual signing process. That is, while its signing protocol resembles MuSig2, FROST does not use the MuSig/MuSig2 key aggregation function; instead it uses its own multi-party computation scheme to generate a set of "shares" that you must store, one share for each of your co-signers, plus your own key share and a collective public key.
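The threshold property that Shamir secret sharing provides can be sketched in a few lines. This shows only plain Shamir sharing over a prime field with a trusted dealer; FROST itself uses a verifiable, dealerless variant and never reconstructs the secret in one place, so treat this as background for the "k-of-n" part only. The field prime and function names are choices made for this example.

```python
import secrets

PRIME = 2**127 - 1   # a Mersenne prime, large enough for a demo secret


def split_secret(secret: int, k: int, n: int):
    """Return n shares of which any k recover the secret."""
    # Random polynomial of degree k - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):   # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, n + 1)]


def recover_secret(shares) -> int:
    """Lagrange interpolation at x = 0."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total


shares = split_secret(123456789, k=2, n=3)
print(recover_secret(shares[:2]) == 123456789)               # True
print(recover_secret([shares[0], shares[2]]) == 123456789)   # True
```

Any two of the three shares recover the secret, while a single share is just one point on a random line and reveals nothing about it. The modular inverse via `pow(den, -1, PRIME)` needs Python 3.8 or later.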

When signing, FROST basically uses something very similar to MuSig2 (so, mechanically, combining FROST with "nested MuSig2" should be possible; however, there is no guarantee that the security proof of "nested MuSig2" extends to FROST-in-MuSig2!).

When signing, you need to find k signers that are online; all of these online signers then generate their first-round contributions to R as in the MuSig2 signing scheme, exchange them with each other, and then generate the second-round partial signatures s.

The problem is that, under the PR995 proposal, every time we send the second-round partial signature s for the current signing session, we also need to send a contribution to R in preparation for the next signing session.

So, for example, suppose:

  • We have three signers, Alice, Bob, and Carol, forming a 2-of-3 Lightning node.
  • In the current signing session, Alice and Bob are online. They generate their contributions to R, combine them, and send the combined result to the counterparty.
  • However, before the next signing session, Alice is killed by a stray bullet during the robot uprising that has finally arrived.
  • Bob wakes Carol up.
  • Carol does not know the secret nonce used by Alice, and therefore cannot sign with the combined Alice+Bob nonce.

Fortunately, while the PR995 proposal specifies that the counterparty should remember the nonce contribution until the next signing session, this nonce is discarded if the BOLT8 connection is lost. Upon reconnection, a new nonce is sent in the channel_reestablish message.

Therefore, in the above scenario, Bob and Carol can shrug off the robot uprising and continue signing: they simply reconnect to the counterparty, which forces the use of a new, pre-combined Bob+Carol nonce.

Therefore, nonce input rounds are not a problem for our k-of-n multi-signatures.
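The Alice, Bob, and Carol scenario can be replayed with a small bookkeeping model. This is a sketch with hypothetical names, not an implementation of PR995: it only tracks which signers built the nonce that the counterparty currently remembers. In reality the counterparty sees just the combined nonce, so the signer set stored here exists purely for the model.

```python
class CounterpartyNonceStore:
    """What the channel counterparty remembers between signing sessions."""

    def __init__(self):
        self.next_nonce_signers = None   # who built the stored nonce, if any

    def receive_next_nonce(self, signers):
        # Sent alongside the partial signature of the current session.
        self.next_nonce_signers = frozenset(signers)

    def connection_lost(self):
        # The stored nonce is discarded when the BOLT8 connection drops.
        self.next_nonce_signers = None

    def channel_reestablish(self, signers):
        # A fresh nonce arrives with the reestablish message.
        self.receive_next_nonce(signers)


def can_complete_signing(store, online_signers) -> bool:
    # Every signer that contributed to the stored nonce must be online,
    # because each one holds a secret nonce the others do not know.
    stored = store.next_nonce_signers
    return stored is not None and stored <= frozenset(online_signers)


store = CounterpartyNonceStore()
store.receive_next_nonce({"alice", "bob"})           # pre-sent for the next session
print(can_complete_signing(store, {"bob", "carol"}))  # False: Alice is gone

store.connection_lost()                               # Bob and Carol reconnect
store.channel_reestablish({"bob", "carol"})
print(can_complete_signing(store, {"bob", "carol"}))  # True
```

The reconnect costs one extra round trip, but it is only needed when the set of online signers changes between sessions.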
