Understanding Bitcoin Miniscript (3): Parsing and Analysis

This article is machine translated
Show original

Author: benma

Source: https://shiftcrypto.ch/blog/understanding-bitcoin-miniscript-part-3/

This article is the third in the "Understanding Bitcoin Script" series. See the previous article here .

Understanding Bitcoin Miniscript - Part III

In the previous article in this series, we introduced what Miniscript is and how it maps to Bitcoin Script.

To understand how Miniscript works in detail, it's helpful to look at an example of its implementation -- including how to analyze code for correctness, how to create receiving addresses, and how to spend funds.

So, let's understand and write a Go language implementation.

We will always refer to https://bitcoin.sipa.be/miniscript/ as it contains details and features of all Miniscript pieces.

The top and bottom of each chapter include a link to the Go live runtime, where you can run the code, examine the results, and tinker with it.

In short, we will convert a piece of Miniscript into an abstract syntax tree (AST), then perform a series of tree transformations and tree traversals to perform correctness analysis, create corresponding Bitcoin Script, create payment addresses, etc. .

Disclaimer: The implementations below have not been reviewed or tested. Please do not use in production environment. It can only be used for teaching purposes.

The first step: convert into an abstract syntax tree

Click here to run the code in the Go live environment

Miniscript expressions are simple and easily converted into an AST . Unlike math/algebraic expressions, Miniscript expressions do not contain any infix operators, grouping parentheses, and parentheses are only used to enclose fragment parameters. Therefore, it is expressive and easy to parse.

Let's define the required AST:

 // AST 是用来表示一个Miniscript 表达式的抽象语法树。type AST struct { wrappers string identifier string args []*AST}

Identifiers like or_b will be stored in identifier field. If there are any wrappers, such as ascd:X , the wrappers are detached and stored in the wrappers field. Finally, the fragment's arguments will be stored recursively in args .

To turn an expression into a tree, we need the old and reliable stack data structure:

 type stack struct { elements []*AST}func (s *stack) push(element *AST) { s.elements = append(s.elements, element)}func (s *stack) pop() *AST { if len(s.elements) == 0 { return nil } top := s.elements[len(s.elements)-1] s.elements = s.elements[:len(s.elements)-1] return top}func (s *stack) top() *AST { if len(s.elements) == 0 { return nil } return s.elements[len(s.elements)-1]}func (s *stack) size() int { return len(s.elements)}

In order to convert an expression into a tree using a stack, we first split an expression into parts separated by parentheses and commas. Unfortunately, there is no threading function in the Go standard library, so I asked ChatGPT to write me a piece of code, and it worked:

 / - 使用ChatGPT 编写。// splitString 函数使用separator 作为分割元素单元,将一个字数串// 基于多种separator 分割成字符串切片。它也将// 把输出切片中的空元素移除。func splitString(s string, isSeparator func(c rune) bool) []string { // Create a slice to hold the substrings substrs := make([]string, 0) // Set the initial index to zero i := 0 // Iterate over the characters in the string for i < len(s) { // Find the index of the first separator in the string j := strings.IndexFunc(s[i:], isSeparator) if j == -1 { // If no separator was found, append the remaining substring and return substrs = append(substrs, s[i:]) return substrs } j += i // If a separator was found, append the substring before it if j > i { substrs = append(substrs, s[i:j]) } // Append the separator as a separate element substrs = append(substrs, s[j:j+1]) i = j + 1 } return substrs}

A quick unit test confirms that this code works:

 func TestSplitString(t *testing.T) { separators := func(c rune) bool { return c == '(' || c == ')' || c == ',' } require.Equal(t, []string{}, splitString("", separators)) require.Equal(t, []string{"0"}, splitString("0", separators)) require.Equal(t, []string{"0", ")", "(", "1", "("}, splitString("0)(1(", separators)) require.Equal(t, []string{"or_b", "(", "pk", "(", "key_1", ")", ",", "s:pk", "(", "key_2", ")", ")"}, splitString("or_b(pk(key_1),s:pk(key_2))", separators))}

We're ready to iterate over the fragments and parentheses/commas and build an expression tree.

Whenever we see an identifier (anything other than parentheses and commas), we push the identifier onto the stack and it will become the parent of all its child parameters. Whenever we encounter a comma or a back parenthesis, we know this indicates the end of the argument, so we pop the argument off the stack and add it to the parent node. Some invalid sequences are explicitly excluded, such as "()" and "(", which would not be valid miniscripts.

 func createAST(miniscript string) (*AST, error) { tokens := splitString(miniscript, func(c rune) bool { return c == '(' || c == ')' || c == ',' }) if len(tokens) > 0 { first, last := tokens[0], tokens[len(tokens)-1] if first == "(" || first == ")" || first == "," || last == "(" || last == "," { return nil, errors.New("invalid first or last character") } } // Build abstract syntax tree. var stack stack for i, token := range tokens { switch token { case "(": // Exclude invalid sequences, which cannot appear in valid miniscripts: "((", ")(", ",(". if i > 0 && (tokens[i-1] == "(" || tokens[i-1] == ")" || tokens[i-1] == ",") { return nil, fmt.Errorf("the sequence %s%s is invalid", tokens[i-1], token) } case ",", ")": // End of a function argument - take the argument and add it to the parent's argument // list. If there is no parent, the expression is unbalanced, eg `f(X))``. // // Exclude invalid sequences, which cannot appear in valid miniscripts: "(,", "()", ",,", ",)". if i > 0 && (tokens[i-1] == "(" || tokens[i-1] == ",") { return nil, fmt.Errorf("the sequence %s%s is invalid", tokens[i-1], token) } arg := stack.pop() parent := stack.top() if arg == nil || parent == nil { return nil, errors.New("unbalanced") } parent.args = append(parent.args, arg) default: if i > 0 && tokens[i-1] == ")" { return nil, fmt.Errorf("the sequence %s%s is invalid", tokens[i-1], token) } // Split wrappers from identifier if they exist, eg in "dv:older", "dv" are wrappers // and "older" is the identifier. wrappers, identifier, found := strings.Cut(token, ":") if !found { // No colon => Cut returns `identifier, ""`, not `"", identifier"`. wrappers, identifier = identifier, wrappers } else if wrappers == "" { return nil, fmt.Errorf("no wrappers found before colon before identifier: %s", identifier) } else if identifier == "" { return nil, fmt.Errorf("no identifier found after colon after wrappers: %s", wrappers) } stack.push(&AST{wrappers: wrappers, identifier: identifier}) } } if stack.size() != 1 { return nil, errors.New("unbalanced") } return stack.top(), nil}Let's also add a function to draw the tree, so we can visualize it more easily:func (a *AST) drawTree(w io.Writer, indent string) { if a.wrappers != "" { fmt.Fprintf(w, "%s:", a.wrappers) } fmt.Fprint(w, a.identifier) fmt.Fprintln(w) for i, arg := range a.args { mark := "" delim := "" if i == len(a.args)-1 { mark = "└──" } else { mark = "├──" delim = "|" } fmt.Fprintf(w, "%s%s", indent, mark) arg.drawTree(w, indent+delim+strings.Repeat(" ", len([]rune(arg.identifier))+len([]rune(mark))-1-len(delim))) }}func (a *AST) DrawTree() string { var b strings.Builder a.drawTree(&b, "") return b.String()}

Let's try it with a complex expression:

 func main() { node, err := createAST("andor(pk(key_remote),or_i(and_v(v:pkh(key_local),hash160(H)),older(1008)),pk(key_revocation))") if err != nil { panic(err) } fmt.Println(node.DrawTree())}

success! The output is this:

 andor├──pk| └──key_remote├──or_i| ├──and_v| | ├──v:pkh| | | └──key_local| | └──hash160| | └──H| └──older| └──1008└──pk └──key_revocation

Of course, the parser hasn't checked it yet, so expressions like unknownFragment(foo,bar) can also be converted to AST:

 unknownFragment├──foo└──bar

Click here to run the code in the live Go environment

Step Two: Check Fragment and Parameter Numbers

Click here to run the code in the live Go environment

As the first step in multiple tree traversals, we do a first easy check: is every fragment identifier on the tree valid? Each has the correct parameter number?

According to the specification , all fragments are arranged as follows:

 const ( // 所有的片段标识符f_0 = "0" // 0 f_1 = "1" // 1 f_pk_k = "pk_k" // pk_k(key) f_pk_h = "pk_h" // pk_h(key) f_pk = "pk" // pk(key) = c:pk_k(key) f_pkh = "pkh" // pkh(key) = c:pk_h(key) f_sha256 = "sha256" // sha256(h) f_ripemd160 = "ripemd160" // ripemd160(h) f_hash256 = "hash256" // hash256(h) f_hash160 = "hash160" // hash160(h) f_older = "older" // older(n) f_after = "after" // after(n) f_andor = "andor" // andor(X,Y,Z) f_and_v = "and_v" // and_v(X,Y) f_and_b = "and_b" // and_b(X,Y) f_and_n = "and_n" // and_n(X,Y) = andor(X,Y,0) f_or_b = "or_b" // or_b(X,Z) f_or_c = "or_c" // or_c(X,Z) f_or_d = "or_d" // or_d(X,Z) f_or_i = "or_i" // or_i(X,Z) f_thresh = "thresh" // thresh(k,X1,...,Xn) f_multi = "multi" // multi(k,key1,...,keyn))

The first arguments of older , after , thresh and multi are all numbers. When we need to parse it, check if it's a valid number, we'll convert it to a number and store it in our AST for later use. Therefore, we add a new field to the AST:

 // AST is the abstract syntax tree representing a Miniscript expression.type AST struct { wrappers string identifier string // 在标识符预计是个数字时解析出来的整数。 // 比如older/after/multi/thresh 的第一个参数。否则不使用num uint64 args []*AST}

We also need a function that recursively walks the tree and applies a function to each Miniscript expression/subexpression. This transformation function can modify a node, or directly replace it with a new node, which is useful for the final stage of parsing:

 func (a *AST) apply(f func(*AST) (*AST, error)) (*AST, error) { for i, arg := range a.args { // 我们并不递归进入不是Miniscript 表达式的参数: // key/hash 标量以及older/after/multi/thresh 的数值参数。 switch a.identifier { case f_pk_k, f_pk_h, f_pk, f_pkh, f_sha256, f_hash256, f_ripemd160, f_hash160, f_older, f_after, f_multi: // 这些函数的变量都不是Miniscript 表达式,只是// 变量(或者说具体的指定)或者数字。 continue case f_thresh: // 第一个参数是一个数字,其它的参数是子表达式, // 就是我们想要遍历的东西,所以我们只跳过第一个参数。 if i == 0 { continue } } new, err := arg.apply(f) if err != nil { return nil, err } a.args[i] = new } return f(a)}

case:

 node, _ := createAST("andor(pk(key_remote),or_i(and_v(v:pkh(key_local),hash160(H)),older(1008)),pk(key_revocation))")node.apply(func(node *AST) (*AST, error) { fmt.Println("Visiting node:", node.identifier) return node, nil })

output:

 Visiting node: pkVisiting node: pkhVisiting node: hash160Visiting node: and_vVisiting node: olderVisiting node: or_iVisiting node: pkVisiting node: andor

Now, we add a Parse function that creates the AST and applies transformation functions successively, the first of which is the fragment and parameter inspector:

 func Parse(miniscript string) (*AST, error) { node, err := createAST(miniscript) if err != nil { return nil, err } for _, transform := range []func(*AST) (*AST, error){ argCheck, // More stages to come } { node, err = node.apply(transform) if err != nil { return nil, err } } return node, nil}

The argCheck function will be used for each node of the tree, and we can simply enumerate all valid fragment identifiers to perform this basic check:

 // argCheck 会检查每一个标识符都是一种已知的Miniscript 标识符,并且// 具有正确数量的参数,例如`andor(X,Y,Z)` 必须有三个参数,等等。func argCheck(node *AST) (*AST, error) { // Helper function to check that this node has a specific number of arguments. expectArgs := func(num int) error { if len(node.args) != num { return fmt.Errorf("%s expects %d arguments, got %d", node.identifier, num, len(node.args)) } return nil } switch node.identifier { case f_0, f_1: if err := expectArgs(0); err != nil { return nil, err } case f_pk_k, f_pk_h, f_pk, f_pkh, f_sha256, f_ripemd160, f_hash256, f_hash160: if err := expectArgs(1); err != nil { return nil, err } if len(node.args[0].args) > 0 { return nil, fmt.Errorf("argument of %s must not contain subexpressions", node.identifier) } case f_older, f_after: if err := expectArgs(1); err != nil { return nil, err } _n := node.args[0] if len(_n.args) > 0 { return nil, fmt.Errorf("argument of %s must not contain subexpressions", node.identifier) } n, err := strconv.ParseUint(_n.identifier, 10, 64) if err != nil { return nil, fmt.Errorf( "%s(k) => k must be an unsigned integer, but got: %s", node.identifier, _n.identifier) } _n.num = n if n < 1 || n >= (1<<31) { return nil, fmt.Errorf("%s(n) -> n must 1 ≤ n < 2^31, but got: %s", node.identifier, _n.identifier) } case f_andor: if err := expectArgs(3); err != nil { return nil, err } case f_and_v, f_and_b, f_and_n, f_or_b, f_or_c, f_or_d, f_or_i: if err := expectArgs(2); err != nil { return nil, err } case f_thresh, f_multi: if len(node.args) < 2 { return nil, fmt.Errorf("%s must have at least two arguments", node.identifier) } _k := node.args[0] if len(_k.args) > 0 { return nil, fmt.Errorf("argument of %s must not contain subexpressions", node.identifier) } k, err := strconv.ParseUint(_k.identifier, 10, 64) if err != nil { return nil, fmt.Errorf( "%s(k, ...) => k must be an integer, but got: %s", node.identifier, _k.identifier) } _k.num = k numSubs := len(node.args) - 1 if k < 1 || k > uint64(numSubs) { return nil, fmt.Errorf( "%s(k) -> k must 1 ≤ k ≤ n, but got: %s", node.identifier, _k.identifier) } if node.identifier == f_multi { // 一个multisig 可以拥有的最大公钥数量。 const multisigMaxKeys = 20 if numSubs > multisigMaxKeys { return nil, fmt.Errorf("number of multisig keys cannot exceed %d", multisigMaxKeys) } // Multisig 的密钥是一种变量,无法拥有子表达式。 for _, arg := range node.args { if len(arg.args) > 0 { return nil, fmt.Errorf("arguments of %s must not contain subexpressions", node.identifier) } } } default: return nil, fmt.Errorf("unrecognized identifier: %s", node.identifier) } return node, nil}

With these checks, we have been able to rule out most of the invalid Miniscript. Let's look at some cases:

 func main() { for _, expr := range []string{ "invalid", "pk(key1,tooManyArgs)", "pk(key1(0))", "and_v(0)", "after(notANumber)", "after(-1)", "multi(0,k1)", "multi(2,k1)", "multi(1,k1,k2,k3,k4,k5,k6,k7,k8,k9,k10,k11,k12,k13,k14,k15,k16,k17,k18,k19,k20,k21)", } { _, err := Parse(expr) fmt.Println(expr, " -- ", err) }}

output:

 invalid -- unrecognized identifier: invalidpk(key1,tooManyArgs) -- pk expects 1 arguments, got 2pk(key1(0)) -- argument of pk must not contain subexpressionsand_v(0) -- and_v expects 2 arguments, got 1after(notANumber) -- after(k) => k must be an unsigned integer, but got: notANumberafter(-1) -- after(k) => k must be an unsigned integer, but got: -1multi(0,k1) -- multi(k) -> k must 1 ≤ k ≤ n, but got: 0multi(2,k1) -- multi(k) -> k must 1 ≤ k ≤ n, but got: 2multi(1,k1,k2,k3,k4,k5,k6,k7,k8,k9,k10,k11,k12,k13,k14,k15,k16,k17,k18,k19,k20,k21) -- number of multisig keys cannot exceed 20

Click here to run the code in the live Go environment

Step 3: Expand the wrapper

Click here to run the code in the live Go environment

Each fragment can be wrapped by a wrapper, which is indicated by a colon ":". Example from here .

dv:older(144) indicates that the d: wrapper is applied to the v: wrapper, and the v: wrapper is used to wrap the older segment whose parameter is 144 blocks.

In the next stage of the parser, we want to also operate on wrappers, because they behave like normal fragments: they can be mapped to Bitcoin Script, have their own correctness rules, etc. In a word, dv:older(144) is just syntactic sugar for d(v(older(144))) .

In this case, we want to convert the AST from this:

 dv:older└──144

becomes like this:

 d└──v └──older └──144

To perform this conversion, we need to add this function to the conversion list. Note that we iterate over the letters in the wrapper in reverse order, since they are used from right to left.

 // expandWrappers 应用在封装器(引号以前的字母)上,例如`ascd:X` =>// `a(s(c(d(X))))`.func expandWrappers(node *AST) (*AST, error) { const allWrappers = "asctdvjnlu" wrappers := []rune(node.wrappers) node.wrappers = "" for i := len(wrappers) - 1; i >= 0; i-- { wrapper := wrappers[i] if !strings.ContainsRune(allWrappers, wrapper) { return nil, fmt.Errorf("unknown wrapper: %s", string(wrapper)) } node = &AST{identifier: string(wrapper), args: []*AST{node}} } return node, nil}

Click here to run the code in the live Go environment

Step 4: Unwrap the icing

Click here to run the code in the live Go environment

Miniscript defines 6 syntactic sugars. If a Miniscript snippet contains expressions for the coordinates of the following equations, then these expressions can be replaced by the expressions to the right of the equals sign. To save the effort of dealing with these 6 fragments at a later stage, we add a desugaring transformation function instead of these expressions.

The substitution relationship is as follows:

 pk(key) = c:pk_k(key)pkh(key) = c:pk_h(key)and_n(X,Y) = andor(X,Y,0)t:X = and_v(X,1)l:X = or_i(0,X)u:X = or_i(X,0)

Now, we can just augment this list with the identifiers we defined initially in the wrapper fragment:

 const ( // [...] f_wrap_a = "a" // a:X f_wrap_s = "s" // s:X f_wrap_c = "c" // c:X f_wrap_d = "d" // d:X f_wrap_v = "v" // v:X f_wrap_j = "j" // j:X f_wrap_n = "n" // n:X f_wrap_t = "t" // t:X = and_v(X,1) f_wrap_l = "l" // l:X = or_i(0,X) f_wrap_u = "u" // u:X = or_i(X,0)))

The conversion function will become like this:

 // desugar 使用最终的形式替换了语法糖func desugar(node *AST) (*AST, error) { switch node.identifier { case f_pk: // pk(key) = c:pk_k(key) return &AST{ identifier: f_wrap_c, args: []*AST{ { identifier: f_pk_k, args: node.args, }, }, }, nil case f_pkh: // pkh(key) = c:pk_h(key) return &AST{ identifier: f_wrap_c, args: []*AST{ { identifier: f_pk_h, args: node.args, }, }, }, nil case f_and_n: // and_n(X,Y) = andor(X,Y,0) return &AST{ identifier: f_andor, args: []*AST{ node.args[0], node.args[1], {identifier: f_0}, }, }, nil case f_wrap_t: // t:X = and_v(X,1) return &AST{ identifier: f_and_v, args: []*AST{ node.args[0], {identifier: f_1}, }, }, nil case f_wrap_l: // l:X = or_i(0,X) return &AST{ identifier: f_or_i, args: []*AST{ {identifier: f_0}, node.args[0], }, }, nil case f_wrap_u: // u:X = or_i(X,0) return &AST{ identifier: f_or_i, args: []*AST{ node.args[0], {identifier: f_0}, }, }, nil } return node, nil}

We try all of these methods, and apply visual inspection:

 func main() { for _, expr := range []string{ "pk(key)", "pkh(key)", "and_n(pk(key),sha256(H))", "tv:pk(key)", "l:pk(key)", "u:pk(key)", } { node, err := Parse(expr) if err != nil { panic(err) } fmt.Printf("Tree for \"%v\"\n", expr) fmt.Println(node.DrawTree()) }}

As you can see from the output below, the desugaring function worked:

 Tree for "pk(key)"c└──pk_k └──keyTree for "pkh(key)"c└──pk_h └──keyTree for "and_n(pk(key),sha256(H))"andor├──c| └──pk_k| └──key├──sha256| └──H└──0Tree for "tv:pk(key)"and_v├──v| └──c| └──pk_k| └──key└──1Tree for "l:pk(key)"or_i├──0└──c └──pk_k └──keyTree for "u:pk(key)"or_i├──c| └──pk_k| └──key└──0

Click here to run the code in the live Go environment

Step 5: Type checking

Click here to run the code in the live Go environment

Not all pieces can be combined with each other: some combinations do not produce valid Bitcoin script and valid witness data.

However, because Miniscript expressions and fragments are sufficiently structured and hierarchical, it is easy to statically analyze whether a piece of Miniscript expression is valid under all conditions.

For example, or_b(pk(key1),pk(key2)) and or_b(v:pk(key1),v:pk(key2)) are not valid combinations, but or_b(pk(key1),s:pk(key2)) is valid.

According to the Miniscript specification , each fragment may be one of the four basic types B , V , K , and W ; each fragment may have additional type properties ( z , o , n , d , and u ).

Component fragments (fragments that cannot contain any subexpressions) have a fixed base type and fixed type properties. For example, the hash function fragment sha256(h) will be converted into Bitcoin script SIZE <32> EQUALVERIFY SHA256 <h> EQUAL , which can be satisfied by using the witness data <32 byte preimage> (the value in it satisfies sha256(preimage)=h ), which is of type Bondu , meaning:

  • B : On success, push a non-zero value onto the stack; on failure, push an exact 0 value. Consume the top element of the stack (if any).
  • o : consumes a combat element (in this case that preimage)
  • n : non-zero attribute - cannot be satisfied with 0. The correct preimage for sha256(h) must be 32 bytes long, so it cannot be 0.
  • d : Can be avoided unconditionally. In the sha256() example, any data that is 32 bytes but not a true preimage is invalid, which can always be constructed. Note that non-32-byte values are not valid escapes, because they cause script execution to terminate on EQUALVERIFY instead of continuing.
  • u : When satisfied, push 1 to the stack.

The basic types and type properties are carefully defined to ensure the correctness of the assembled fragments. These properties can be assigned to each fragment based on inference from scripts and witness data. Similarly, for fragments that reside in subexpressions, such as and_b(X,Y) , you can analyze what types X and Y must satisfy, and what derived types and attributes and_b(X,Y) itself must have. Fortunately, the authors of Miniscript have already done this part of the work, and have documented the correctness table in the specification .

The top-level fragment must have type B , otherwise the fragment is invalid.

We augment our AST with basic types and type properties:

 type basicType stringconst ( typeB basicType = "B" typeV basicType = "V" typeK basicType = "K" typeW basicType = "W")type properties struct { // Basic type properties z, o, n, d, u bool}func (p properties) String() string { s := strings.Builder{} if pz { s.WriteRune('z') } if po { s.WriteRune('o') } if pn { s.WriteRune('n') } if pd { s.WriteRune('d') } if pu { s.WriteRune('u') } return s.String()}// AST is the abstract syntax tree representing a Miniscript expression.type AST struct { basicType basicType props properties wrappers string identifier string // Parsed integer for when identifer is a expected to be a number, ie the first argument of // older/after/multi/thresh. Otherwise unused. num uint64 args []*AST}// typeRepr returns the basic type (B, V, K or W) followed by all type properties.func (a *AST) typeRepr() string { return fmt.Sprintf("%s%s", a.basicType, a.props)}

Then, we add another function to traverse the entire tree. This function will check the type requirements of the subexpression and set the type and type properties according to the correctness table of the specification.

Since this function is quite long, we'll just show a shortened version of it that handles a few pieces, just to demonstrate how it works. Fragment types are related to both components and parameters. It directly encodes type rules according to the tables in the specification. For example, for s:X to be valid, X must be of type Bo ; whereas the entire fragment will have type W and get d and u properties of X

You can see and run a full version that handles each fragment in an online environment .

 // expectBasicType is a helper function to check that this node has a specific type.func (a *AST) expectBasicType(typ basicType) error { if a.basicType != typ { return fmt.Errorf("expression `%s` expected to have type %s, but is type %s", a.identifier, typ, a.basicType) } return nil}func typeCheck(node *AST) (*AST, error) { switch node.identifier { case f_0: node.basicType = typeB node.props.z = true node.props.u = true node.props.d = true // [...] case f_pk_k: node.basicType = typeK node.props.o = true node.props.n = true node.props.d = true node.props.u = true // [...] case f_or_d: _x, _z := node.args[0], node.args[1] if err := _x.expectBasicType(typeB); err != nil { return nil, err } if !_x.props.d || !_x.props.u { return nil, fmt.Errorf( "wrong properties on `%s`, the first argument of `%s`", _x.identifier, node.identifier) } if err := _z.expectBasicType(typeB); err != nil { return nil, err } node.basicType = typeB node.props.z = _x.props.z && _z.props.z node.props.o = _x.props.o && _z.props.z node.props.d = _z.props.d node.props.u = _z.props.u // [...] case f_wrap_s: _x := node.args[0] if err := _x.expectBasicType(typeB); err != nil { return nil, err } if !_x.props.o { return nil, fmt.Errorf( "wrong properties on `%s`, the first argument of `%s`", _x.identifier, node.identifier) } node.props.d = _x.props.d node.props.u = _x.props.u // [...] } return node, nil}

Now that we've deduced all types and type properties, we also need to add the check that the top-level expression must have type B to the final check:

 func Parse(miniscript string) (*AST, error) { node, err := createAST(miniscript) if err != nil { return nil, err } for _, transform := range []func(*AST) (*AST, error){ argCheck, expandWrappers, desugar, typeCheck, // More stages to come } { node, err = node.apply(transform) if err != nil { return nil, err } } // Top-level expression must be of type "B". if err := node.expectBasicType(typeB); err != nil { return nil, err } return node, nil}

Let's check with valid and invalid Miniscript snippets:

 func main() { expr := "or_b(pk(key1),s:pk(key2))" node, err := Parse(expr) if err == nil { fmt.Println("miniscript valid:", expr) fmt.Println(node.DrawTree()) } for _, expr := range []string{"pk_k(key)", "or_b(pk(key1),pk(key2))"} { _, err = Parse(expr) fmt.Println("miniscript invalid:", expr, "-", err) }}

success! The output is this:

 miniscript valid: or_b(pk(key1),s:pk(key2))or_b [Bdu]├──c [Bondu]| └──pk_k [Kondu]| └──key1└──s [Wdu] └──c [Bondu] └──pk_k [Kondu] └──key2miniscript invalid: pk_k(key) - expression `pk_k` expected to have type B, but is type Kminiscript invalid: or_b(pk(key1),pk(key2)) - expression `c` expected to have type W, but is type B

(We modified draw() function to display its type next to each fragment.)

Click here to run the code in the live Go environment

Step 6: Generate Bitcoin Script

We haven't yet implemented a check that rejects all invalid Miniscript fragments, but for now, it's time to experiment with using it to generate Bitcoin Script.

The conversion tables in the specification define how Miniscript fragments map to Bitcoin Script. For example, and_b(X, Y) maps to [X] [Y] BOOLAND , etc.

We'll start by making a function that creates a string representation of the script that can be read directly, like that conversion table. This makes it easier for us to prototype and eliminate bugs because you can easily check the output. The actual script is a string of bytes, which we'll implement later.

We add this function to map each fragment into a script according to the conversion table:

 func scriptStr(node *AST) string { switch node.identifier { case f_0, f_1: return node.identifier case f_pk_k: return fmt.Sprintf("<%s>", node.args[0].identifier) case f_pk_h: return fmt.Sprintf("DUP HASH160 <HASH160(%s)> EQUALVERIFY", node.args[0].identifier) case f_older: return fmt.Sprintf("<%s> CHECKSEQUENCEVERIFY", node.args[0].identifier) case f_after: return fmt.Sprintf("<%s> CHECKLOCKTIMEVERIFY", node.args[0].identifier) case f_sha256, f_hash256, f_ripemd160, f_hash160: return fmt.Sprintf( "SIZE <32> EQUALVERIFY %s <%s> EQUAL", strings.ToUpper(node.identifier), node.args[0].identifier) case f_andor: return fmt.Sprintf("%s NOTIF %s ELSE %s ENDIF", scriptStr(node.args[0]), scriptStr(node.args[2]), scriptStr(node.args[1]), ) case f_and_v: return fmt.Sprintf("%s %s", scriptStr(node.args[0]), scriptStr(node.args[1])) case f_and_b: return fmt.Sprintf("%s %s BOOLAND", scriptStr(node.args[0]), scriptStr(node.args[1]), ) case f_or_b: return fmt.Sprintf("%s %s BOOLOR", scriptStr(node.args[0]), scriptStr(node.args[1]), ) case f_or_c: return fmt.Sprintf("%s NOTIF %s ENDIF", scriptStr(node.args[0]), scriptStr(node.args[1]), ) case f_or_d: return fmt.Sprintf("%s IFDUP NOTIF %s ENDIF", scriptStr(node.args[0]), scriptStr(node.args[1]), ) case f_or_i: return fmt.Sprintf("IF %s ELSE %s ENDIF", scriptStr(node.args[0]), scriptStr(node.args[1]), ) case f_thresh: s := []string{} for i := 1; i < len(node.args); i++ { s = append(s, scriptStr(node.args[i])) if i > 1 { s = append(s, "ADD") } } s = append(s, node.args[0].identifier) s = append(s, "EQUAL") return strings.Join(s, " ") case f_multi: s := []string{node.args[0].identifier} for _, arg := range node.args[1:] { s = append(s, fmt.Sprintf("<%s>", arg.identifier)) } s = append(s, fmt.Sprint(len(node.args)-1)) s = append(s, "CHECKMULTISIG") return strings.Join(s, " ") case f_wrap_a: return fmt.Sprintf("TOALTSTACK %s FROMALTSTACK", scriptStr(node.args[0])) case f_wrap_s: return fmt.Sprintf("SWAP %s", scriptStr(node.args[0])) case f_wrap_c: return fmt.Sprintf("%s CHECKSIG", scriptStr(node.args[0])) case f_wrap_d: return fmt.Sprintf("DUP IF %s ENDIF", scriptStr(node.args[0])) case f_wrap_v: return fmt.Sprintf("%s VERIFY", scriptStr(node.args[0])) case f_wrap_j: return fmt.Sprintf("SIZE 0NOTEQUAL IF %s ENDIF", scriptStr(node.args[0])) case f_wrap_n: return fmt.Sprintf("%s 0NOTEQUAL", scriptStr(node.args[0])) default: return "<unknown>" }}

try:

Run in an online environment

 func main() { node, err := Parse("or_d(pk(pubkey1),and_v(v:pk(pubkey2),older(52560)))") if err != nil { panic(err) } fmt.Println(scriptStr(node))}

output:

 <pubkey1> CHECKSIG IFDUP NOTIF <pubkey2> CHECKSIG VERIFY <52560> CHECKSEQUENCEVERIFY ENDIF

This is correct, but there is still one thing to optimize. v:X wrapper maps to [X] VERIFY . The operators EQUALVERIFY , CHECKSIGVERIFY , and CHECKMULTISIGVERIFY are shorthand for EQUAL VERIFY , CHECKSIG VERIFY , and CHECKMULTISIG VERIFY respectively, so in the above script, CHECKSIG VERIFY should be abbreviated to CHECKSIGVERITY to save a byte in the script.

If in v:X , the last operator of [X] is EQUAL / CHECKSIG / CHECKMULTISIG , it can be replaced with the VERITY version.

Since X can be any expression, we need another tree traversal to determine if the last operator of each fragment is one of the three above.

We add this attribute to the attribute structure:

 type properties struct { // Basic type properties z, o, n, d, u bool // Check if the rightmost script byte produced by this node is OP_EQUAL, OP_CHECKSIG or // OP_CHECKMULTISIG. // // If so, it can be be converted into the VERIFY version if an ancestor is the verify wrapper // `v`, ie OP_EQUALVERIFY, OP_CHECKSIGVERIFY and OP_CHECKMULTISIGVERIFY instead of using two // opcodes, eg `OP_EQUAL OP_VERIFY`. canCollapseVerify bool}

Also, this function sets this field for each fragment, which we need to add to the list of conversion functions:

 func canCollapseVerify(node *AST) (*AST, error) { switch node.identifier { case f_sha256, f_ripemd160, f_hash256, f_hash160, f_thresh, f_multi, f_wrap_c: node.props.canCollapseVerify = true case f_and_v: node.props.canCollapseVerify = node.args[1].props.canCollapseVerify case f_wrap_s: node.props.canCollapseVerify = node.args[0].props.canCollapseVerify } return node, nil}

and_v fragments and s- wrappers are the only composable fragments that can end with a subexpression: and_v(X,Y) => [X] [Y] and s:X => SWAP [X] , so, They get properties directly from child nodes. Hash fragments and thresh / multi / c in scripts will all end up with EQUAL / CHECKSIG / CHECKMULTISIG , e.g. c:X => [X] CHECKSIG . These are candidates for VERIFY versions that will be factored into these operators.

We can then modify our scriptStr function to use the VERIFY version of the operator when allowed. For brevity, we only show two cases below. You can see and run the full version at this site .

 // collapseVerify is true if the `v` wrapper (VERIFY wrapper) is an ancestor of the node. If so, the// two opcodes `OP_CHECKSIG VERIFY` can be collapsed into one opcode `OP_CHECKSIGVERIFY` (same for// OP_EQUAL and OP_CHECKMULTISIG).func scriptStr(node *AST, collapseVerify bool) string { switch node.identifier { // [...] case f_wrap_c: opVerify := "CHECKSIG" if node.props.canCollapseVerify && collapseVerify { opVerify = "CHECKSIGVERIFY" } return fmt.Sprintf("%s %s", scriptStr(node.args[0], collapseVerify), opVerify, ) // [...] case f_wrap_v: s := scriptStr(node.args[0], true) if !node.args[0].props.canCollapseVerify { s += " VERIFY" } return s}

Running the program with the modified function, you get this output:

 <pubkey1> CHECKSIG IFDUP NOTIF <pubkey2> CHECKSIGVERIFY <52560> CHECKSEQUENCEVERIFY ENDIF

Successfully reduced CHECKSIG VERITY to CHECKSIGVERIFY .

Step 7: Generate payment address

Click here to run the code in the live Go environment

In the previous inoculation, we created a readable representation for Bitcoin Script. In order to generate a P2WSH address, we need to develop the actual script as a sequence of bytes, which is then encoded into an address.

To do this, we first replace the public key and hash variable with the real public key and hash. We add a new field to the AST: value .

 type AST struct { // [...] identifier string // For key arguments, this will be the 33 bytes compressed pubkey. // For hash arguments, this will be the 32 bytes (sha256, hash256) or 20 bytes (ripemd160, hash160) hash. value []byte args []*AST}

Now, we can add a new function ApplyVars , which will replace all variables in the Miniscript expression with real values. The caller can provide a callback function to provide these values.

Miniscript also specifies that the public key is not duplicated (which simplifies the analysis of the script), so we check for duplicates.

 // ApplyVars replaces key and hash values in the miniscript.//// The callback should return `nil, nil` if the variable is unknown. In this case, the identifier// itself will be parsed as the value (hex-encoded pubkey, hex-encoded hash value).func (a *AST) ApplyVars(lookupVar func(identifier string) ([]byte, error)) error { // Set of all pubkeys to check for duplicates allPubKeys := map[string]struct{}{} _, err := a.apply(func(node *AST) (*AST, error) { switch node.identifier { case f_pk_k, f_pk_h, f_multi: var keyArgs []*AST if node.identifier == f_multi { keyArgs = node.args[1:] } else { keyArgs = node.args[:1] } for _, arg := range keyArgs { key, err := lookupVar(arg.identifier) if err != nil { return nil, err } if key == nil { // If the key was not a variable, assume it's the key value directly encoded as // hex. key, err = hex.DecodeString(arg.identifier) if err != nil { return nil, err } } if len(key) != pubKeyLen { return nil, fmt.Errorf("pubkey argument of %s expected to be of size %d, but got %d", node.identifier, pubKeyLen, len(key)) } pubKeyHex := hex.EncodeToString(key) if _, ok := allPubKeys[pubKeyHex]; ok { return nil, fmt.Errorf( "duplicate key found at %s (key=%s, arg identifier=%s)", node.identifier, pubKeyHex, arg.identifier) } allPubKeys[pubKeyHex] = struct{}{} arg.value = key } case f_sha256, f_hash256, f_ripemd160, f_hash160: arg := node.args[0] hashLen := map[string]int{ f_sha256: 32, f_hash256: 32, f_ripemd160: 20, f_hash160: 20, }[node.identifier] hashValue, err := lookupVar(arg.identifier) if err != nil { return nil, err } if hashValue == nil { // If the hash value was not a variable, assume it's the hash value directly encoded // as hex. hashValue, err = hex.DecodeString(node.args[0].identifier) if err != nil { return nil, err } } if len(hashValue) != hashLen { return nil, fmt.Errorf("%s len must be %d, got %d", node.identifier, hashLen, len(hashValue)) } arg.value = hashValue } return node, nil }) return err}

See it in action:

 func main() { node, err := Parse("or_d(pk(pubkey1),and_v(v:pk(pubkey2),older(52560)))") if err != nil { panic(err) } unhex := func(s string) []byte { b, _ := hex.DecodeString(s) return b } // Two arbitrary pubkeys. _, pubKey1 := btcec.PrivKeyFromBytes( unhex("2c3931f593f26037a8b8bf837363831b18bbfb91a712dd9d862db5b9b06dc5df")) _, pubKey2 := btcec.PrivKeyFromBytes( unhex("f902f94da618721e516d0a2a2666e2ec37079aaa184ee5a2c00c835c5121b3eb")) err = node.ApplyVars(func(identifier string) ([]byte, error) { switch identifier { case "pubkey1": return pubKey1.SerializeCompressed(), nil case "pubkey2": return pubKey2.SerializeCompressed(), nil } return nil, nil }) if err != nil { panic(err) } fmt.Println(node.DrawTree())}

Output, the public key was successfully replaced:

 or_d [B]├──c [Bonduv]| └──pk_k [Kondu]| └──pubkey1 [03469d685c3445e83ee6e3cfb30382795c249c91955523c25f484d69379c7a7d6f]└──and_v [Bon] ├──v [Von] | └──c [Bonduv] | └──pk_k [Kondu] | └──pubkey2 [03ba991cc359438fdd8cf43e3cf7894f90cf4d0e040314a6bba82963fa77b7a434] └──older [Bz] └──52560

(We modify drawTree() function to display the actual value next to each variable.)

With the help of this wonderful btcd library, we can now build the actual script. It looks very much like scriptStr() above, but encodes it into a byte string, focusing on the actual problem of pushing integers and data onto the stack. We use a cut-down version here. The full version is available on this site .

 // Script creates the witness script from a parsed miniscript.func (a *AST) Script() ([]byte, error) { b := txscript.NewScriptBuilder() if err := buildScript(a, b, false); err != nil { return nil, err } return b.Script()}// collapseVerify is true if the `v` wrapper (VERIFY wrapper) is an ancestor of the node. If so, the// two opcodes `OP_CHECKSIG VERIFY` can be collapsed into one opcode `OP_CHECKSIGVERIFY` (same for// OP_EQUAL and OP_CHECKMULTISIGVERIFY).func buildScript(node *AST, b *txscript.ScriptBuilder, collapseVerify bool) error { switch node.identifier { case f_0: b.AddOp(txscript.OP_FALSE) case f_1: b.AddOp(txscript.OP_TRUE) case f_pk_h: arg := node.args[0] key := arg.value if key == nil { return fmt.Errorf("empty key for %s (%s)", node.identifier, arg.identifier) } b.AddOp(txscript.OP_DUP) b.AddOp(txscript.OP_HASH160) b.AddData(btcutil.Hash160(key)) b.AddOp(txscript.OP_EQUALVERIFY) case f_older: b.AddInt64(int64(node.args[0].num)) b.AddOp(txscript.OP_CHECKSEQUENCEVERIFY) case f_after: b.AddInt64(int64(node.args[0].num)) b.AddOp(txscript.OP_CHECKLOCKTIMEVERIFY) case f_and_b: if err := buildScript(node.args[0], b, collapseVerify); err != nil { return err } if err := buildScript(node.args[1], b, collapseVerify); err != nil { return err } b.AddOp(txscript.OP_BOOLAND) case f_wrap_c: if err := buildScript(node.args[0], b, collapseVerify); err != nil { return err } if node.props.canCollapseVerify && collapseVerify { b.AddOp(txscript.OP_CHECKSIGVERIFY) } else { b.AddOp(txscript.OP_CHECKSIG) } case f_wrap_v: if err := buildScript(node.args[0], b, true); err != nil { return err } if !node.args[0].props.canCollapseVerify { b.AddOp(txscript.OP_VERIFY) } // More cases [...] default: return fmt.Errorf("unknown identifier: %s", node.identifier) } return nil}

Let's run it:

 func main() { // [...] script, err := node.Script() if err != nil { panic(err) } fmt.Println("Script", hex.EncodeToString(script))}

output:

 Script 2103469d685c3445e83ee6e3cfb30382795c249c91955523c25f484d69379c7a7d6fac73642103ba991cc359438fdd8cf43e3cf7894f90cf4d0e040314a6bba82963fa77b7a434ad0350cd00b268

According to BIP141 and BIP173 , the P2WSH address should be the bech32 encoding of 0 <sha256(script)> , where 0 means Segregated Witness version 0. We'll use this btcd library to help us create addresses:

 addr, err := btcutil.NewAddressWitnessScriptHash(chainhash.HashB(script), &chaincfg.TestNet3Params)if err != nil { panic(err)}fmt.Println("Address:", addr.String())

Our testnet receiving address is ready:

 Address: tb1q4q3cw0mausmamm7n7fn2phh0fpca4n0vmkc7rdh6hxnkz9rd8l0qcpefrj

Funds received by this address will be locked using the spending condition or_d(pk(pubkey1),and_v(v:pk(pubkey2),older(52560))) that is, public key 1 can be spent at any time; public key 2 can be used when funds enter This address is spent after 52560 blocks (about a year).

Click here to run the code in the live Go environment

in conclusion

As a comment, we've created a Miniscript code library that parses Miniscript expressions and performs type checking, and generates payment addresses.

There are many more situations to consider. In the next article, we will learn how to generate witness data based on Miniscript expressions so that funds can be spent; how to ensure that Miniscript complies with Bitcoin's consensus and standards, such as script size and operator restrictions.

If you want this series to continue, please let me know on Twitter: @benma .

Source
Disclaimer: The content above is only the author's opinion which does not represent any position of Followin, and is not intended as, and shall not be understood or construed as, investment advice from Followin.
Like
Add to Favorites
Comments