Summa
We generally hold cryptocurrency and often conduct currency buying and selling and exchange operations on centralized exchanges. Transactions on centralized exchanges are very convenient and fast. In the more than ten years since the issuance of Bitcoin, many centralized exchanges have emerged on the market, which greatly facilitates users’ cryptocurrency buying and selling operations. However, with the prosperity of centralized exchanges, fraud and malicious behavior such as withdrawing money and running away are common. Netflix has specially filmed a promotional video about the sudden collapse of the exchange and the mysterious death of the founder: Don’t trust anyone, the virtual currency is unsolved [1] . In particular, FTX's declaration of bankruptcy due to insolvency in 2022 shocked the world, and many users and fund companies suffered huge losses.
In fact, not only in the field of cryptocurrency, but also in other traditional fields, there are countless cases of deceiving investors through false accounting. Such as the Enron incident that shocked the world at the beginning of the 21st century. Enron was selected as "America's Most Innovative Company" by Fortune magazine for six consecutive years. However, such a company with hundreds of billions of assets went bankrupt within a few weeks in 2002. Another example is Evergrande Company, which diverted the funds originally used to build buildings for other purposes. Eventually, countless owners were saddled with 30-year mortgage loans and waited for unfinished houses.
An important reason behind the endless fraud in many different fields is that the auditing and accounting work itself is not completely open and transparent, which creates huge room for corruption and cheating. However, in order to protect the company's interests and investor privacy, key financial data cannot be completely open and transparent. Therefore, there has been no good solution to financial fraud except strengthening supervision. However, as zero-knowledge proof technology gradually matures, we can see that to a new solution.
If each user can verify whether some of his or her assets have financial fraud, then as long as there are enough verified users, it will be very difficult for an organization to commit financial fraud. And because the verification process is zero-knowledge, even if the third party intercepts the data during the network transmission of the verification data, he does not know what the user's real assets are. In this way, the verification process can be completely implemented in the network. Displayed publicly on the Internet. With the help of zero-knowledge proof technology, a considerable amount of audit work is no longer the patent of certain organizations like the Big Four accounting firms, and is conducted secretly behind closed doors. Rather, it is an open process in which any stakeholder can participate.
Summa [2] is a PSE research project that aims to use zk method to verify user assets. The following content of this article will briefly introduce the project and the technical implementation principles.
contract
The overall data flow of Summa is shown in the figure. Smart contracts are mainly used here to store and verify some public data. It is not necessarily strongly bound to Ethereum, even if it is deployed on other blockchains such as Solana in the future (as long as Halo2 has corresponding blockchain support, some contracts in this project are generated by Halo2 proof [3] ).

In the figure, Custodian represents a centralized exchange. The contract is deployed on the chain by the exchange, and the contract ownership belongs to the exchange. Public data can only be submitted by the exchange. Public data consists of two parts. One part is the basic information of the chain controlled by the exchange and digital signatures.

The second part is the information on the assets on the chain, including the root hash of the Merkle sum tree, and the root balance (the specific number of assets on the chain, such as how many BTC). This part of the data will be used for the instance input of the zk proof.

Both parts of the data are easily publicly available on the chain and can be queried (except for the root hash). It would be difficult for exchanges to cheat on this data. Anyone can compare the data stored in the contract with the actual data results on the blockchain address.

Proof generation is currently generated by the exchange. The user submits key information that needs to be verified to the exchange, and then the exchange generates the proof and returns it to the user. Users can request the smart contract with this proof, and the contract will verify the proof.
Finally, Proof Verify is where users interact directly with the contract. This part of the contract is translated into Solidity code by the Halo2 circuit and then deployed on the chain as a separate verification contract.

In actual use, it can be calculated and verified on the chain by passing in the proof.

The following code example is the function used in the actual Hash operation. You can see that the entire hash calculation itself does not use bit operations, only addition and multiplication calculations, which is ZooKeeper friendly.

The difference between zk data calculation and traditional program calculation is that zk data must be calculated in a finite field. When building a Merkle Tree, it is necessary to ensure that the data of each node cannot exceed the finite field. For this reason, the data must be ranged before calculation. Check for future data overflow.
The general principle of range check is similar to the example below. First, for the input data, use 8 bits as the length unit and cut out multiple copies to facilitate future subtraction operations. Then every time you perform next numerical calculation, follow the calculation steps in the example below. When the range check circuit makes constraints, it actually makes constraints based on the intermediate result diff = z_cur - z_next * Expression::Constant(Fp::from(1 << 8)) The diff is required to be within an 8-bit value. . In this way, for a 32-bit data constraint, only 4 calculation cells and 256 lookup tables are needed. In the future, the instance constraint with the highest bit being 0 is enough. If it were not designed in this way, simply doing a range check of 32-bit values would require a large lookup table. Obviously, such a circuit would be too large and cannot be put into practical use.

With these auxiliary structures, the Merkle Sum Tree can be formally constructed. Each user input data is called Entry, and its structure is:

The hash calculation method of the Merkle Tree intermediate node is H(LeftChild.balance[0] + RightChild.balance[0], LeftChild.balance[1] + RightChild.balance[1], ..., LeftChild.balance[N_CURRENCIES - 1] + RightChild.balance[N_CURRENCIES - 1], LeftChild.hash, RightChild.hash) , so the actual length of the vec array to be calculated is N_CURRENCIES + 2 .
Let's build the entire Merkle Tree completely. The leaf node part is relatively simple, you only need to convert Entry to Node. Intermediate nodes need to be constructed layer by layer, and the value of each intermediate node is related to the value of the left subtree and right subtree of the next layer. Finally, put the calculated intermediate node results into the tree array:

Next we use Merkle Tree to build zk proof. What zk wants to prove is that a certain user's entry is indeed on the Merkle Tree. Therefore, we first need to formulate a specific entry index and generate the data structure required for zk proof based on the Merkle Tree.

We can cleverly achieve the constraint effect of making the entire equation equal to 0 by setting 0 and 1.

The Merkle Tree zk constraint mainly has two parts. One part is the swap constraint to ensure that the exchange is indeed generated in the true order of the above figure when generating the proof. The other part is the balance constraint, that is, the balance of the parent node does come from the sum of the left child node and the right child node. The sum constraint on balance is relatively simple. Here we focus on the swap constraint.
The Merkle zk tree data structure sibling_middle_node_hash_preimages we introduced before is an array and does not contain position information. Whether the position of the dotted box in the picture should be on the left or right side of the tree must be judged by the 0 and 1 of path_indices . Therefore, we must ensure that when the value is 0, the generated parent node is on the left, and its corresponding sibling node of the same level is on the right, and when it is 1, the opposite is true. When data is imported into the zk circuit, this logic can be easily implemented in code:

The overall Merkle Sum Tree zk proof process is to process the data layer by layer, and then input the corresponding position data into the zk constraint circuit for inspection. Build the entire Merkle Tree. The final output is the root node hash and the total balance, which should be consistent with the corresponding data in the contract. Use instance to verify the consistency. If all verifications are correct, the entire Merkle Tree proof work is concluded.
Solvency Verify
The process of generating proof is for the user to request the exchange, and then the exchange returns the proof data, and then the user verifies it with the smart contract. At present, the project does not support users to bypass the exchange to generate proof by themselves, but it feels that it may be a direction that can be explored in the future. Proof is generated directly by the user rather than returned by the exchange. The entire halo2 proof circuit can be packaged into a web assembly using rust, and then the corresponding interactive API can be made using ethers rs [7] . The time complexity of Merkle Tree root verification is log(n). It may not take too much time for the user's device to verify, which further enhances the security of decentralization.
refer to
[1]
Don’t trust anyone, virtual currency mystery: https://www.netflix.com/hk/title/81349029
[2]
Summa: https://github.com/summa-dev
[3]
Halo2 proof generated: https://github.com/privacy-scaling-explorations/halo2-solidity-verifier
[4]
Poseidon Hash: https://github.com/ingonyama-zk/poseidon-hash
[5]
Parameter preparation stage and Hash operation stage: https://autoparallel.github.io/overview/index.html
[6]
Linear-feedback shift register: https://en.wikipedia.org/wiki/Linear-feedback_shift_register
[7]
ethers rs: https://github.com/gakonst/ethers-rs
Recommended reading
ZK Insights Series
zkp study notes
Wen Building Podcast
Research and analysis series of articles
Uniswap fine translation series
Antalpha Labs is a non-profit Web3 developer community dedicated to promoting the innovation and application of Web3 technology by initiating and supporting open source software.
Official website: https://labs.antalpha.com
Twitter: https://twitter.com/Antalpha_Labs
Youtube: https://www.youtube.com/channel/UCNFowsoGM9OI2NcEP2EFgrw
Contact us: hello.labs@antalpha.com




