How did Binance's risk control team and its academic collaborators design a new "AI + Blockchain Graph Analysis" system for detecting Sybil addresses?
Written by: Nicky, Foresight News
Binance's risk control department, in collaboration with Zand AI and ZEROBASE, recently published a paper on Sybil attacks. To help readers quickly grasp its core content, the author has summarized the key points below.
In cryptocurrency airdrops, there is always a group of special players operating in the shadows. They are not ordinary users; they use automated scripts to create hundreds or even thousands of fake addresses, the notorious "Sybil addresses". These addresses feed on the airdrops of well-known projects such as Starknet and LayerZero. They consume project budgets, dilute rewards for real users, and undermine the fairness that blockchain is meant to provide.
Facing this ongoing technological cat-and-mouse game, Binance's risk control team, in collaboration with academic institutions, developed an AI detection system called "Subgraph-based LightGBM". In tests on real data, it scored above 0.9 on all four reported metrics (precision, recall, F1, and AUC).
The Three "Identification Cards" of Sybil Addresses
Why can these cheating addresses be precisely targeted? By analyzing transaction records of 193,701 real addresses (of which 23,240 were confirmed as Sybil addresses), the research team discovered three types of behavioral traces:
Temporal fingerprints are the primary vulnerability. Sybil addresses have an eerie "precisely timed" characteristic: the key steps, from first receiving gas fees to completing the first transaction to participating in the airdrop, are typically completed within an extremely short time. Real users' activity, by contrast, is spread irregularly over time, because few real users create an address for a single airdrop and abandon it immediately afterward.
Fund trajectories expose their economic motives. These addresses keep a balance that is just enough to get by, slightly above the minimum airdrop threshold (to save on funding costs), and they transfer funds out quickly once rewards are received. More tellingly, when addresses are operated in batches, their transfer amounts are highly consistent, unlike real user transactions, which vary naturally.
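As a rough illustration of the temporal fingerprint, the intervals between key events can be computed per address. This is a minimal sketch, not the paper's implementation: the `AddressTimeline` record, its field names, and the sample timestamps are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AddressTimeline:
    """Hypothetical per-address event times, as Unix timestamps in seconds."""
    first_gas_received: int
    first_transaction: int
    airdrop_claim: int
    last_transaction: int


def temporal_features(t: AddressTimeline) -> dict[str, int]:
    """Return the gaps (in seconds) between the key lifecycle events."""
    return {
        "gas_to_first_tx": t.first_transaction - t.first_gas_received,
        "first_tx_to_claim": t.airdrop_claim - t.first_transaction,
        "active_lifetime": t.last_transaction - t.first_gas_received,
    }


# Made-up examples: a scripted address finishes everything within minutes,
# while an organic address spreads its activity over months.
scripted = AddressTimeline(1_700_000_000, 1_700_000_120, 1_700_000_300, 1_700_000_900)
organic = AddressTimeline(1_690_000_000, 1_690_400_000, 1_700_000_300, 1_710_000_000)

scripted_features = temporal_features(scripted)
organic_features = temporal_features(organic)
print(scripted_features)
print(organic_features)
```

Short gaps alone do not prove an address is a Sybil; in the approach described here they are one signal among several.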
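The two fund signals can be expressed as simple numbers. The sketch below is an assumption about how one might measure them, not the paper's feature definitions: `amount_consistency` uses the coefficient of variation, and `balance_margin` measures how far a balance sits above a hypothetical eligibility threshold.

```python
from statistics import mean, pstdev


def amount_consistency(amounts: list[float]) -> float:
    """Coefficient of variation of transfer amounts.

    Values near 0 mean the amounts are nearly identical, which is what
    batch-operated addresses tend to show.
    """
    avg = mean(amounts)
    if avg == 0:
        return 0.0
    return pstdev(amounts) / avg


def balance_margin(balance: float, threshold: float) -> float:
    """Relative excess of a balance over the minimum airdrop threshold."""
    return (balance - threshold) / threshold


# Made-up amounts for illustration only.
batch_amounts = [0.0101, 0.0101, 0.0101, 0.0101]
organic_amounts = [0.5, 0.02, 1.3, 0.07]

print(amount_consistency(batch_amounts))    # identical amounts, so no variation
print(amount_consistency(organic_amounts))  # noticeably larger
print(balance_margin(0.0105, 0.01))         # balance sits about 5% above the threshold
```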
Relationship networks become the ultimate evidence. By constructing transaction graphs, the team observed three typical topological structures:

- Star network: A "command center" distributes funds to dozens of sub-addresses.
- Chain structure: Funds are passed like a relay baton between addresses to forge activity records.
- Tree-like dispersion: Using multi-layer branch structures to attempt to evade detection.
These patterns expose the coordinated nature of scripted operations, and they are also the features that traditional detection methods find hardest to capture.
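To make the three shapes concrete, here is a minimal sketch that labels a small funding cluster from its directed transfer edges. The `classify_cluster` function and its rules are illustrative assumptions; the paper's system learns from features rather than applying hand-written rules like these.

```python
from collections import defaultdict

Edge = tuple[str, str]  # (sender, receiver)


def classify_cluster(edges: list[Edge]) -> str:
    """Label a funding cluster as 'star', 'chain', 'tree', or 'other'."""
    children = defaultdict(set)
    in_deg = defaultdict(int)
    nodes = set()
    for sender, receiver in set(edges):  # ignore repeated transfers
        children[sender].add(receiver)
        in_deg[receiver] += 1
        nodes.update((sender, receiver))

    # All three shapes have exactly one unfunded root, and every other
    # address is funded by exactly one parent.
    roots = [n for n in nodes if in_deg[n] == 0]
    if len(roots) != 1:
        return "other"
    root = roots[0]
    if any(in_deg[n] != 1 for n in nodes if n != root):
        return "other"

    # Every address must be reachable from the root.
    seen, stack = {root}, [root]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    if seen != nodes:
        return "other"

    if all(len(children[n]) <= 1 for n in nodes):
        return "chain"  # funds relayed address to address
    if len(children[root]) == len(nodes) - 1:
        return "star"   # one hub funds every other address directly
    return "tree"       # multi-layer branching


star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
chain = [("a", "b"), ("b", "c"), ("c", "d")]
tree = [("r", "a"), ("r", "b"), ("a", "c"), ("a", "d")]
print(classify_cluster(star), classify_cluster(chain), classify_cluster(tree))
```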
Two-Layer Relationship Network: AI Detective's Crime-Solving Tool
Tracking blockchain transaction data is like finding a needle in a haystack. The research team used a two-layer transaction subgraph model, much like a detective who investigates not just the target individual (Address A) but also their direct contacts (addresses that transferred to A and addresses A transferred to) and those contacts' own connections (second-degree relationships).
More importantly, they created an innovative "feature fusion technique": the system aggregates the behavioral characteristics of neighboring addresses into the target address's "behavioral profile". For example, it calculates the minimum, maximum, average, and volatility of transfer amounts across all associated addresses to form a composite indicator of fund flow patterns, and it counts the in-degree and out-degree (the number of associated addresses) to gauge network density. This design keeps the system efficient when analyzing over 5.8 million transactions, avoiding the prohibitive cost of traditional methods that trace data across the entire network.
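The two-layer idea can be sketched as a bounded neighborhood lookup. The code below is an assumption about one straightforward way to do it, using a plain edge list; the function name `two_layer_subgraph` and the sample addresses are hypothetical.

```python
from collections import defaultdict

Edge = tuple[str, str]  # (sender, receiver)


def two_layer_subgraph(edges: list[Edge], target: str):
    """Return (first-degree contacts, second-degree contacts, induced edges)."""
    # Treat transfers in either direction as a connection.
    neighbors = defaultdict(set)
    for sender, receiver in edges:
        neighbors[sender].add(receiver)
        neighbors[receiver].add(sender)

    first = set(neighbors[target]) - {target}
    second = set()
    for contact in first:
        second |= neighbors[contact]
    second -= first | {target}

    # Keep only transfers whose endpoints both lie within two hops.
    keep = {target} | first | second
    sub_edges = [(s, r) for s, r in edges if s in keep and r in keep]
    return first, second, sub_edges


# Made-up transfers: G funds F, F funds A, A pays B, B pays C, C pays D.
edges = [("F", "A"), ("A", "B"), ("B", "C"), ("C", "D"), ("G", "F")]
first, second, sub_edges = two_layer_subgraph(edges, "A")
print(sorted(first), sorted(second), sub_edges)
```

Address D is three hops from A, so it and its transfer are left out. Bounding the search this way is what keeps the per-address work small.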
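A minimal sketch of this kind of aggregation follows. It assumes the transfers passed in are already restricted to the target's two-layer subgraph, and the feature names are hypothetical; the paper's exact feature set may differ.

```python
from statistics import mean, pstdev

Transfer = tuple[str, str, float]  # (sender, receiver, amount)


def behavioral_profile(transfers: list[Transfer], target: str) -> dict[str, float]:
    """Fuse neighborhood transfer statistics into one fixed-length profile.

    Assumes `transfers` is non-empty and limited to the target's subgraph.
    """
    amounts = [amount for _, _, amount in transfers]
    funders = {s for s, r, _ in transfers if r == target}
    recipients = {r for s, r, _ in transfers if s == target}
    return {
        "amount_min": min(amounts),
        "amount_max": max(amounts),
        "amount_mean": mean(amounts),
        "amount_std": pstdev(amounts),  # volatility of transfer amounts
        "in_degree": float(len(funders)),
        "out_degree": float(len(recipients)),
    }


# Made-up subgraph: a hub funds A and B, which both forward to X.
transfers = [
    ("hub", "A", 0.01),
    ("hub", "B", 0.01),
    ("A", "X", 0.03),
    ("B", "X", 0.03),
]
profile = behavioral_profile(transfers, "A")
print(profile)
```

Because every address yields a profile of the same length, the profiles can be stacked into a table and passed to a gradient-boosted classifier such as the LightGBM model the system is named after.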
Practical Verification: Capturing "Ghosts" in Binance Airdrops
The system was tested on real airdrop data from Binance's Soulbound Token (BAB). Launched by Binance in 2022, BAB is used to verify the identity of real users who have completed KYC, making it an ideal testing ground for detecting Sybil behavior.
The team first manually analyzed and clustered suspicious addresses, then set up an appeal review mechanism to confirm the final Sybil address labels. When cleaning the data, they excluded institutional addresses (such as exchange hot wallets), smart contracts, and addresses that had existed for over a year (Sybil operators often abandon old addresses to avoid detection), keeping the dataset clean.
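The cleaning rules described above amount to a simple filter. The sketch below assumes each address comes with flags for contract and institutional status and a first-seen time; the `AddressRecord` type, its fields, and the one-year constant are illustrative, not taken from the paper.

```python
from dataclasses import dataclass

SECONDS_PER_YEAR = 365 * 24 * 3600


@dataclass
class AddressRecord:
    """Hypothetical metadata for one address."""
    address: str
    is_contract: bool
    is_institutional: bool  # e.g. an exchange hot wallet
    first_seen: int         # Unix timestamp in seconds


def clean_dataset(records: list[AddressRecord], snapshot_time: int) -> list[AddressRecord]:
    """Drop contracts, institutional wallets, and addresses older than a year."""
    return [
        r for r in records
        if not r.is_contract
        and not r.is_institutional
        and snapshot_time - r.first_seen <= SECONDS_PER_YEAR
    ]


snapshot = 1_700_000_000
records = [
    AddressRecord("0xuser", False, False, snapshot - 30 * 24 * 3600),
    AddressRecord("0xcontract", True, False, snapshot - 30 * 24 * 3600),
    AddressRecord("0xexchange", False, True, snapshot - 30 * 24 * 3600),
    AddressRecord("0xold", False, False, snapshot - 2 * SECONDS_PER_YEAR),
]
kept = [r.address for r in clean_dataset(records, snapshot)]
print(kept)
```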
The results showed high precision in identifying three types of cheating networks:
- Star network identification rate of 99% (previous method's max was 95%)
- Chain structure identification rate of 100% (previous method's max was 95%)
- Tree-like dispersion identification rate of 97% (previous method's max was 95%)
All four core metrics exceeded 0.9: precision reached 0.943 (the previous best model scored 0.796), recall reached 0.918 (meaning over 91% of Sybil addresses were caught), the F1 score reached 0.930, and AUC reached 0.981 (close to perfect classification). This means project teams can close cheating loopholes while significantly reducing the risk of wrongly penalizing real users.
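For readers less familiar with these metrics, the sketch below shows how precision, recall, and F1 relate, and checks that the reported F1 follows from the reported precision and recall. The confusion counts in the second example are made up purely for illustration and are not from the paper.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision: share of flagged addresses that are truly Sybil.
    Recall: share of true Sybil addresses that were flagged."""
    return tp / (tp + fp), tp / (tp + fn)


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Consistency check using the precision and recall quoted in the article.
reported_f1 = f1_score(0.943, 0.918)
print(round(reported_f1, 3))

# Hypothetical counts: 90 Sybils caught, 10 real users wrongly flagged,
# 30 Sybils missed.
example_precision, example_recall = precision_recall(tp=90, fp=10, fn=30)
print(example_precision, example_recall)
```

High precision is what protects real users from being wrongly excluded, while high recall is what closes the loophole for cheaters.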
Technical Boundaries and Future Battlefields
The current technology is mainly suited to long-running airdrop scenarios (such as phased distribution of soulbound tokens), because these activities accumulate enough labeled data for the AI to learn from. In terms of blockchain compatibility, it supports Ethereum Virtual Machine (EVM) compatible chains (such as BNB Chain and Polygon) and is not currently suitable for UTXO-model chains like Bitcoin. The paper notes, however, that high gas costs mean airdrops are rarely conducted on UTXO chains, so the practical impact of this limitation is small.
The research team emphasizes that the potential of this technology extends far beyond airdrops. By identifying abnormalities through transaction networks and behavioral patterns, it can also be applied to:
- Detecting market manipulation behaviors (such as coordinated addresses in pump and dump schemes).
- Assessing token liquidity risks (identifying fake trading pairs).
- Constructing on-chain credit scoring systems.
As Sybil attack strategies continue to evolve, this technological arms race to protect Web3 fairness will push detection systems to become more intelligent and more broadly applicable.
Original link: https://arxiv.org/abs/2505.09313



