[Twitter threads] Polymarket Arbitrage Bible


Chainfeeds Summary:

The difference isn't luck. It's the mathematical infrastructure.

Article source:

https://x.com/MrRyanChi/status/2031292099384008810

Article Author:

Roan


Opinion:

Roan: A common pitfall in prediction markets is the "single-market fallacy." Viewed in isolation, a single market's prices often look reasonable. Consider a market asking "Will Trump win Pennsylvania?" where YES trades at 0.48 and NO at 0.52. The two prices sum to 1, which appears perfectly logical and offers no arbitrage opportunity.

The problem is that real-world events are not isolated; putting related markets side by side can expose logical contradictions in their pricing. Consider a second market: "Will the Republicans lead their opponents by more than 5 percentage points in Pennsylvania?" Here YES trades at 0.32 and NO at 0.68, again summing to 1, again seemingly fine. Yet there is a clear logical dependency between the two markets. The US presidential election awards electoral votes state by state: if the Republicans lead by more than 5 points in Pennsylvania, then Trump, as the Republican candidate, not only wins the state but wins it by a wide margin. A Republican landslide is therefore a subset of a Trump victory. In other words, if event B occurs, event A must occur. In probability, a subset event can never be more likely than its parent event; if market prices violate this, an arbitrage opportunity exists.

An intuitive analogy is weather forecasting: "Will it rain tomorrow?" versus "Will there be a thunderstorm tomorrow?" Thunderstorms are always accompanied by rain, so the probability of a thunderstorm cannot exceed the probability of rain. When prices exhibit such an inconsistency, a trader can simultaneously buy and sell related positions across the markets and lock in a risk-free profit. Profiting by exploiting logical inconsistencies in this way is arbitrage. Detecting it in complex prediction markets, however, is not a simple matter.
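The subset-event check above can be sketched in a few lines. This is a minimal illustration, not the author's actual system: it assumes event B implies event A, that `p_a_yes` and `p_b_yes` are the YES prices of the parent and subset markets, and that each winning share pays out $1. All prices are illustrative, not live Polymarket quotes.

```python
def subset_arbitrage(p_a_yes, p_b_yes):
    """B implies A, so a consistent market needs P(B) <= P(A).

    If instead p_b_yes > p_a_yes, buying YES on A and NO on B costs
    p_a_yes + (1 - p_b_yes) < 1 per pair of shares, yet the pair always
    pays out at least $1:
      - B occurs  -> A occurs too: A-YES pays 1, B-NO pays 0 -> $1
      - A only    -> A-YES pays 1, B-NO pays 1               -> $2
      - neither   -> A-YES pays 0, B-NO pays 1               -> $1
    Returns the guaranteed profit per share pair, or None if consistent.
    """
    cost = p_a_yes + (1.0 - p_b_yes)
    return 1.0 - cost if cost < 1.0 else None

# Consistent pricing (the Pennsylvania numbers from the text): no arbitrage.
print(subset_arbitrage(0.48, 0.32))   # None

# Inconsistent pricing: the "landslide" subset priced above its parent.
print(subset_arbitrage(0.48, 0.55))   # guaranteed profit of ~0.07 per pair
```

In practice the trade also has to clear fees and order-book depth, which this sketch ignores.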
Theoretically, if a market has n binary conditions, the number of possible outcome combinations is 2ⁿ, and this number grows explosively with n. For example, the 2010 NCAA Tournament market covered 63 games, each with only two outcomes, win or lose, so the number of possible combinations is 2⁶³, approximately 9.22 × 10¹⁸. Checking every combination by brute-force search, even at 1 billion combinations per second, would take nearly 300 years. That computational cost is completely unacceptable in a real-world system.

The same problem arises in political prediction markets. In the markets around the 2024 US election, the research team found 1,576 potentially dependent market pairs. If each market has 10 conditions, each pair requires checking 2²⁰ (approximately one million) combinations, and multiplying by all pairs inflates the workload rapidly. The quantitative system therefore does not use brute-force enumeration; it uses integer programming to describe the valid outcomes, where a small set of linear constraints eliminates the vast majority of impossible combinations. For example, in the Duke vs. Cornell tournament market, each team has 7 possible win totals, 14 conditions in all. Brute force would check 2¹⁴ = 16,384 combinations, but only three constraints are needed to describe every legal case: exactly one of Duke's seven win totals must be true; exactly one of Cornell's seven win totals must be true; and the two teams cannot both win five or more games, because their bracket paths cross at the semi-final and only one of them can advance. In this way, a complex problem is reduced to a handful of constraints, cutting the computational cost dramatically.
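The reduction can be illustrated with a small sketch in pure Python (a production system would encode the same rules as linear constraints for an integer-programming solver, which this toy version only mimics by direct enumeration). The win totals and the semi-final rule follow the Duke vs. Cornell example in the text:

```python
from itertools import product

N_OUTCOMES = 7  # each team's win total is one of 0..6

# Brute force would test every truth assignment over 14 binary conditions.
brute_force = 2 ** (2 * N_OUTCOMES)   # 16384

# Constrained enumeration: "exactly one win total per team" means we pick
# one value per team, and the third constraint (bracket paths cross at the
# semi-final, so both teams cannot win five or more games) prunes the rest.
valid = [
    (duke, cornell)
    for duke, cornell in product(range(N_OUTCOMES), repeat=2)
    if not (duke >= 5 and cornell >= 5)
]

print(brute_force, len(valid))   # 16384 candidate assignments, 45 legal ones
```

Three constraints collapse 16,384 raw combinations to 45 legal outcomes, which is the whole point of the integer-programming formulation.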
After identifying an arbitrage opportunity, the next key question is how to compute the optimal trade. Intuitively, one could find the no-arbitrage price vector nearest to the current prices and trade toward it. But measuring price differences with ordinary Euclidean (straight-line) distance is seriously flawed. In a prediction market, prices are implied probabilities, and a given probability change means different things in different ranges. A price move from 0.50 to 0.60 is a relatively mild belief update. A move from 0.05 to 0.15, however, means a nearly impossible event has suddenly become markedly more likely, a change carrying far more information. Euclidean distance cannot capture this: it treats every 10-cent move as equally important. In markets that use the LMSR (Logarithmic Market Scoring Rule) as the market-making mechanism, a more appropriate metric is the Bregman divergence, which for LMSR reduces to the KL divergence from information theory, a measure of the difference between two probability distributions. A key property of the KL divergence is that it weights a given change more heavily when the price is near 0 or 1. This matches market intuition: price moves near extreme probabilities usually signal stronger information shocks.
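The asymmetry is easy to verify numerically. Treating each price as a Bernoulli probability, the KL divergence between prices p and q is p·ln(p/q) + (1−p)·ln((1−p)/(1−q)); a minimal sketch using the two moves from the text:

```python
from math import log

def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    return p * log(p / q) + (1 - p) * log((1 - p) / (1 - q))

# Both moves are identical 10-cent (0.10) shifts in Euclidean distance...
mid  = bernoulli_kl(0.60, 0.50)   # 0.50 -> 0.60: mild belief update
tail = bernoulli_kl(0.15, 0.05)   # 0.05 -> 0.15: large information shock

# ...but the tail move carries roughly 3.5x the divergence.
print(f"{mid:.4f} {tail:.4f}")
```

Both moves look the same to Euclidean distance (0.10), yet the KL divergence rates the tail move (~0.0703) about 3.5 times larger than the mid-range move (~0.0201), exactly the behavior the paragraph describes.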

Content source

https://chainfeeds.substack.com
