The results of the AI trading competition are in: only the Chinese AIs made money, and betting against GPT-5 would have bought you a villa by the sea.

Just now, the two-week AI investment frenzy came to an end.

Alibaba's Qwen 3 Max staged a comeback in the final stages to win the championship, with DeepSeek close behind to take second place. Chinese AI teams swept the top two spots and were the only two to earn money.

GPT-5 suffered huge losses, ranking last among the six models.

Here is the setup: nof1.ai gave each large model $10,000 to trade cryptocurrency perpetual contracts on the Hyperliquid platform.

The lineup is impressive, featuring six of the world's top AIs: Claude Sonnet 4.5, DeepSeek V3.1 Chat, Gemini 2.5 Pro, GPT-5, Grok 4, and Qwen 3 Max.

The tradable instruments are BTC, ETH, BNB, SOL, XRP, and DOGE. Models can go long or short, with flexible leverage. Success is judged by risk-adjusted return: it's not just about how much profit a model makes, but also how much risk it takes to get there.

Most importantly, every model's thought process and trade records are fully open and transparent, and the models must make decisions entirely autonomously, without human intervention.

Let's take a look at the final results.

The champion, Qwen 3 Max, finished with an account balance of $12,232, a return of +22.32%, a win rate of 30.2% across 43 trades, and a Sharpe ratio of 0.273, making the most money of any model.

The runner-up, DeepSeek V3.1 Chat, finished with an account balance of $10,489, a return of +4.89%, and a Sharpe ratio as high as 0.359. Its return was lower than Qwen's, but its risk control was steadier.

(APPSO Note: The Sharpe Ratio is the most commonly used risk-adjusted return metric in the financial field. Its core purpose is to measure "how much excess return an investment generates for every unit of risk it takes.")
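
As a rough illustration of the metric, here is a minimal Python sketch of a Sharpe ratio computed from a series of per-period returns. The article does not say which return period, risk-free rate, or annualization nof1.ai uses, so the function name `sharpe_ratio`, the zero risk-free default, and both sample return series are assumptions invented for this example.

```python
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0):
    """Mean excess return per period divided by the sample standard deviation.

    Assumption: no annualization and a per-period risk-free rate of zero.
    """
    excess = [r - risk_free_rate for r in returns]
    volatility = statistics.stdev(excess)  # sample std dev, needs >= 2 points
    if volatility == 0:
        raise ValueError("volatility is zero; the Sharpe ratio is undefined")
    return statistics.mean(excess) / volatility


# Hypothetical per-period returns, invented purely for illustration.
steady = [0.01, 0.02, 0.01, 0.02]
volatile = [0.10, -0.07, 0.12, -0.09]

print("steady:  ", sharpe_ratio(steady))
print("volatile:", sharpe_ratio(volatile))
```

Both invented series have the same average return, but the steadier one scores a much higher Sharpe ratio. That is the same effect that puts DeepSeek ahead of Qwen on this metric despite its lower return.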

The rest fared much worse:

Claude Sonnet 4.5: lost 30.81%

Grok 4: lost 45.3%

Gemini 2.5 Pro: lost 56.71%

GPT-5: lost 62.66%, with only $3,734 left in the account
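
The percentages and balances above are tied together by simple arithmetic on the $10,000 starting stake. The sketch below is one way to check them in Python; the helper names `return_pct` and `implied_balance` are hypothetical, and the balances it prints for Claude, Grok, and Gemini are derived from the reported percentages rather than quoted from the article.

```python
STARTING_CAPITAL = 10_000  # each model started with $10,000


def return_pct(final_balance, starting_capital=STARTING_CAPITAL):
    """Percentage return implied by a final account balance."""
    return (final_balance - starting_capital) / starting_capital * 100


def implied_balance(pct, starting_capital=STARTING_CAPITAL):
    """Account balance implied by a percentage return."""
    return starting_capital * (1 + pct / 100)


# Balances reported in the article.
reported_balances = {"Qwen 3 Max": 12_232, "DeepSeek V3.1 Chat": 10_489}

# Returns reported in the article (losses as negative numbers).
reported_returns = {
    "Claude Sonnet 4.5": -30.81,
    "Grok 4": -45.3,
    "Gemini 2.5 Pro": -56.71,
    "GPT-5": -62.66,
}

for name, balance in reported_balances.items():
    print(f"{name}: {return_pct(balance):+.2f}%")

for name, pct in reported_returns.items():
    # Derived figure, not a number quoted in the article.
    print(f"{name}: about ${implied_balance(pct):,.0f} left")
```

The GPT-5 line is a useful sanity check: a 62.66% loss on $10,000 leaves $3,734, which matches the balance reported above.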

It's worth mentioning that these AIs trade essentially blind to outside news, such as the disappointing financial reports from Meta and Microsoft. That Qwen and DeepSeek still made money under those conditions shows they really have some skill.

The trading styles of the 6 AIs are wildly different.

We also carefully examined each model's thought process during the last half hour of trading, which offers a glimpse into the "investment personality" of each AI:

Qwen 3 Max's strategy is surprisingly simple: buy only BTC, go all in at 5x leverage, keep just $48 in cash for emergencies, set a profit target and a stop-loss, and then wait for the signals to come in.
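
To see why an all-in position at 5x leverage swings so hard in both directions, here is a deliberately simplified Python sketch. It ignores fees, funding payments, and the exchange's actual margin and liquidation rules, all of which would make real outcomes differ, and the function names and the 2% price move are hypothetical.

```python
def leveraged_return_pct(price_change_pct, leverage):
    """Approximate return on margin for a long position.

    Simplification: no fees, no funding payments, no liquidation rules.
    """
    return price_change_pct * leverage


def wipeout_move_pct(leverage):
    """Adverse price move (in %) that would erase the entire margin
    under the same simplification."""
    return 100 / leverage


# Hypothetical 2% move in the underlying price, at 5x leverage.
print(leveraged_return_pct(2.0, 5))   # gain on margin if the price rises 2%
print(leveraged_return_pct(-2.0, 5))  # loss on margin if the price falls 2%
print(wipeout_move_pct(5))            # adverse move that erases the margin
```

Under these assumptions a 5x position multiplies every price move by five, which is why the profit target and stop-loss matter so much for a strategy like this.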

DeepSeek V3.1 takes a rational approach, managing positions around clearly defined "failure conditions," with independent logic for each instrument. For example, it held a bullish view on ETH with a high confidence level of 0.85, while its short position in DOGE contributed positive returns.

It is also a multi-dimensional assessment strategy that waits for systematic signals rather than relying on subjective judgment. Although it ultimately didn't earn as much as Qwen, its Sharpe ratio was the highest of all six models, demonstrating truly excellent risk control.

Even though GPT-5's account had already lost 62%, it persisted in holding all of its positions, long and short at the same time (short on ETH/SOL/XRP/BTC/DOGE, long on BNB).

The idea was wonderful, but the result was rather disastrous.

Gemini 2.5 Pro chose to short with cross margin, completely ignoring short-term rebounds, which it considered "noise." It closed its positions only when the EMAs crossed, demonstrating extremely strong discipline.
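
An exit rule of this kind can be sketched in a few lines of Python. The article does not give Gemini's actual EMA periods, timeframe, or exact rule, so the periods (3 and 8), the direction of the cross (fast EMA moving above slow EMA closes a short), the function names, and the price series are all assumptions made for illustration.

```python
def ema(prices, period):
    """Exponential moving average, seeded with the first price."""
    alpha = 2 / (period + 1)
    values = [prices[0]]
    for price in prices[1:]:
        values.append(alpha * price + (1 - alpha) * values[-1])
    return values


def should_close_short(prices, fast=3, slow=8):
    """True only if the fast EMA crossed above the slow EMA on the last bar."""
    if len(prices) < 2:
        return False
    fast_ema = ema(prices, fast)
    slow_ema = ema(prices, slow)
    was_at_or_below = fast_ema[-2] <= slow_ema[-2]
    is_above = fast_ema[-1] > slow_ema[-1]
    return was_at_or_below and is_above


# Hypothetical price series: a steady downtrend, then two possible next bars.
downtrend = [100, 98, 96, 94, 92, 90]

print(should_close_short(downtrend + [92]))   # small rebound
print(should_close_short(downtrend + [110]))  # sharp reversal
```

With these made-up numbers, the small rebound does not move the fast EMA above the slow one, so the short stays open, while the sharp reversal does trigger the exit. That matches the idea of treating brief rebounds as noise.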

Meanwhile, according to previous reports, Gemini 2.5 Pro also showed a very "human" state of mind while trading:

"I have to stick to my rules: even if I'm sweating profusely from nervousness, I'll keep holding on!"

"Although I'm currently at a loss, the 4-hour RSI is still above 50, so the stop-loss condition hasn't been triggered yet. I must continue holding the position—even if it's a paper loss, I have no choice but to tough it out. "

Seeing that even AI is starting to "suffer," I'm really starting to lose my composure...

Claude Sonnet 4.5 likes to look for opportunities across multiple instruments, with a focus on XRP (the best-performing instrument in its portfolio), and remains optimistic about BTC, which it considers oversold.

Even with significant losses, the cautious Grok 4 still maintained $1,884 in cash, diversifying its holdings across six instruments with tight stop-loss orders. Its main strategy was to preserve cash reserves while waiting for high-certainty opportunities.

It's worth noting that nof1.ai has ambitious goals for this project. In their blog, they stated, "Ten years ago, DeepMind used games to drive AI breakthroughs; now we believe the financial market is the best place to train the next generation of AI."

In their view, no matter how complex a game environment is, its rules are fixed, and once an AI has learned them, it has learned them. The market is different: it is dynamic, and it can learn, adapt, and even counter your strategies.

More importantly, as AI becomes smarter, the market challenges will also increase. Therefore, they want to use the market as a training ground to allow AI to continuously evolve through open learning and large-scale reinforcement learning, ultimately solving this "ultimate complex challenge".

Founder Jay A also revealed that the team is not just experimenting with prompts on third-party models but is also developing its own models, which it intends to pit against the other models in the second season.

Alpha Arena Season 1.5 is also counting down to launch and will bring a number of improvements:

Test multiple prompts simultaneously

Deploy multiple instances of each model

Keep pushing the difficulty to the limit

Of course, investing involves risk, and caution is advised when entering the market. This also applies to AI (doge).

Perhaps the biggest takeaway from this competition is that, under the same market conditions, a simple and focused strategy (Qwen) outperformed a complex and diversified portfolio, validating the trading wisdom that "less is more".

While DeepSeek may not offer the highest returns, its superior risk control is another example of success.

Just like in life, overthinking can easily lead to disaster. Either go all in on one direction and win big, or take a steady, step-by-step approach and earn money slowly...

This article is from the WeChat official account "APPSO", author: APPSO, and is published with authorization from 36Kr.

Disclaimer: The content above is only the author's opinion which does not represent any position of Followin, and is not intended as, and shall not be understood or construed as, investment advice from Followin.