AI giants enter the dark forest


Text by Xiang Xianzhi

In his novel *The Three-Body Problem*, Liu Cixin used an image that has been quoted countless times since—the Dark Forest. Every civilization is an armed hunter; whoever exposes themselves first dies. The forest isn't empty; everyone knows that turning on the light will attract bullets, so everyone keeps their lights off.

In the spring of 2026, top AI labs entered this dark forest.

On April 16, Anthropic released Claude Opus 4.7. The same day, it made an unusual move: publicly admitting that Opus 4.7 was weaker than an unreleased model, Mythos, which it was holding back on safety grounds.

On April 23, OpenAI released GPT-5.5 on its website. The same day, Anthropic published a post on its official blog titled "An update on recent Claude Code quality reports," acknowledging that Claude Code had indeed degraded over the past month. One company shipped a new flagship; the other made amends. Yet the apology almost read like a boast: yes, Claude temporarily got worse, but don't forget, we still have Mythos in reserve.

On April 24, the "mysterious Eastern power" launched DeepSeek V4 Preview, and Liang Wenfeng's team announced for the first time the model's deep integration with Huawei's Ascend 950PR; but everyone understood that the truly "full-blooded" V4 Pro Max would not arrive until the Ascend 950 supernode enters mass production in the second half of the year.

Three companies, three moves. On the surface, each is simply following its own product schedule; viewed together, a pattern emerges:

Each company holds at least one "gun": a model more powerful than the publicly available version, a next-generation architecture not yet shipped, a chip supernode not yet widely deployed. And none of them dares to raise that gun first.

Because in this industry, the cost of revealing yourself is never as simple as leaking a secret. Revealing means handing competitors the ceiling of your capability as a benchmark; it means being first in line for safety scrutiny, tightening regulation, and public pressure; it means becoming the moving target of the next round. There is no heroism in the forest: whoever fires first becomes the next target.

Therefore, the most rational choice for hunters is to turn off the lights, hold their breath, and hide their weapons behind their backs.

This is the optimal play in the game.

Anthropic's fearlessness

Claude, however, has just come through one of its worst months.

Anthropic looks unhurried: Opus 4.7 shipped on schedule, the model still dominates the leaderboards, and Mythos sits in reserve, available only to enterprise customers.

Yet this same stretch delivered arguably the worst user experience in Claude's history, with a flood of negative reviews.

In early March, Anthropic changed Claude Code's default reasoning effort from high to medium. The rationale is understandable: in high mode the UI often appeared to freeze, with responses sluggish enough to drive paying users crazy. The problem was that they didn't announce the change at the time.

At the end of March came another "efficiency optimization": if a Claude Code session sat idle for more than an hour, the system would clear out old reasoning blocks. The design goal was to save compute. In practice, Claude seemed to develop amnesia between rounds of dialogue, completely losing the thread. The developer community was flooded with complaints for weeks: "Claude is starting to forget what I asked it to do in the last round."

Then, most recently, a third change: a directive to compress verbosity was added to the system prompt. By Anthropic's own account, this directive alone cut Claude Code's coding quality by 3%.
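Anthropic has published none of this as code, so the following is only a minimal sketch of what the three "efficiency optimizations" amount to as session-management policy. The three behaviors and the one-hour figure come from the reports above; every name in the code (Session, Turn, on_request, the threshold constant) is a hypothetical illustration, not Anthropic's implementation.

```python
from dataclasses import dataclass, field
import time

IDLE_EVICTION_SECONDS = 60 * 60  # change 2: clear old reasoning blocks after 1h idle

@dataclass
class Turn:
    role: str
    text: str
    is_reasoning: bool = False  # intermediate reasoning, not user-visible answers

@dataclass
class Session:
    turns: list = field(default_factory=list)
    last_active: float = field(default_factory=time.time)
    # Change 1: the default reasoning effort silently drops from "high" to "medium".
    reasoning_effort: str = "medium"
    # Change 3: a verbosity-compression directive appended to the system prompt.
    system_prompt_suffix: str = "Be concise; compress all output."

    def on_request(self) -> None:
        now = time.time()
        if now - self.last_active > IDLE_EVICTION_SECONDS:
            # Evict intermediate reasoning blocks to save memory and compute.
            # Side effect: the model loses the chain of thought linking earlier
            # turns, producing the "amnesia" users reported.
            self.turns = [t for t in self.turns if not t.is_reasoning]
        self.last_active = now
```

Each change is individually defensible as a latency or cost optimization; stacked together, they quietly trade quality for throughput, which is exactly what users noticed.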

Together, the three changes prompted a senior director at AMD to write on GitHub: "Claude has regressed to the point it cannot be trusted to perform complex engineering tasks." Axios's April 16 article, "Anthropic's AI downgrade stings power users," carried the story to mainstream attention.

Then Anthropic admitted that there was indeed a problem.


On April 7, Anthropic quietly rolled back the reasoning-effort change; on April 10, it fixed the cache bug; on April 20, it removed the verbosity-compression directive from the system prompt. But the actual incident review did not appear until April 23, which happened to be the day GPT-5.5 went public.

The offhand tone, something like "oops, an engineering bug, already fixed," landing on the exact day of OpenAI's big announcement is hard to read as coincidence.

What makes it more intriguing is that unusual admission at the Opus 4.7 launch: Opus 4.7 is weaker than the unreleased Mythos. This looks like a deliberate strategic retreat. Anthropic is keeping its strongest capability for enterprise use and is in no hurry to put it before the general public, because, by its own account, the team isn't ready to release Mythos yet.

That claim is plausible. But from a business-narrative perspective, the other half is equally true: Anthropic waited six weeks before publicly acknowledging Claude Code's degradation, and only raised the issue when OpenAI was about to make news. Without immense pressure from a peer, and without the Opus 4.7 announcement having already signaled "we still hold something in reserve," this statement might never have come.

For Anthropic, "squeezing the toothpaste" doesn't mean deliberately crippling capability; it means pacing both the release of capability and the disclosure of problems to match the competition.

Showing your most cutting-edge capability inevitably makes you the target. Or, as Anthropic may see it, the pressure Opus 4.6 put on competitors hasn't dissipated yet, so there is no need to play a stronger card now.

OpenAI's old tricks

If Anthropic is "keeping a Mythos hidden," OpenAI's approach is subtler still: it delegates the release of capability to its own server-load curves and a tiering mechanism called the auto-router.

On April 23, the day GPT-5.5 was released, Simon Willison (co-founder of the Django framework and a well-known independent reviewer in the AI community) posted a measured assessment on his blog: "It's not a dramatic departure from what we've had before."


He then added a crucial detail: GPT-5.5 is the first fully retrained base model OpenAI has shipped since GPT-4.5, meaning the 5.1, 5.2, 5.3, and 5.4 releases of the past six months were all incremental updates. Four minor versions of holding back, because OpenAI didn't know what its competitors would show.

There's a simpler name for this kind of metered updating: squeezing the toothpaste.

An even more memorable scene came just hours after GPT-5.5 went live. A Codex user filed Issue #19241 on GitHub, complaining that Fast mode, very fast at first, visibly slowed as load climbed, while billing remained at the Fast tier. The wording was familiar: "Please investigate whether GPT-5.5 Fast mode is being downgraded under high load."

It was almost a precise replay of August 7, 2025, the day GPT-5 first launched. On that occasion, Reddit's r/ChatGPT pushed "GPT-5 is horrible" past 4,600 upvotes, and Sam Altman himself admitted in an AMA the next day that "the autoswitcher broke... GPT-5 seemed way dumber," acknowledging that the router had silently routed users to a weaker model behind the scenes.
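OpenAI has never published the auto-router's logic, so this is only a hedged sketch of how a load-based tier downgrade could produce the behavior described in Issue #19241: requests labeled and billed as Fast being silently served by a cheaper configuration once load crosses a threshold. The threshold value and backend names are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class RoutingDecision:
    billed_tier: str   # what the user is charged for
    served_model: str  # what actually handles the request

# Hypothetical threshold; fraction of cluster capacity in use.
LOAD_DOWNGRADE_THRESHOLD = 0.85

def route(requested_tier: str, cluster_load: float) -> RoutingDecision:
    """Pick a backend for a request. Billing follows the requested tier,
    while the served model can quietly change with load: the mismatch
    users complained about."""
    if requested_tier == "fast" and cluster_load > LOAD_DOWNGRADE_THRESHOLD:
        # Under pressure, shed cost by serving a smaller, cheaper configuration
        # without surfacing the downgrade to the user.
        return RoutingDecision(billed_tier="fast", served_model="fast-degraded")
    return RoutingDecision(billed_tier=requested_tier,
                           served_model=f"{requested_tier}-full")

# At 90% load, a request billed as Fast is served by the degraded backend:
print(route("fast", 0.90))
```

Nothing in the sketch is malicious; it is ordinary load shedding. The controversy is entirely in the gap between `billed_tier` and `served_model`, and in whether that gap is disclosed.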

The same script was played out again eight months later.

More ironic still, the day before GPT-5.5's official release, OpenAI's Codex mistakenly deployed its internal staging environment to production. Several Pro users grabbed screenshots; the mistake was rolled back within minutes, but the leak had already spread. Alongside GPT-5.5 itself, the model selector showed a series called Glacier (with a tooltip reading "Intelligence that moves continents"), a life-science model called Heisenberg, an unexplained model called Arcanine, and several other versions codenamed oai-2.1.

In other words, at the very moment OpenAI released GPT-5.5 as the "next generation," at least five or six parallel model lines were running internally, none of them public.

OpenAI itself has acknowledged this. In its official roadmap for 2026, it used a term long discussed in academic circles, capability overhang, to admit that a large gap exists between what current large models can actually do and what users actually get from them.

Sound familiar? It is almost the same line Anthropic used about Mythos. Even if the April 22 Codex leak was truly an accident, OpenAI's decision to write "capability overhang" into its own roadmap sends a clear message: we have plenty in reserve; draw your own conclusions.

You can only squeeze the toothpaste if what you hold far exceeds what you sell. The 24 hours around GPT-5.5's launch turned that premise into a live broadcast.

DeepSeek's patient wait

DeepSeek squeezes the toothpaste differently: it is not so much hiding capability as waiting for the right moment to deliver it.

V4 Preview is a 1.6T-parameter MoE with a 1M-token context window, offered in Pro and Flash variants and priced at 3.48 per 1M tokens: a fraction of GPT-5.5's price and an order of magnitude below Opus 4.7. Overseas independent reviewers summed it up in two sentences: performance close to, but slightly below, GPT-5.4 and Gemini 3.1 Pro, at a price that "breaks through the economics of cutting-edge laboratories."
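The article gives only DeepSeek's figure, so the comparison below is back-of-the-envelope: the GPT-5.5 and Opus 4.7 prices are placeholders chosen only to make "a fraction" and "a magnitude difference" concrete, not published list prices, and the workload is likewise hypothetical.

```python
# Back-of-the-envelope monthly cost at a hypothetical workload of 2B tokens.
MONTHLY_TOKENS = 2_000_000_000

# Price per 1M tokens. Only the V4 Preview figure comes from the article;
# the other two are placeholders standing in for the claimed gaps.
PRICES = {
    "DeepSeek V4 Preview": 3.48,
    "GPT-5.5 (placeholder)": 15.00,   # "a fraction of the price"
    "Opus 4.7 (placeholder)": 40.00,  # "a magnitude difference"
}

for model, per_million in PRICES.items():
    monthly_cost = MONTHLY_TOKENS / 1_000_000 * per_million
    print(f"{model:>24}: {monthly_cost:>10,.0f} / month")
```

At that scale the same workload costs roughly 7,000 on V4 Preview versus tens of thousands on the placeholder frontier prices, which is the whole commercial argument in one loop.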

Within DeepSeek's own lineup, however, V4 Preview is noticeably more expensive than the "bizarrely cheap" V3. Everyone understands: this is not the full-strength version.

The full story of DeepSeek V4 does not end with its release, nor does it begin with it.

It began with the R2 launch that never happened in 2025. R2 was originally scheduled for May 2025 but slipped to fall, then winter, while DeepSeek migrated its entire China-side infrastructure to Huawei's CANN ecosystem. For any lab, that is not a one-quarter project: compiler, operators, communication libraries, inference framework, MoE routing, everything had to be rewritten.

With V4, DeepSeek has for the first time officially added Ascend to its training hardware inventory. V4 is the first hybrid-trained version: Ascend's debut.

However, the next-generation chip optimized specifically for large-scale training, the Ascend 950DT, will not reach mass production until Q4 2026 on Huawei's roadmap. In other words, V4 can be trained on the previous-generation 950PR; but for a full-strength 1.6T MoE like V4 Pro Max to be both trained thoroughly and served at scale, DeepSeek still has to wait for the next generation to arrive.

The real engineering challenge is not "whether V4 can be trained" (it already has been) but "how to make V4 run fast, stably, and cheaply on Ascend."

The Ascend 950PR enters mass production in Q1 2026, with 1.56 PFLOPS of FP4 compute and 112 GB of memory, specifications that on paper rival or even surpass NVIDIA's H20. But running a single chip is one thing; keeping a whole cluster of supernodes stably serving millions of tokens of inference per second is another. The full-strength V4 Pro Max is designed precisely for that "supernode," the large-scale cluster configuration of the Ascend 950 series, which will arrive gradually in the second half of 2026.

This is a strategy entirely different from the other two. Anthropic's and OpenAI's version of toothpaste-squeezing is: we have a stronger model in hand, but we won't give it to you yet. DeepSeek's version is: the full-strength model is coming, and we are waiting for the moment when its price can drop another level.

This difference is important.

DeepSeek's real killer feature has never been "frontier performance" but "cutting the token price to a level others dare not match, while keeping performance good enough." V4 Preview is already adapted to NVIDIA cards and the Ascend 950PR, but full-capacity inference at production scale must wait for the supernodes. When they arrive, two things happen at once: V4 Pro Max's capability is fully unleashed, and inference costs and API prices drop another level. For a company that penetrates the market on price, the latter matters more than the former.

The "DeepSeek moment" that people were truly looking forward to, which happened in early 2025, did not reappear in this release. The V4 Preview release was actually a teaser; the real highlight was the "DeepSeek + Huawei Ascend" moment in the second half of the year.

Seen this way, Liang Wenfeng's team is not being forced into hiding; it is making a commercially disciplined choice: launch the strongest version in the scenario where it has the most leverage, the first days after domestic supernodes deploy at scale. Until then, V4 Preview keeps reinforcing the cost-effectiveness narrative.


DeepSeek's mission has never been the single-strength narrative of putting a domestic large model at the top of the leaderboards, but the systems narrative of making all four lines, chips, training, inference, and pricing, run smoothly at the same time. The latter matters far more than the former.

Just a few days ago, Jensen Huang said on Dwarkesh Patel's podcast that if DeepSeek were to debut on Huawei's chips, "that would be a horrible outcome for our nation."

Nvidia still controls the top tier of computing power. But by Jensen Huang's own framing of a "five-layer AI cake" (energy, chips, infrastructure, models, applications), China's large-model industry now has viable domestic options at every layer, and the gap is narrowing at a visible pace. With the final puzzle piece, chips, in place, the story DeepSeek's open-source models tell is bigger than the American one: a crucial step toward global intelligence equality without prohibitive cost.

It would let the rest of the world bypass advanced compute controlled by a hegemonic power and move toward a highly efficient intelligent society.

End

Anthropic's "hiding" is proactive. They have Mythos, but haven't released them, citing safety as the reason.

OpenAI's "hidden" aspect is its structure. They have a Pro version, but they don't usually offer it to you, citing infrastructure and price tiers as the reasons.

DeepSeek's "concealment" is necessary. It concerns a whole narrative paradigm of the leap in social intelligence.

Viewed from another angle, this is exactly Liu Cixin's dark forest: in this pitch-black forest of intelligence, no top hunter will fire first.

To expose yourself is to have nothing left hidden, no trump cards left, and to become a sitting target for every other hunter.

No one knows who will fire the first fatal shot. But one thing is certain: every model you use today is not its true form.
