Just now, the Gemini 3.5 was leaked ahead of schedule!
According to the latest news from user Lentils, the Gemini 3.5 Pro checkpoint codenamed "Cappuccino" has begun production.
Just a few hours ago, the rumor was that it was Gemini 3.2, but unexpectedly it has been replaced by Gemini 3.5.
By skipping a generation in the naming convention from 3.2 to 3.5, Google clearly wants to tell a bigger story about I/O.
Major Gemini update: Google unleashes its killer app.
The day before, well-known leaker can was the first to show off the first batch of output.
One is an interactive blueprint disassembly of the DualShock 4 controller, and the other is a vector illustration of a pelican riding a bicycle, complete with a 7D custom panel where frame color, lighting, headgear, basket content, and pedaling speed can all be switched in real time.
Judging from the screenshots, this is no longer a simple SVG, but a complete interactive web application generated by prompt!
Abacus.AI CEO Bindu Reddy then released even more explosive data—
3.2 Flash achieves 92% of GPT-5.5 in encoding and inference, but at a cost 15 to 20 times cheaper.
Furthermore, Google's brand-new all-time agent, "Gemini Spark," has also been revealed.
As you can see, it can not only be on standby 24/7 to help you manage emails and run tasks, but it may even place orders for you without you asking.
However, at this very moment, Alex Heath's exclusive revelation poured cold water on the situation—
The performance of the new Gemini can at best match OpenAI's GPT-5.5...
One prompt with four solutions: Gemini's "laziness" is cured.
Let's look at the good news first.
Previously, when Gemini generated SVGs, the most common complaint in the community was simply, "lazy." Given a prompt, it would produce a perfunctory result.
But this time it's different.
With just a simple hint, user Lentils provided Gemini with four distinct and highly detailed Robot SVGs.
The leaked 3.5 Flash memory at the same time also confirms this trend.
Anonymous benchmarking by LM Arena shows that Flash has surpassed 3.1 Pro in SVG generation, interactive 3D encoding, and animation processing.
In other words, Google's distillation and sparsification techniques are paying off, compressing cutting-edge models into lightweight versions without causing a significant quality precipitate.
Google Agent is incredibly bold, managing your emails and paying for your money.
Another major leak on the same day was "Gemini Spark BETA".
According to the leak, Spark is positioned as "your daily AI agent, on standby 24/7".
An AI agent that operates 24/7 helps you manage your inbox, execute online tasks, and manage multi-step workflows.
Spark's list of data sources is breathtaking.
Connected Google apps, skills modules, chat history, scheduled tasks, websites you're logged into, Personal Intelligence, and location information.
Gemini will share your name, contact information, files, preferences, and other information with third parties to complete tasks.
In addition, to maintain session continuity, the system also saves remote browser data, including login credentials and remote code execution data.
However, it is worth noting that while Spark is designed to request permission before performing sensitive operations, it "may share your information or complete purchases without asking."
In other words, it might place an order without asking you, or it might share the information without asking you.
Spark was originally an upgraded version of Google's Agent, codenamed "Remy," which was previously only available to AI Ultra subscribers.
From Remy to Spark, Gemini's Agent has evolved from "a single function" to a "24/7 digital life concierge".
This directly competes with Anthropic's upcoming managed agent Conway and OpenAI's already launched 24/7 agent platform.
Six months ago, they were at the top; six months later, they can't even touch the forefront.
That concludes the good news.
According to confirmations from multiple sources obtained by Alex Heath, the new Gemini, to be released next Tuesday, will likely fall within the GPT-5.5 range, significantly behind the Mythos.
Back then, the newly released Gemini 3, with its LMARaena 1501 Elo processor, practically swept the top spot on all major leaderboards.
Six months later, with the release of GPT-5.5, Opus 4.7, and Mythos, the landscape has been completely rewritten.
According to the evaluation by the UK AI Security Institute, Mythos is the first model to pass both of its cybersecurity test scopes simultaneously, while GPT-5.5 only passed one.
AISI even admitted that its evaluation framework is falling behind Mythos's capabilities.
Returning to Google, according to the latest interface of the model selector found by user Fandu, the new Gemini will likely natively support the integration of third-party tools like MCP, and the Thinking mode will also be completely redesigned.
As you can see, in addition to the well-known models such as 3.1 Flash-Lite, 3 Flash, and 3.1 Pro, there is a new category that we have never seen before: "MCP Tool Testing", which means "models that can be used for MCP tool testing".
The thinking mode has also changed from the original independent thinking mode to a global switch, with two levels: Standard (suitable for most problems) and Extended (for solving complex problems).
Programming, the battlefield that causes DeepMind the most anxiety.
Heath's revelations were most strongly worded regarding the programming aspect.
He said that DeepMind is facing real pressure, especially in terms of catching up in terms of programming capabilities.
The target is clear: Anthropic. Over the past year, Claude has firmly established himself as the default choice among developers.
The new Gemini will include programming improvements, but no one in Heath's sources believes it will bring about a qualitative change.
Google's AI programming platform Antigravity is widely used internally, but it has failed to break into the external market.
A 6% developer adoption rate in 4 months isn't slow for an IDE, but it's significantly slower than the momentum of Claude Code and Codex.
Where is the problem?
An XDA monthly review tested three tools to perform the same task.
Claude Code accurately understood the complex creative hints on its first try. Antigravity's output, on the other hand, resembled a doodle made with Microsoft Paint.
In addition, Antigravity's pricing strategy is also a headache for developers.
Google has adjusted its pricing model multiple times, from free previews to a credit system, and complaints on community forums about not receiving reminders when credits are used up have been constant.
But the most crucial point is that AI programming has now completely broken out of its niche.
Whether it's Claude Cowork or OpenAI's Codex, both are so easy for people who can't code to use that they can run incredibly smoothly.
Product managers describe requirements using natural language and directly obtain a working prototype. Designers then submit their Figma drafts and receive the front-end code.
However, so far, no Google product has been able to enter this conversation.
However, Haider's comments offered another perspective.
Google may not be aiming to win by competing on the same track as others; their greater focus is on building a more powerful multimodal system, which will take time.
The flywheel to ASI: all three companies flooring the accelerator simultaneously.
Although the model cannot catch up, Google has a billion-level distribution portal and an 24/7 agent.
Once Spark is deployed, users' emails, calendar events, shopping data, and browsing data will feed back into the next generation of Gemini training.
This is a strategy that OpenAI and Anthropic find difficult to replicate.
But the competitors weren't idle.
Just yesterday, OpenAI added an UltraFast mode to Codex, increasing speed by 2-3 times, and also launched a subsidy campaign, offering two months of free service to companies that switch within 30 days. Within three hours, 2,000 developers responded.
Anthropic also released Opus 4.7 Fast mode, increasing Claude Code credit limits by 50%.
This subsidy war may seem like a competition to grab developers, but the underlying logic is much deeper.
The development of GPT-5.6 was almost certainly carried out with deep involvement from GPT-5.5. The code written by AI feeds back into the training of AI, and whoever controls the users of the programming tools controls the accelerator of this cycle.
The three companies were all accelerating on three different tracks at the same time.
OpenAI crushes competitors with its rapid iteration speed, releasing a new version every three weeks. Anthropic achieves legendary status through model quality, while Mythos redefines the cutting edge. Google, through distribution and agent-based encirclement, has crammed AI into the phones of a billion people.
No one is slowing down. The flywheel leading to ASI has already started spinning.
For those who use these tools every day, this arms race among the three giants may be the most worthwhile thing to do in 2026.
Subsidies are being increased, amounts are rising, models are becoming more robust, and prices are falling.
The only question is, did you bet your workflow on the right track?
References:
https://x.com/alexeheath/status/2054747125616169229
https://www.testingcatalog.com/google-prepares-gemini-spark-ai-agent-ahead-of-io-launch/
https://x.com/Lentils80/status/2054628116094501377
This article is from the WeChat official account "New Intelligence" , edited by: Sleepy, and published with authorization from 36Kr.



