The smaller Gemini beats GPT-5.2 and simulates a Windows operating system in under a minute

12-18

With the release of Gemini 3 Flash, Google has shown the AI community what having it both ways looks like: only children make choices, adults want it all (doge).

One formula sums up the new model: Gemini 3 Flash = Pro-level intelligence + Flash-level speed + a lower price.

In terms of speed, it is almost three times faster than Gemini 2.5 Pro, and in hands-on testing it feels very smooth.

In terms of intelligence, it has outperformed a number of top models, including Gemini 3 Pro and GPT-5.2, in several classic informal tests.

When asked to count the fingers in a picture, GPT-5.2 immediately answered "5," while Gemini 3 Flash spotted the trap and gave the correct answer, "6."

In the "pelican riding a bicycle" drawing test, Gemini 3 Flash (top right) does significantly better than Gemini 2.5 Pro (left) and Gemini 3 Pro (bottom right), and the images shown are the best results from repeated attempts.

The visual recognition test checked whether the models could identify "Google's PR representative," Logan Kilpatrick.

Gemini 3 Flash got it right on the first try, while Gemini 3 Pro mistook him for Jack Krawczyk, the former head of Gemini (who left the company in April of this year and joined Meta).

In further tests, Gemini 3 Flash showed impressive all-round performance.

Although it is called "Flash," it is actually Google's most capable agentic model to date.

Take note: the model is now available to all users worldwide:

Regular users can access it through the Gemini app and AI Mode in Google Search; developers can use it through the Gemini API in Google AI Studio, Gemini CLI, and Google Antigravity, Google's new agentic development platform.

Enterprise customers can access it through Vertex AI and Gemini Enterprise.

Overall, Gemini 3 Flash inherits Gemini 3 Pro's complex reasoning, multimodal and visual understanding, vibe coding, and agentic task handling, but responds faster.

Google has officially described it as its "best model to date in terms of agent workflows."

Without further ado, let's see what Gemini 3 Flash can do and how it performs in practice.

For example, it takes less than a minute to build a fully functional, good-looking simulation of a Windows operating system (the video is not sped up).

The user who shared the test said, "This is a breathtaking model."

It can also generate games directly. The prompt one user gave was:

Create a Grand Theft Auto 6 game for me using code, and make it as realistic as possible, adding any features you choose.

The game feels right, but the graphics still have room for improvement.

With simpler games, however, the results are quite good.

Here is what it produces when asked to generate a weather card:

The design is visibly more polished, and the interactions are richer.

Finally, let's run a simple hands-on test and have it generate a website introducing itself.

In testing, the speed test feature on the website actually works; the page is not just a showy front-end mock-up:

And clicking the "Experience Now" button does indeed redirect you to Gemini's official website.

After seeing all this, what do you think of Gemini 3 Flash's performance?

It beats Gemini 2.5 Pro on both capability and speed, at a much lower price.

In addition, Google's official evaluations show that Gemini 3 Flash's main selling point is "speeding up without sacrificing intelligence."

In terms of performance, it not only significantly surpasses Gemini 2.5 Pro, but also slightly outperforms Gemini 3 Pro on the professional multimodal benchmark MMMU-Pro and the complex reasoning benchmark ARC-AGI-2.

More importantly, it pushes out the Pareto frontier of performance, cost, and speed: it is 3 times faster than Gemini 2.5 Pro while using 30% fewer tokens on average.

In terms of price, Gemini 3 Flash offers better value for money than previous generations of models.

The price is $0.50 per million input tokens and $3 per million output tokens (audio input remains at $1 per million input tokens).

While slightly more expensive than Gemini 2.5 Flash ($0.30 per million input tokens and $2.50 per million output tokens), the price remains quite attractive considering its performance and speed.

(Gemini 2.5 Pro is priced at $1.25 per million input tokens and $10 per million output tokens.)
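
To make the price gap concrete, here is a minimal sketch that applies the per-million-token prices quoted above to a made-up workload. The workload size (2 million input tokens and 500,000 output tokens) is purely an illustrative assumption, the model names are just dictionary labels rather than official API identifiers, and the calculation ignores audio input and any other pricing tiers.

```python
# Prices in USD per one million tokens, as quoted in this article.
# The keys are plain labels for this example, not official API model IDs.
PRICES = {
    "gemini-3-flash": {"input": 0.50, "output": 3.00},
    "gemini-2.5-flash": {"input": 0.30, "output": 2.50},
    "gemini-2.5-pro": {"input": 1.25, "output": 10.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of a text-only workload for the given model."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload, chosen only for illustration.
workload = {"input_tokens": 2_000_000, "output_tokens": 500_000}

for model in PRICES:
    print(f"{model}: ${request_cost(model, **workload):.2f}")
```

For this hypothetical workload the script prints $2.50 for Gemini 3 Flash, $1.85 for Gemini 2.5 Flash, and $7.50 for Gemini 2.5 Pro, so the new model costs a third of what Gemini 2.5 Pro would for the same token counts.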

With this release, the Gemini 3 family is complete, alongside the previously released Pro and Deep Think versions.

As for thinking modes, the developer documentation lists four thinking levels for Gemini 3 Flash: minimal, low, medium, and high.
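
As a rough sketch of how a developer might select one of these levels, the helper below validates the level name and builds a request configuration fragment. Only the four level names come from the article; the function name and the `thinkingConfig` / `thinkingLevel` field names are assumptions modelled on the Gemini API's JSON style and should be checked against the current developer documentation before use.

```python
# The four thinking levels named in the article.
THINKING_LEVELS = ("minimal", "low", "medium", "high")


def build_generation_config(thinking_level: str) -> dict:
    """Build a generation-config fragment for the chosen thinking level.

    The field names below are assumptions for illustration, not a
    confirmed API schema.
    """
    if thinking_level not in THINKING_LEVELS:
        raise ValueError(
            f"unknown thinking level {thinking_level!r}; "
            f"expected one of {THINKING_LEVELS}"
        )
    return {"thinkingConfig": {"thinkingLevel": thinking_level}}


# Example: ask for the lightest level of reasoning for a latency-sensitive task.
print(build_generation_config("minimal"))
```

Validating the level locally means a typo fails immediately rather than surfacing as an error from the remote service.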

One look at how the generated images change from level to level is enough to see the difference (doge):

One More Thing

Interestingly, right after Gemini 3 Flash was released, Google started a live stream of Pokémon Crystal.

The two contestants are Gemini 3 Flash and Gemini 3 Pro.

Although the final results are not yet in, Gemini 3 Pro appears to be in the lead at this stage.

Some viewers were pleasantly surprised to find that Gemini 3 Pro seems to show a degree of systems-level thinking in the game.

Those who are interested can keep watching to see whether there is a plot twist.

Reference links:

[1]https://x.com/OfficialLoganK/status/2001428651121025391?s=20

[2]https://x.com/simonw/status/2001424152763470238?s=2

[3]https://blog.google/products/gemini/gemini-3-flash/

This article is from the WeChat public account "Quantum Bit," which focuses on cutting-edge technology, and is published with authorization from 36Kr.
