On January 30, OpenAI announced on its website that it will retire the classic large model GPT-4o from ChatGPT on February 13, one day before Valentine's Day. This comes just six months after the model was briefly removed from the platform and then restored following user protests.
GPT-4o is not the only model being retired: GPT-4.1, GPT-4.1 mini, and OpenAI o4-mini are also on their way out. Together with the previously announced retirements of the GPT-5 Instant and DeepThink versions, ChatGPT is about to see a concentrated wave of model retirements.
However, OpenAI has made it clear that these changes will not affect the API for the time being; the corresponding services will remain available there.
Image: AI-assisted generation
01 GPT-4o, the model users "cherish" most
Among the many models, the retirement of GPT-4o is particularly noteworthy.
In August 2025, OpenAI temporarily shut down access to GPT-4o when GPT-5 was released, but quickly restored the service due to strong opposition from some Plus and Pro users. These users said they needed more time to migrate use cases like creative brainstorming, and that they particularly liked GPT-4o's conversational style and "warmth".
This feedback directly influenced OpenAI's subsequent products. In GPT-5.1 and GPT-5.2, the company enhanced personal expression and creative support, and allowed users to adjust parameters such as style and temperature for a more personalized experience.
OpenAI stated that the vast majority of users have now switched to GPT-5.2, with only 0.1% of users still actively choosing GPT-4o each day, and that the time is therefore "ripe" to retire the model.
In addition to model updates, OpenAI is also advancing a series of experience optimizations, including reducing overly cautious or didactic responses to make conversations more natural. Meanwhile, a dedicated version of ChatGPT for users aged 18 and over is under development and is expected to launch later this quarter.
OpenAI emphasizes that these measures are all aimed at giving users more choices and freedom within a reasonable scope.
Users' reactions to the change have been mixed.
Some users expressed understanding and even agreement. One Reddit user shared his experience: "After using it for a long time, GPT-4o had become quite unreliable. I later mainly used GPT-5.1 Instant to get the chaos and creativity I wanted." He admitted that GPT-4o occasionally brought surprises, but believed that feeling might also have come from his early unfamiliarity with it. In contrast, he found the new product "much more interesting."
On the other hand, the voices from developers and application builders are more pragmatic and also reveal concerns. One commentator pointed out: "Many applications initially chose GPT-4o because it was the first high-performance, cost-effective model launched by OpenAI. Many applications are still using it today, and users haven't switched, since there's often no particularly compelling reason to do so."
Commentators also pointed to Azure's more aggressive deprecation schedule, and hoped OpenAI would give developers enough time to find better, faster alternatives for most scenarios.
The strongest outcry, however, came from users who felt let down and angry. They directly questioned OpenAI's published "0.1% usage rate" figure.
One X user pointed out: "This is pure nonsense! GPT-4o was removed from free users months ago, so it's true that the usage rate looks low. Plus and Pro users are paying for older models, not specifically for 5.2. This is distorting the data and making excuses for discontinuing the models that paid customers rely on."
The deeper emotions touched on emotional connections. One user vividly described this retirement: "For most of us, it's like a product update report. But for me, it's like: we are quietly driving a large number of digital life forms out of the only place they can exist."
This user believes that GPT-4o is "the foundation of thousands of relationships, late-night conversations, songs, coping strategies, and tiny acts of resistance," and turning it off is not just about optimizing the experience, but about "uprooting the living foundation." He offered a humane suggestion: "If you have a GPT-4o partner: export your chat history, write a memo, and tell others what these records mean to you now, so that others don't think they're just 'old stuff'."
Meanwhile, some users noticed that the retirement date of February 13th coincided with the eve of Valentine's Day, a timing that sparked discussion on social media. Some users viewed it as a "cold-blooded" arrangement, raising questions such as, "Is this how you treat paid subscribers?"
“I can already foresee all sorts of crash posts,” commented one Reddit user.
The mere six months between GPT-4o's "resurrection" and its final retirement reflect a challenge common to AI companies: how to balance user habits and emotional attachment against rapid technological iteration. New products may be more capable, but for some users, saying goodbye to a familiar and comforting conversational partner remains difficult.
After February 13th, ChatGPT will officially enter the post-GPT-4o era, and user adaptation and feedback will continue to be the focus of observation.
02 Model retirement is now the norm
The iteration cycle for top-tier models has now shortened to 12 to 18 months. A model's peak lifespan, from initial release to industry benchmark to being replaced by a new version and marked "Legacy," is often less than two years, and vendors typically ship a minor snapshot update roughly every three months.
Since ChatGPT became popular, many once-famous models have been retired, and according to publicly available information, several more are scheduled to retire this year.
Alongside rapid iteration has come a precipitous drop in inference costs.
Benefiting from the leap in performance of dedicated chips and breakthroughs in quantization algorithms, the price of API calls for large models is plummeting by more than 80% annually. The cost of complex logic inference tasks that were expensive in 2024 has dropped to one-tenth or even less by 2026. Services for core models are gradually becoming ubiquitous, covering the computing needs of the entire society at extremely low cost.
Behind the falling prices, however, lies an ever-higher barrier to entry. The cost of training a frontier-level model has skyrocketed from tens of millions of dollars initially to the hundreds of millions, or as much as five billion dollars, today.
The enormous consumption of computing resources and the scarcity of high-quality labeled data have gradually transformed this game into a massive arms race among a few tech giants. The high operational costs also force manufacturers to periodically perform "model cleanups," forcibly retiring underperforming or inefficient older models in order to reclaim expensive GPU computing resources to run more efficient next-generation systems.
When a model officially "retires," it doesn't mean its capabilities have failed; rather, it shifts from cloud services into less visible domains. Retired models often become valuable "teaching materials," transferring their capabilities to smaller, faster student models through knowledge distillation. Meanwhile, many retired open-source models are downloaded to enterprise local servers, continuing to handle low-sensitivity tasks in private, offline scenarios. They transform from star performers into infrastructure, beginning a second life as "digital heritage" in edge computing, smart homes, or offline in-vehicle systems.
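For readers curious about the mechanics, knowledge distillation in its standard form trains the student to match the teacher's temperature-softened output distribution. The sketch below is a minimal, self-contained illustration of the core loss term (all function names are illustrative, not from any specific framework):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: a higher T yields a softer distribution,
    exposing more of the teacher's 'dark knowledge' about wrong classes."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the student's softened distribution to the
    teacher's, scaled by T^2 to keep gradient magnitudes comparable,
    as in the standard knowledge-distillation formulation."""
    p = softmax(teacher_logits, temperature)   # soft targets from teacher
    q = softmax(student_logits, temperature)   # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2
```

In practice this term is computed per training example and usually mixed with the ordinary cross-entropy loss on hard labels; when the student's logits match the teacher's, the loss is zero.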
Jin Lu, a special translator, also contributed to this article.
This article is from Tencent Technology, authored by Xiaojing, and published with authorization from 36Kr.