(OpenAI CEO Sam Altman unveiling GPT-5. Image source: OpenAI official website livestream)
Each flagship model from OpenAI, the star American AI startup, tends to set the direction of global technology for the following six months. On August 7, US West Coast time, the company released GPT-5.
OpenAI CEO Sam Altman described GPT-3 as feeling like talking to a high school student: occasionally brilliant, but often frustrating. GPT-4o was more like conversing with a college student, showing genuine intelligence and practical usefulness. GPT-5, he said, is like talking to an expert, a PhD-level specialist in any field, ready at any time to help you achieve any goal. GPT-5 can not only chat but also get things done for you.
GPT-5 is a system composed of two models: a long-thinking version capable of deep reasoning and a high-efficiency version for fast Q&A. The system automatically routes each user query to the appropriate version.
Performance benchmark results disclosed on OpenAI's website show that GPT-5 surpasses the previous flagship model, OpenAI o3, with the long-thinking version of GPT-5 producing roughly one-sixth as many hallucinations as o3. Artificial Analysis, an international market research firm that runs long-term performance benchmarks on mainstream global models, ranked GPT-5 as the highest-performing model in the world as of August 8.
Alongside the performance gains, GPT-5's inference computing cost has dropped significantly. Test results published on OpenAI's website show that GPT-5 outperforms OpenAI o3 while reducing token output by 50%-80% (a token is the basic unit of AI inference; it can be a word, punctuation mark, number, or symbol).
Second, video generation models will mature and become widely available, with a breakthrough expected by the end of this year. This means Agents will not only understand the world but also generate content and simulate processes in more dynamic and intuitive ways.
Third, the ability to handle complex multi-step tasks will improve significantly, with a major breakthrough expected by the end of this year. This is a key step toward Agent maturity: once models can stably and reliably plan and execute tasks involving dozens or even hundreds of steps, the problem of Agents "abandoning" tasks midway will be fundamentally solved.
In Wu Di's view, most Multi-Agent applications today "look like toys", but based on breakthroughs along these three technical lines, he offered a final judgment: Multi-Agent application accuracy will improve markedly by the end of 2025. Once AI applications with visual understanding and reasoning capabilities become widespread by then, a single basic task may consume more than 10,000 tokens, and total token consumption will climb rapidly.
A New Round of Model Competition Begins
The "flywheel" of models, applications, and computing power rests on continuously improving model capability. In 2025, the large-model competition among global tech companies has grown increasingly intense, and the pace of model iteration has accelerated.
Knowledge in the large-model field iterates on a timescale of "months" or even "weeks": a single paper or model can overturn an existing technical route. A senior algorithm engineer told Caijing that in this field, numerous academic papers are published every week, new technical breakthroughs arrive almost every month, and leading models are surpassed roughly every three to four months.
According to Caijing's incomplete statistics, within 220 days from January 1 to August 8, 2025, 11 tech companies participating in model competition (including Alibaba, ByteDance, Tencent, Baidu, Huawei, DeepSeek, Moonshot, Google, OpenAI, Anthropic, and xAI) released or iterated at least 32 large models, averaging a new model every 6.9 days.
Basic model update cycles are becoming even shorter. OpenAI's GPT-4.5 to GPT-5 update cycle is 161 days; OpenAI's o1 to o3 update cycle is 132 days; xAI's Grok 3 to Grok 4 update cycle is 142 days; DeepSeek-R1's two versions have a 128-day update cycle; DeepSeek-V3's two versions have an 87-day update cycle; Google Gemini 2.5's two versions have an update cycle of just 42 days.
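The update cycles and release cadence above are simple date arithmetic, and the article's figures can be checked directly. A minimal sketch, assuming the publicly announced release dates for GPT-4.5 (February 27, 2025), o1 (December 5, 2024), and o3 (April 16, 2025), which are not stated in the article itself:

```python
from datetime import date

# Release dates: GPT-5's date is from the article; the others are
# assumptions based on public announcements.
gpt45 = date(2025, 2, 27)   # GPT-4.5 (assumed)
gpt5 = date(2025, 8, 7)     # GPT-5 (from the article)
o1 = date(2024, 12, 5)      # o1 (assumed)
o3 = date(2025, 4, 16)      # o3 (assumed)

print((gpt5 - gpt45).days)  # 161, matching the article's GPT-4.5 -> GPT-5 cycle
print((o3 - o1).days)       # 132, matching the article's o1 -> o3 cycle

# Average release cadence: 32 models over the 220 days from Jan 1 to Aug 8
print(round(220 / 32, 1))   # 6.9 days per new model
```

Both computed cycles and the 6.9-day average agree with the figures cited in the article.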
The release of GPT-5 will push Chinese and American tech companies into a new round of large-model competition: training stronger models and acquiring ever larger-scale computing power, a path that will not change in the short term.
The current development of large models rests on three key cornerstones: data, algorithms, and computing power. Progress depends on "brute-force miracles", that is, massive resource investment to achieve performance gains.
In June this year, Chen Yiran, a professor of electrical and computer engineering at Duke University, told Caijing that the basic route of AI evolution is still "brute-force miracles". People keep asking when this approach will peak and when its potential will be exhausted, and academia is searching for new paths; for now, however, no other method has proven as effective, so the industry has no choice but to continue down the "brute-force miracles" road.
Chinese tech companies have, for now, caught up: Alibaba's Qwen 3, updated in July this year, temporarily matched OpenAI's o3, released in April. The release of GPT-5 means a new round of catch-up is about to begin.
Caijing has learned that Tongyi Laboratory, Alibaba's large-model R&D department, has set a core goal this year of staying ahead in model performance, download volume, and the number of derivative models.
Zhou Jingren, Alibaba Cloud CTO and head of Tongyi Laboratory, told Caijing at the ModelScope Developer Conference in June that model performance must be competitive enough to prove itself in authoritative and widely recognized benchmark tests.
He also said that Tongyi Laboratory treats tracking and assessing global frontier technology trends as part of its daily work: it follows not only papers from top AI conferences (AAAI, IJCAI, ICML, NIPS, etc.) but also global open-source communities, technical blogs, and product releases from leading AI companies.
The aforementioned senior algorithm engineer believes that in the large model field, any performance advantage is only temporary, and competition is continuous.
This article is from the WeChat public account "Half-Cooked Finance" (ID: Banshu-Caijing), authors: Wu Junyu, Zhou Yuan, authorized by 36kr for publication.