GPT-4o mini is very cheap! 10 peers compete with OpenAI, who can compete with OpenAI?
Compiled by Li Shuiqing
Editor | Xinyuan
The new version of GPT-4o has dropped to 1 yuan per million tokens . It is still OpenAI that beats OpenAI!
Zhidongxi reported on July 19 that on the evening of July 18, OpenAI launched its cheapest model , GPT-4o mini . We immediately compared the latest pricing of large-model APIs from 10 domestic and foreign manufacturers including OpenAI, and found that other peers are under considerable pressure this time.
The GPT-4o mini API input price is 15 cents (about 1.09 yuan) per million tokens , and the output price is 60 cents (about 4.36 yuan) per million tokens , which is more than 60% cheaper than GPT-3.5 Turbo; but its capabilities have greatly surpassed GPT-3.5 Turbo, achieving a good score of 82% in the MMLU test, and surpassing GPT-4 in terms of chat preference in the LMSYS ranking.
Previously, many developers turned to small models such as Google's Gemini 1.5 Flash and Anthropic's Claude 3 Haiku due to the high prices of large models. Now, these models have been "sniped" by GPT-4o mini.
▲GPT-4o mini is much more cost-effective than other small models (Source: Artificial Analysis)
As shown in the table below, according to Zhidongxi statistics, the current pricing of GPT-4o mini is significantly lower than the Gemini 1.5 Flash 's input price of 2.5 yuan/million tokens and output price of 7.6 yuan/million tokens , and is also lower than Claude 3 Haiku 's input price of 1.8 yuan/million tokens and output price of 9 yuan/million tokens , and its performance crushes them in all aspects.
At the same time, as can be seen from the above table, domestic manufacturers such as Deepin Quest, Zhipu AI, ByteDance, Alibaba Cloud, Baidu, ByteDance, Tencent Cloud, iFlytek, etc. have successively significantly reduced the prices of their models in June, but now their price advantages have also been weakened.
For example, the input price of Alibaba Cloud Qwen-Turbo is 2 yuan/million tokens, and the output price is 6 yuan/million tokens; the input price of Baidu ERNIE 3.5 series is 12 yuan/million tokens, and the output price is 12 yuan/million tokens; the input price of ByteDance Doubao-pro-128k is 5 yuan/million tokens, and the output price is 9 months/million tokens... Compared with GPT-4o mini, the cost-effectiveness is questionable.
OpenAI CEO Sam Altman said that GPT-4o mini is " moving towards intelligence that is so cheap that it cannot be measured ."
▲OpenAI CEO Sam Altman posted on social platform X
According to the OpenAI announcement, the token cost of GPT-4o mini has been reduced by 99% compared to the text-davinci-003 model of GPT-3 , which has relatively basic functions in 2022.
Currently, GPT-4o mini is available on ChatGPT for free and is expected to gradually replace GPT-3.5 .
01 .
Surpassing GPT-3.5 Turbo to become the smallest model
GPT-4o mini has the characteristics of low cost and low latency, and can handle a variety of tasks, such as: chaining or parallel model calls, processing large amounts of context, fast real-time text interaction, etc.
It has a contextual processing capability of 128k tokens, already supports text and visual input in the API, and supports 16k output tokens, and will be expanded to video and audio input/output in the future.
In multiple global authoritative benchmark tests, GPT-4o mini surpassed its own GPT-3.5 Turbo and a number of small models.
On the MMLU Text Intelligence and Reasoning benchmark, GPT-4o mini leads with a score of 82.0% , while Gemini Flash and Claude Haiku scored 77.9% and 73.8%, respectively.
In the MGSM mathematical reasoning test, GPT-4o mini scored a high score of 87.0% , far exceeding Gemini Flash's 75.5% and Claude Haiku's 71.7% .
In the HumanEval encoding performance test, GPT-4o mini also led with an excellent score of 87.2% , while Gemini Flash and Claude Haiku were 71.5% and 75.9% respectively.
In the field of multimodal reasoning , GPT-4o mini scored 59.4% in the MMMU evaluation, also ahead of Gemini Flash's 56.1% and Claude Haiku's 50.2% .
GPT-4o mini significantly outperforms GPT-3.5 Turbo on tasks such as extracting structured data from receipts or generating high-quality email responses based on conversation history.
GPT-4o mini was just released last night, and AI expert Andrej Karpathy said on the social platform X: " The competition for the size of large language models is intensifying... regressing ! I bet we will see very small models, even models at the parameter level of GPT-2 , that are already very good at 'thinking'" and reliable. "
▲AI expert Andrej Karpathy posted on social platform X
02 .
API input price as low as 1 yuan ChatGPT is now available for free
GPT-4o mini is now officially launched and integrated into the Assistants API, Chat Completions API, and Batch API for developers to use.
In terms of cost, the input price of GPT-4o mini is 15 cents (about 1.09 yuan) per million tokens, and the output price is 60 cents (about 4.36 yuan) per million tokens, which is roughly equivalent to the cost of processing about 2,500 pages of standard book content .
OpenAI plans to release fine-tuning capabilities for GPT-4o mini in the next few days.
For ChatGPT users, whether they are Free, Plus or Team Edition, they will be able to experience GPT-4o mini from today, which will gradually replace GPT-3.5 . Enterprise users will also be able to access this upgrade from next week.
OpenAI said that GPT-4o mini inherits the same strict security protection mechanism as GPT-4o. It filters out bad information in the pre-training stage, and uses technologies such as reinforcement learning and human feedback (RLHF) after training to make the model behavior more in line with security policies.
As the first model to apply OpenAI's instruction hierarchy method , GPT-4o mini demonstrates stronger defense capabilities in the API, effectively resisting risks such as jailbreak attacks, real-time injection, and system real-time extraction.
OpenAI will continue to monitor the use of GPT-4o mini and take immediate measures to improve model security if new risks are discovered.
OpenAI attached the names of nine team leaders at the end of the announcement. Among them, Shengjia Zhao, Hongyu Ren, Haitang Hu, Mianna Chen, and Kevin Lu are all Chinese , and they graduated from well-known domestic universities such as Tsinghua University, Peking University, and Tongji University.
03 .
Conclusion: Model size competition reverses and price war intensifies
The price war for large models has intensified. Compared with the text-davinci-003 model of GPT-3, which has relatively basic functions in 2022, the token cost of OpenAI's GPT-4o mini has dropped by 99%, which is a continuation of the climax of the industry price war in June.
Every new release by OpenAI puts pressure on its peers. On the same day, Nvidia and French AI unicorn Mistral jointly released a small cup model called Mistral NeMo, which outperforms Llama 3 8B. The emergence of small models with lower costs and higher performance will promote the seamless integration of AI into more daily scenarios, and also allow the industry to think about the implementation of AI from a different perspective.
This article comes from the WeChat public account "Smart Things" (ID: zhidxcom) , author: Li Shuiqing, and is authorized to be published by 36Kr.




