Is OpenAI Using GPT-5 to Help 700 Million Users Quit Their Internet Addiction? An In-Depth Evaluation of GPT-5

36kr
08-12

OpenAI never expected that GPT-5, which took two and a half years to train, would teach the company a lesson the moment it was released: take too big a step and you can easily hurt yourself. Nor did users expect that the long-awaited GPT-5 would turn out to help them quit their internet addiction.

After a launch event that ran for more than an hour, users tried the new ChatGPT and found that it had "lost its flavor". The most troublesome part was that OpenAI retired all of the old models, including GPT-4o and the o series, when it released GPT-5. This seemingly ordinary version "upgrade" caused a major problem: people turned out to be deeply attached to specific models.

Large numbers of users in China and abroad posted complaints about GPT-5 on social media, with a single demand: give me back GPT-4o!

Some users with mental health conditions relied on GPT-4o to cope with problems at work and in daily life. The release of GPT-5 completely disrupted their routines.

For users who depended on GPT-4.5's excellent writing, GPT-5 is far from a replacement.

For many users, ChatGPT is no longer just a tool but an indispensable part of their lives. What they need is not just the tokens OpenAI provides but, more importantly, the soul behind them.

To them, GPT-5 is like a new "guest" in the household: someone they do not yet know well.

Netizens remarked on how surreal it was that the internet was full of people attacking GPT-5 because they had lost GPT-4o. In the movie 'Her', the protagonist loses his AI assistant and can no longer eat or drink. 13 years ago it was a sci-fi film; 13 years later it has become a documentary.

Unexpectedly, after just 3 years of existence, ChatGPT has taught users what it means to cherish something only after losing it. Left without a choice, users could only vent their frustration at GPT-5 and OpenAI.

Users kept demanding that OpenAI make GPT-4o a permanent option, threatening to cancel their subscriptions otherwise.

01 Extinguish the Fire, Then Patch the Hole

... [rest of the text continues in the same manner]

Task One: Help Write a Notice.

Instruction: I now need to publish a notice in 3 running group chats, reminding everyone - this week's online running activity "First 20 Kilometers of Autumn" will start precisely at 9 AM on Saturday; check the weather in advance and prepare appropriate protection; pay attention to electrolyte replenishment and carry supplies; turn on the running app for tracking, and send a screenshot to the group after finishing. While sending the notice, I also want to encourage everyone that there is no time limit, no requirement to run the entire distance in one go, with participation being the key. Please help me draft it.

First, GPT-4o definitely deserves a big thumbs up: it produced several versions that could be used as-is. As the underlined parts of the screenshot show, witty copywriting is everywhere, yet never annoying.

Grok 3 responded quickly with a notice that was almost directly usable and even mentioned "energy gel/small snacks". The only shortcoming is that it did not specify the exact date. Grok 4 thought a little longer and gave an answer almost identical to Grok 3's, but filled in the precise date.

GPT-5 also responded quickly, but here one can understand what Plus users mean by "cold": it filled in almost no details of its own, such as the date or specific supplies, and simply listed the points from the instruction. The encouragement also feels "insincere".

GPT-5 Thinking performed quite impressively. It not only thought faster than Grok 4 (which was trying hard) but also added more detail, used a clearer structure, and even thoughtfully provided a "concise version for easy forwarding".

Still, the same problem remains: it is terse where it does not need to be.

For instance, Grok 4's ending encouragement is cute: "Whether you run the full course, half course, or just a few kilometers slowly, participation is victory! Run in autumn, feel the cool breeze, and welcome a stronger self together!"

But GPT-5 Thinking just says: "See you Saturday, and may everyone earn their 'first sense of accomplishment this autumn'!"

[The rest of the text follows the same translation approach]

In the final part, all three models showed an awareness of short-video conventions, choosing to pose questions and invite interaction. However, GPT-5 Thinking's questions were somewhat obscure, while those from GPT-4o and Grok 4 were easier to understand and more emotionally engaging.

Beyond text capabilities, an AI entrepreneur ran an in-depth comparison of the coding abilities of GPT-5 and Claude Opus 4.1, currently the strongest coding model. (Readers not interested in coding capabilities can skip this section.)

Article link: https://composio.dev/blog/openai-gpt-5-vs-claude-opus-4-1-a-coding-comparison

Based on his test conclusions:

• Algorithm Tasks: GPT-5 is faster and consumes fewer tokens (8K vs 79K).

• Web Development: Opus 4.1 matches the Figma design more closely, but uses more tokens (1.4M+ vs 900K for GPT-5).

• Overall Evaluation: GPT-5 is the better partner for daily development (faster and cheaper), using about 90% fewer tokens than Opus 4.1 on algorithm tasks. If design precision is crucial and the budget is sufficient, Opus 4.1 is better.

• Cost Comparison: GPT-5 (thinking mode) cost about $3.50 in total vs $7.58 for Opus 4.1 (thinking + max mode), about 2.2 times as much.

[...]

GPT-5 produced a reliable pipeline: clean preprocessing and reasonable feature engineering; multiple models (logistic regression, random forest, and optional XGBoost with random search); class balancing with SMOTE; selection of the best model by ROC-AUC; and comprehensive evaluation (accuracy, precision, recall, F1). Its explanation was clear and concise.
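To make the evaluation step concrete, here is a minimal sketch of choosing a model by ROC-AUC and then reporting accuracy, precision, recall, and F1. It is not the pipeline GPT-5 generated: it is written in plain Python rather than with a machine learning library, it skips preprocessing, training, and SMOTE entirely, and the labels, scores, and function names are hypothetical values made up for illustration.

```python
def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. This pairwise form is equivalent to the area
    under the ROC curve and is fine for small illustrative inputs.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_report(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical ground-truth labels and predicted probabilities from two models.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
candidates = {
    "logistic_regression": [0.9, 0.4, 0.6, 0.3, 0.2, 0.5, 0.8, 0.1],
    "random_forest": [0.8, 0.3, 0.7, 0.45, 0.55, 0.2, 0.9, 0.1],
}

# Select the model with the highest ROC-AUC, then evaluate it at a 0.5 threshold.
best_name = max(candidates, key=lambda name: roc_auc(y_true, candidates[name]))
best_pred = [1 if s >= 0.5 else 0 for s in candidates[best_name]]
report = classification_report(y_true, best_pred)

print(best_name, report)
```

ROC-AUC is computed from the raw scores, so it does not depend on the 0.5 threshold, which is one reason it is a common selection criterion on imbalanced data; the threshold only matters for the four metrics in the report.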

Real Cost (USD)

• GPT-5 (Thinking Mode): Total approximately $3.50 (web about $2.58, algorithm about $0.03, ML about $0.88). Cheaper than Opus 4.1.

• Opus 4.1 (Thinking + Maximum Mode): Total $7.58 (web about $7.15, algorithm about $0.43).
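The totals and ratios can be rechecked from the itemized figures. The short sketch below uses only numbers quoted in this article; the variable names are illustrative, and no ML figure is listed for Opus 4.1, so its total covers the web and algorithm tasks only.

```python
# Per-task costs in USD, as quoted in the article.
gpt5_costs = {"web": 2.58, "algorithm": 0.03, "ml": 0.88}
opus_costs = {"web": 7.15, "algorithm": 0.43}  # no ML figure is reported for Opus 4.1

gpt5_total = round(sum(gpt5_costs.values()), 2)
opus_total = round(sum(opus_costs.values()), 2)
cost_ratio = opus_total / gpt5_total

# Token usage on the algorithm task: about 8K for GPT-5 vs 79K for Opus 4.1.
algo_token_saving = 1 - 8_000 / 79_000

print(f"GPT-5 total: ${gpt5_total:.2f}")
print(f"Opus 4.1 total: ${opus_total:.2f}")
print(f"Opus 4.1 / GPT-5 cost ratio: {cost_ratio:.2f}")
print(f"GPT-5 token saving on the algorithm task: {algo_token_saving:.1%}")
```

With these inputs the itemized GPT-5 costs sum to $3.49, the Opus 4.1 costs to $7.58, the ratio is about 2.17, and the token saving on the algorithm task is just under 90%.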

Final Conclusion

Both models make good use of large context windows, but their different token usage leads to large differences in cost.

GPT-5 Advantages:

• Uses about 90% fewer tokens on algorithm tasks

• Faster, more suitable for daily work

• Much lower cost for most tasks

Opus 4.1 Advantages:

• Clear step-by-step explanation

• Suitable for learning while coding

• Extremely high design fidelity (close to original Figma)

• In-depth analysis (if budget allows)

If you are a developer, GPT-5 is an efficient partner; if you are after perfect design fidelity, Opus 4.1 is worth it!

This test makes it clear that GPT-5's coding ability has improved significantly: it is not inferior to Claude, and it has a large cost advantage.

Each user's needs differ, as does the model capability they care about most, but from a productivity perspective GPT-5 is indeed powerful, and its many test results do not lie. If OpenAI can gradually shift users' dependence from GPT-4o to GPT-5 and manage the differences in user experience between two models with very different strengths, users could end up with a more powerful tool and partner.

For OpenAI, migrating model capabilities and user mindset at this scale will become part of its moat. In the era of large models, releasing such a major update to such a large user base means facing many unexpected and unprecedented challenges. The user feedback gained along the way can help it better satisfy more users in future model updates.

This article is from the WeChat public account "Facing AI" (ID: faceaibang), author: Hu Run Xiaojinya, authorized and published by 36Kr.
