On February 28th, OpenAI made a big move, launching GPT-4.5, billed as "the largest, most knowledgeable, and most expensive AI large model in history." CEO Sam Altman praised it highly on Twitter, calling it the "best, most thoughtful model" he has ever chatted with, and saying "this thing makes me feel like I'm talking to a person for the first time!"
However, the launch itself was full of drama: Altman praised GPT-4.5 effusively online yet was absent from the launch event, because he had just become a new father and was at the hospital, fully occupied with caring for his baby.
From the near-deification of GPT-4 two years ago to the debut of GPT-4.5 today, what can this new AI large model bring us?
This time it's about "human touch"
I don't need to say much about the expectations for GPT. 2 years ago, GPT-4 became a sensation, and everyone who used it said: "Wow."
However, time flies, and before we knew it GPT-4.5 has had its launch event, yet that "mind-blowing" feeling seems to be gone.
In my view, GPT-4.5's performance has not lived up to expectations.
Although it is claimed that GPT-4.5 used 10 times the computing power of GPT-4o, overall, we don't see a huge improvement.
Even its name is very fitting, with only about a "half-generation" of improvement.
Experts online have run the classic physics-simulation tests commonly used on AI models (such as a small ball bouncing inside a larger shape), and its performance was actually decent: the ball moved quickly and never escaped the larger boundary.
However, in terms of reasoning ability, although GPT-4.5 improves slightly over 4o overall, it is clearly behind OpenAI o3-mini across the board on GPQA (science), AIME '24 (math), and SWE-Bench Verified (programming).
So this time, the main advantage of GPT-4.5 is what OpenAI calls: human touch.
Where does the human touch of GPT-4.5 manifest?
Before truly demonstrating GPT-4.5, OpenAI first showed us the evolutionary process from GPT-1 to GPT-4.5, which was very interesting. They asked a common-sense question: Why is the ocean salty?
GPT-1's answer was like this, and you can see that it didn't even know what it was saying.
GPT-2 and GPT-3.5 started to make sense; you could begin to follow what they were saying.
GPT-4 has the response rhythm we are most familiar with: logic and evidence, but the delivery is stiff, not at all like a person.
By GPT-4.5, you'll find the answer is not much different from GPT-4's, which also shows that its reasoning and logical abilities haven't changed much.
The biggest change is its tone. On the one hand, it speaks more concisely and colloquially; on the other, it even uses exclamation marks, which gives its language an emotional color.
To best demonstrate GPT-4.5's emotional intelligence, you need to ask it some emotionally-charged questions, such as: I feel very sad after failing an exam.
You can see that GPT-4o's response is really lacking in empathy: just pure logical analysis, with a strong sense of rigidity.
But GPT-4.5 will consider the person's emotions, not only comforting, but also building confidence, telling you "it's not a problem with your abilities," and then providing a solution to help you shift your attention so you don't feel so sad.
Even more interestingly, someone reasoned that such high emotional intelligence might mean greater potential in humanities-related areas, and indeed found it much stronger than GPT-4o at music recommendations.
Perhaps it's because music requires more sensory appreciation, rather than straightforward logical reasoning, which is exactly what GPT-4.5 excels at.
Compared with the "smart brain" everyone expected in the past, this GPT-4.5 is no longer a "question-answering robot" but an "emotional big sister": intelligence intact, emotional intelligence through the roof, ready to provide emotional value at any time.
DeepSeek Beats GPT-4.5
Of course, when it comes to emotional intelligence, DeepSeek can't be left out. Remember, when DeepSeek first took off, it wasn't only because of its low price but also because of its "human touch." The most famous example is this chat screenshot:
Many people said at the time that DeepSeek had "come alive," and it often even knew how to reply with internet memes. So how does it compare to the new GPT-4.5? I asked it the same failed-exam question:
This old fox found that DeepSeek's response was also solid, almost identical to GPT-4.5's: comfort first, then confidence-building, then a solution. So our earlier sense that DeepSeek has high emotional intelligence was no illusion; it genuinely does, and it can go head-to-head with GPT-4.5.
But judging capability while ignoring cost would be playing dirty (and it's not as if GPT-4.5's capabilities are that impressive anyway). Many people who first saw GPT-4.5's price wondered whether it was a misprint or their eyes were playing tricks on them.
GPT-4.5's API pricing is indeed exorbitant: $75 per million input tokens and $150 per million output tokens, a full 30 times the price of GPT-4o. Its competitor Claude 3.7 charges just $3 per million input tokens and $15 per million output tokens, which makes GPT-4.5 some 10 to 25 times more expensive than its international rival.
According to calculations in the tech circle, if you ask a question of several tens of Chinese characters and get a 3,000-4,000 word answer, the price would be around 60 yuan.
Perhaps this is what OpenAI wants to tell us this time: the most valuable thing in today's world is emotional value, and a high-EQ answer can cost dozens of times more than one from the "straight-man" GPT-4o.
But when DeepSeek is brought into the picture, how does GPT answer for itself? DeepSeek V3 currently charges 2 yuan (about $0.27) per million input tokens and 8 yuan (about $1.10) per million output tokens.
That makes GPT-4.5 roughly 278 times more expensive on input and about 136 times on output. With comparable capabilities but a price this steep, what is OpenAI's justification?
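For a rough sense of the gap, the quoted prices can be compared directly. Here is a minimal back-of-the-envelope sketch in Python, using only the per-million-token figures cited above (prices may have changed since publication):

```python
# Back-of-the-envelope price comparison, using the per-million-token
# USD figures quoted in this article; actual prices may have changed.
PRICES = {
    "GPT-4.5":     {"input": 75.00, "output": 150.00},
    "Claude 3.7":  {"input": 3.00,  "output": 15.00},
    "DeepSeek V3": {"input": 0.27,  "output": 1.10},  # converted from 2 / 8 yuan
}

def ratios_vs(baseline: str) -> dict:
    """How many times more expensive `baseline` is than each other model."""
    base = PRICES[baseline]
    return {
        model: {k: base[k] / p[k] for k in ("input", "output")}
        for model, p in PRICES.items()
        if model != baseline
    }

for model, r in ratios_vs("GPT-4.5").items():
    print(f"GPT-4.5 vs {model}: "
          f"{r['input']:.0f}x input, {r['output']:.0f}x output")
```

Run as-is, the input-side gap against DeepSeek V3 comes out to nearly 280x; against Claude 3.7 it is 25x on input and 10x on output.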
Is training hitting a bottleneck?
Grok 3 and GPT-4.5 were released in quick succession recently, perhaps in response to the emergence of DeepSeek, and both launches carry a whiff of being forced out.
Take Grok 3: Musk called it "the world's smartest AI large model," yet it caused no real sensation. And the newly released GPT-4.5? It has improved in "emotional intelligence," but its performance still falls short of expectations. Keep in mind that OpenAI has long been the industry leader, yet this time the result is underwhelming.
Perhaps the AI path we are familiar with, burning money on GPUs to brute-force computing power upward, is entering a bottleneck period.
The GPT-4.5 project was planned long ago, yet it took two years to ship. Very likely the training runs along the way never produced the desired results, and only now, under threat from DeepSeek, was it hurriedly brought out.
As early as February 19, Sam Altman had already announced that they had reached the level of 4.5, so this release was actually planned in advance.
But he also said at the time that reaching GPT-5.5 later would require increasing computing power by another 100 times.
That is 100 times the computing power; the stacked GPUs might rival Mount Everest. And even setting the GPU count aside, AI already consumes around 4% of all US electricity. Multiply that by 100 and you would need roughly four Americas' worth of power. Is that even possible?
Currently there are two main paths for large AI models: the overseas route of burning money to stack computing power, and the DeepSeek route of leaning on algorithms such as reinforcement learning. Perhaps what we should watch now is whether DeepSeek R2 can deliver a major performance breakthrough; if it can, the path we are on may well be the right one.
References:
Zhihu, X, Facebook, YouTube, Bilibili, Sina Weibo
This article is from the WeChat public account "Tech Fox" (ID: kejihutv), author: Old Fox, authorized for release by 36Kr.