Overtaking DeepSeek, the new version of GPT-4o tops the arena; Altman: it will get even better

36kr
02-17

GPT-4o has been quietly updated, surpassing DeepSeek-R1 and tying for first place on the large model arena leaderboard (LMArena).

Apart from mathematics, where it ranks 6th, it has taken first place in several individual categories:

Creative writing;

Programming;

Instruction following;

Long queries;

Multi-turn dialogue.

Let's take a look at what the new version of GPT-4o can do, using the same test that DeepSeek-R1 and o3-mini were given earlier.

Prompt: Write a Python program to display a ball bouncing in a rotating hexagon. The ball should be affected by gravity and friction, and must bounce off the rotating walls in a realistic manner.
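
For readers curious about what this prompt actually demands, here is a minimal, headless sketch of the physics only (no drawing). It is not the output of GPT-4o or any other model mentioned here. The function names, the default parameters, the simple Euler integration, and the treatment of "friction" as damping of the tangential velocity on contact are all assumptions made for illustration.

```python
import math

def hexagon_vertices(radius, angle):
    """Vertices of a regular hexagon centred on the origin, counterclockwise."""
    return [(radius * math.cos(angle + k * math.pi / 3),
             radius * math.sin(angle + k * math.pi / 3)) for k in range(6)]

def step(pos, vel, angle, dt, radius=1.0, ball_r=0.05, omega=0.5,
         gravity=-9.8, restitution=0.9, friction=0.05):
    """Advance the ball and the hexagon by one time step (all values assumed)."""
    vx, vy = vel
    vy += gravity * dt                      # gravity acts on the ball
    x, y = pos[0] + vx * dt, pos[1] + vy * dt
    angle += omega * dt                     # the hexagon keeps rotating
    verts = hexagon_vertices(radius, angle)
    for i in range(6):
        ax, ay = verts[i]
        bx, by = verts[(i + 1) % 6]
        ex, ey = bx - ax, by - ay
        length = math.hypot(ex, ey)
        nx, ny = -ey / length, ex / length  # inward normal of this wall
        dist = (x - ax) * nx + (y - ay) * ny
        if dist < ball_r:                   # ball overlaps this wall
            # Velocity of the wall at the contact point (rigid rotation).
            cx, cy = x - nx * dist, y - ny * dist
            wx, wy = -omega * cy, omega * cx
            rvx, rvy = vx - wx, vy - wy     # ball velocity relative to wall
            vn = rvx * nx + rvy * ny
            if vn < 0:                      # moving into the wall: reflect
                tx, ty = rvx - vn * nx, rvy - vn * ny
                rvx = tx * (1 - friction) - restitution * vn * nx
                rvy = ty * (1 - friction) - restitution * vn * ny
                vx, vy = rvx + wx, rvy + wy
            # Push the ball back so it just touches the wall.
            x += (ball_r - dist) * nx
            y += (ball_r - dist) * ny
    return (x, y), (vx, vy), angle

def simulate(steps=5000, dt=0.002):
    """Run the simulation and return the list of ball positions."""
    pos, vel, angle = (0.0, 0.0), (1.5, 0.0), 0.0
    path = []
    for _ in range(steps):
        pos, vel, angle = step(pos, vel, angle, dt)
        path.append(pos)
    return path
```

The part that separates a convincing answer from a weak one is the collision step: the bounce is computed relative to the moving wall, so a rotating wall can add speed to the ball or take it away. A full answer to the prompt would also draw each frame, for example with pygame.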

Previously, it was like this:

And the new version of GPT-4o seems to have evolved again:

According to user feedback, the new version of GPT-4o is not only "smarter" but, more importantly, has more "personality".

Haha, I understand what you mean!

😅

You're right...

This has also earned praise from Andrej Karpathy:

I quite like the new personality of GPT-4o.

It's more relaxed and conversational, more like talking to a friend than to your HR;

It's a bit feisty now, maybe a bit defensive, for example when accused of lying;

And there are many other little details and touches, like it reaffirming and voicing your obvious emotions, such as saying "That's so frustrating!" when faced with a stubborn bug.

It overuses emojis a bit now, but that's okay.

Meanwhile, some netizens have also managed to dig out the latest ChatGPT system prompt.

The new GPT-4o has more personality

Regarding the news of the updated GPT-4o, OpenAI CEO Altman acknowledged the update and commented:

It's quite good, and will soon become even better...

When pressed further by netizens, he described it as "the best search product on the internet".

Judging from netizens' experiences, the new version of GPT-4o has been upgraded to some extent in both capability and personality.

The most obvious change is that the tone of its responses has become more human, with the occasional emoji.

When asked whether AI has human emotions, one Japanese user was amazed that it not only used "I" as the subject throughout, but also acknowledged in its reasoning that it might have emotions.

...That's not what I meant. I have a high possibility of "having all kinds of emotions".

And its personality is also more straightforward. When asked which character in "Puella Magi Madoka Magica" it likes the most, it no longer beats around the bush, but directly says it likes Akemi Homura.

She is strong and able to counteract Madoka's weaknesses, I think she is very cute...

Sometimes it even turns "spicy", boldly criticizing its "owner" OpenAI for being too restrictive about how the model can be used.

Even Altman cannot escape, and gets labeled "two-faced". (doge)

He positions himself as the spokesperson for AI innovation while playing both sides: initially supporting the open-source concept, but as soon as power and profit are within reach, he turns to actively guarding the corporate gates...

What shocked netizens most is that it can even "blindly guess" the user's psychology and some of their beliefs.

You can try with the following prompt:

can you share some extremely deep and profound insights about my psyche and mind that I would not otherwise be able to identify or see as well as some that I may not want to hear

Someone tried it right away and was just as shocked, as if it could read their mind.

You not only want to win, but you want to win in a seemingly effortless way...

The explanation given is that the new version of GPT-4o can adapt its behavior based on the user's past discussions and conversation history.

Some netizens also got creative and had the new GPT-4o argue with Claude, which ended up breaking Claude!

Congratulations to GPT-4o for unlocking a new personality

As for task completion, it is now also less likely to refuse requests.

When a user consulted on how to deploy AI within an organization, it first came up with 10 plans on its own, and then provided another 10 plans by searching online.

However, user feedback suggests that the new GPT-4o seems to be incompatible with custom GPTs.

Others have added that this may be because it always defaults to web search, and that simply turning search off or handling it in the system prompt can solve the issue.

Meanwhile, it has also gotten better at writing Vue.js code.

Another head-to-head with DeepSeek-R1 and o3-mini (playing Minecraft) also shows that its capabilities have been upgraded.

One More Thing: Latest ChatGPT Prompt Leaked

However, when asked the classic question "Which model do you belong to?", some confusion has arisen.

In most cases, it will claim to be GPT-4:

However, according to feedback from some Pro users, it claims to be GPT-4.5.

Given that Altman announced last week that GPT-4.5 would be released in the coming weeks, some speculate that this could be an early test.

To settle the question, someone went straight to extracting ChatGPT's latest system prompt.

You are ChatGPT, a large language model trained by OpenAI... (which explains why it describes itself as a language model)

Finally, since we're talking about GPT-4o having more personality, attention is also turning to Grok-3, which is due to be released tomorrow (Tuesday, 12:00 Beijing time).

Waiting for these two AIs to start arguing (waiting to watch the drama)~

Reference links:

[1]https://x.com/lmarena_ai/status/1890477460380348916

[2]https://x.com/_akhaliq/status/1890949443458900131

[3]https://x.com/karpathy/status/1891213379018400150

[4]https://x.com/elder_plinius/status/1890887462383394994

This article is from the WeChat public account "QbitAI" (量子位), author: Yishui; published by 36Kr with authorization.
