GPT-5 "solved" century-old problems, but the answers came from the Internet. Hassabis: This is embarrassing

The GPT-5 fiasco has left OpenAI red-faced. Everyone thought GPT-5 had cracked ten Erdős problems, only to discover that it had found the answers by searching the existing literature. Hassabis commented: "This is incredibly embarrassing."

The OpenAI team hyped up GPT-5, but the whole thing turned out to be a farce.

Here is what happened.

A few days ago, OpenAI scientist Sebastien Bubeck excitedly shared a post saying that two researchers, working with GPT-5 Pro, had solved 10 "century-old open problems" in a single weekend.

Soon after, OpenAI Vice President of Science Kevin Weil and others joined in and amplified the claim.

However, the truth soon emerged:

These ten problems had already been solved by mathematicians long ago; GPT-5 did not solve them independently. It simply found the answers by searching the published literature online.

The news caused an uproar, and even Google DeepMind CEO Demis Hassabis weighed in, calling the episode embarrassing.

Turing Award winner Yann LeCun also appeared on X to mock OpenAI, saying in effect that it had shot itself in the foot with its own GPT.

The GPT-5 farce

This farce was, in effect, written, directed, and performed by the OpenAI team itself.

Researchers Mark Sellke and Mehtaab Sawhney made it clear in their paper that they never claimed GPT-5 had solved the problems.

Their original post said that, after running thousands of queries through GPT-5, they had found solutions to ten problems listed as open Erdős problems.

The result they reported at the time was that problems 223, 339, 494, 515, 621, 822, 883, 903, 1043, and 1079 were all resolved, with partial progress on another 11 problems.

However, solutions to these ten problems already existed in the literature; the website's maintainer simply had not updated their status.

Portal: https://www.erdosproblems.com/

Thomas Bloom, a researcher at the Royal Society who runs the website erdosproblems.com, was not aware that these solutions existed.

On the website, an "open" status only means that he personally does not know of a solution, not that the problem remains unsolved by the scientific community.

In short, two misunderstandings combined to create the illusion that GPT-5 had solved Erdős problems.

First, the problems were not actually unsolved; the website's maintainer just did not know about the solutions. Second, GPT-5 produced its answers by searching the literature, not by solving the problems itself.

Sebastien Bubeck replied, somewhat awkwardly, that GPT-5 had only found the solutions in the literature, nothing more.

Even so, he argued, that was a real achievement, because he knows how difficult searching the literature can be.

Netizens weigh in: peer review is still needed

Well-known figures in the comment section took front-row seats to watch the drama unfold.

Developer Matt Mazur said the controversy makes one thing clear: everyone should be cautious about any claim that AI has discovered new scientific or mathematical results.

Yuchen Jin, founder of Hyperbolic, said, "There needs to be more peer review on new discoveries in science/mathematics made by AI."

However, some people argue that this is no embarrassment for GPT-5 itself. After all, it performed very well at literature retrieval.

A few days ago, Terence Tao also wrote, "I increasingly feel that if AI is to truly play a role in mathematics, the key may not be to use the most powerful models to solve the most difficult problems."

Such successes do occasionally happen, he noted, especially when people invest large amounts of computing power and expert effort.

But in his view, a more reliable approach is to use mid-level AI tools to help with the tedious, menial work that is unavoidable in research.

Caution about claims of original AI discoveries should always come first, but that does not prevent AI-assisted research from becoming an essential path for the future.

References:

https://x.com/SebastienBubeck/status/1979539604522127746

https://x.com/thomasfbloom/status/1979254235075059732

This article comes from the WeChat public account "Xinzhiyuan", author: Taozi, and is published by 36Kr with authorization.
