ChatGPT cracked a six-year-old problem; Turing Award winner says: "It's too early to celebrate."

36kr
06-08

One of the most scathing criticisms in academia is:

This work is both innovative and excellent.

Unfortunately, the good parts are not novel, and the novel parts are not good.

But Richard Sutton, one of the founders of reinforcement learning, co-author of the textbook "Reinforcement Learning: An Introduction," and Turing Award winner, aimed this joke at the entire field of generative AI.

He said: This assessment applies to most of the AI we are familiar with today.

AI: The good parts aren't novel, and the novel parts aren't good.

Sutton's core argument is extremely concise, so concise it's almost brutal.

Generative AI is essentially supervised learning.

The logic of supervised learning is to show the model a large number of human-created samples so that it can learn to imitate them.

The closer the imitation, the higher the score.

Here's the problem.

When a model generates content strictly according to the training data, the output quality is high because it reproduces things that humans have already validated. But this is not novel. It is simply repackaging things that humans already know in different permutations and combinations.

The quality drops when the model tries to deviate from the training data and generate truly novel content. This is because it lacks any internal mechanism to judge whether "this new thing is good or bad." It only generates, it doesn't evaluate.

This is the structural contradiction:

Novelty and quality, within the framework of purely supervised learning, are two ends of a seesaw.

If you press down one end, the other end will pop up.

It's not an engineering problem. It can't be solved by simply adding more data, scaling up the model, or adding more GPUs.

Sutton made a jarring point: "hallucination," the most criticized flaw of large models, is essentially a byproduct of the model's attempt to be "novel."

Our aversion to hallucinations proves one thing: we don't actually need novelty. We just want high-quality imitation.

"Good things aren't original, and original things aren't good."

The reviewer's scathing comment in that joke turns out to describe the inherent limitations of generative AI accurately.

True "discovery" requires three components.

Starting from first principles, Sutton broke creativity down into a three-part formula:

True discovery = variation + evaluation + selective retention.

Any genuine creativity and discovery requires three steps, none of which can be omitted:

1. Variation: Producing diverse possibilities. It can be random, or it can be based on existing knowledge, but there must be genuine uncertainty; otherwise it's not exploration, it's just a table lookup.

2. Evaluation: Determining which variations are valuable. This requires a clear objective or a standard that can tell "good" from "bad".

3. Selective retention: Preserving valuable variations so that they can influence future actions and learning.

These three steps are not Sutton's invention. They are the logic of natural selection, the logic of the scientific method, and the logic of human learning.

Evolutionary theory: random gene mutation (variation) → environmental selection (evaluation) → survival of the fittest (selective retention).

Scientific method: Hypothesis formulation (variation) → Experimental verification (evaluation) → Publication of papers (selective retention).

Human learning: Trying different solutions (variation) → verifying correctness (evaluation) → memorizing effective methods (selective retention).

Currently, generative AI has completed only the first of the three steps: there is almost no evaluation, let alone selective retention.

It's like a blindfolded archer shooting at random, neither looking at the target nor adjusting their stance after each shot.

You could ask it to shoot ten thousand arrows, and it might hit the target occasionally, but it would never know why it did.
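
As a rough illustration (not Sutton's own code), the gap between the archer who only shoots and the full loop of variation, evaluation, and selective retention can be sketched in a few lines of Python. Everything here is a hypothetical choice made for the sketch: the toy objective `evaluate`, the step size, and the function names. The loop itself is plain random hill climbing, used only as the simplest instance of the three steps.

```python
import random


def evaluate(x: float) -> float:
    """Hypothetical objective: higher is better, with the best value at x = 3."""
    return -(x - 3.0) ** 2


def vary(x: float, rng: random.Random, step: float = 0.5) -> float:
    """Variation: propose a random change to the current candidate."""
    return x + rng.uniform(-step, step)


def generate_only(start: float, trials: int, rng: random.Random) -> float:
    """The archer with covered eyes: keeps varying, never scores, never keeps."""
    x = start
    for _ in range(trials):
        x = vary(x, rng)  # each shot replaces the last, hit or miss
    return x


def discover(start: float, trials: int, rng: random.Random) -> float:
    """Variation + evaluation + selective retention (random hill climbing)."""
    best, best_score = start, evaluate(start)
    for _ in range(trials):
        candidate = vary(best, rng)      # 1. variation
        score = evaluate(candidate)      # 2. evaluation
        if score > best_score:           # 3. selective retention
            best, best_score = candidate, score
    return best


if __name__ == "__main__":
    blind = generate_only(0.0, 2000, random.Random(0))
    found = discover(0.0, 2000, random.Random(0))
    print(f"generate only: x = {blind:.3f}, score = {evaluate(blind):.3f}")
    print(f"full loop:     x = {found:.3f}, score = {evaluate(found):.3f}")
```

Both functions draw the same kind of random variations. The only difference is that `discover` scores each candidate and keeps it only when it improves on the best one so far, so its result can never be worse than the starting point.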

So, are scientists still of any use?

At this point, you might be a little anxious: if AI can really complete the three-step process of "discovery" autonomously in the future, will scientists lose their jobs?

Sutton's own answer was: they cannot be replaced, but their role needs a complete transformation.

In his speech, he said that even AI that can independently prove mathematical theorems still needs humans to tell it which problems are important.

This is not modesty, but a true reflection of the boundaries of our understanding.

Shiqian Ma, a mathematician and optimization scholar at Rice University, said he used ChatGPT to prove the convergence of an algorithm, a problem he had been studying for six years.

There is a sentence in the abstract:

The proof was generated by ChatGPT 5.5 and verified by the author.

https://optimization-online.org/2026/05/convergence-of-bdrs-as-a-matrix-scaling-algorithm/

This algorithm is called BDRS, short for Bregman-Douglas-Rachford Splitting, and is used to solve the optimal transport problem.
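
For readers unfamiliar with the term, the discrete optimal transport problem is commonly written as below. This is the standard textbook formulation, given only for orientation. The symbols are assumptions of this note, not taken from the paper, which may work with a different (for example regularized) variant.

```latex
\min_{P \in \mathbb{R}_{\ge 0}^{m \times n}} \ \langle C, P \rangle
  = \sum_{i=1}^{m} \sum_{j=1}^{n} C_{ij} P_{ij}
\quad \text{subject to} \quad
P \mathbf{1}_n = a, \qquad P^{\top} \mathbf{1}_m = b
```

Here $a \in \mathbb{R}^m$ and $b \in \mathbb{R}^n$ are the source and target distributions, $C_{ij}$ is the cost of moving one unit of mass from source point $i$ to target point $j$, and $P$ is the transport plan being optimized.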

Paper Title: Bregman Douglas-Rachford Splitting Method

Preprint available at: https://arxiv.org/abs/2509.08739

This is something he and his co-authors designed themselves. What has troubled him for six years is its convergence proof, that is, the most rigorous mathematical explanation of "why it is correct".

The preprint platform arXiv has kept the manuscript on hold ever since it was submitted.

He speculated that the reason was the word "ChatGPT" in the abstract, and that the platform did not know how to handle such papers.

But can humans be replaced by AI?

His answer was: No. He stated bluntly:

I don't think AI can creatively come up with such an algorithm and claim, "This is an efficient algorithm for optimal transport; now let me try to prove its convergence."

Without human guidance, AI has no idea which problem to solve.

This statement corresponds precisely to Sutton's: the problem itself must be defined by humans.

It took him six years to "ask the right questions":

To ask the right questions, you actually need to have a very deep understanding of the topic.

In this case, I have been studying this problem for six years, so I know exactly where the difficulties lie.

These six years were not a waste; they were a prerequisite.

It was during these six years that he learned where the proof was stuck, why all the previous approaches had failed, and which of the directions ChatGPT suggested were worth pursuing and which were hallucinations.

And it wasn't a single prompt; it took five months. This is the most easily misunderstood part, and he himself had misunderstood it before:

From January to May, a full five months and countless conversations, every prompt brought us closer to that proof.

His summary was extremely clear:

The essence of research hasn't changed; it's still about trial and error. What has changed is the speed of each trial: in the past, it took weeks to validate a direction, but now it takes only minutes to know whether a path is viable.

But AI's contributions are undeniable:

Then comes the ending, which turns the piece into an instant classic:

Returning to my paper on the convergence of BDRS, I am fairly certain that the proof is correct.

But if you find any errors, I will take full responsibility. Please don't blame ChatGPT; it's only 3.5 years old.

The brilliance of this statement lies in its duality: it is both a sincere declaration of responsibility and a precise metaphor.

"3.5 years old" describes the current reality of AI: amazing capabilities, but immature judgment.

After all, humans have never expected a 3.5-year-old child to make any contribution.

While you can't hand over the final signing authority of the proof to AI, you also can't pretend that AI didn't make any contribution.

This is why genuine scientific discovery will not slip out of human hands.

Instead, AI will be more ruthless in sorting humans: only those who can ask good questions are fit to wield powerful AI.

In the future, a scientist without AI may be as obsolete as an astronomer without a computer.

Finally, let's revisit Sutton's rather manifesto-like words:

If we want to fully leverage the power of AI scientists, we should share goals with them, enabling them to create, evaluate, and discover, thus fully participating in achieving those goals.

Let's be bold! Let's fully automate creativity and discovery!

References:

https://x.com/RichardSSutton/status/2061216087744946656

https://optimization-online.org/2026/05/convergence-of-bdrs-as-a-matrix-scaling-algorithm/

This article is from the WeChat public account "New Zhiyuan" , author: ASI Revelation, editor: David, and published with authorization from 36Kr.
