GPT-5 cracks a key step on the "quantum version of NP": a first-of-its-kind paper stirs up academia, compressing 1-2 weeks of human work into 30 minutes

GPT-5 is rewriting the rules of scientific discovery! A new paper reports that GPT-5 supplied the key step of a proof about QMA, the quantum analogue of NP, in about 30 minutes of interaction, a task estimated to take humans 1-2 weeks. At this rate of progress, AI may not be far from contributing to Nobel Prize-level breakthroughs.

A few days ago, GPT-5 successfully passed the "Gödel test" and cracked three major mathematical conjectures.

Now GPT-5 has also made headway on a hard problem in quantum computing.

Quantum computing expert Scott Aaronson has published his first paper in which a key step of the proof came from GPT-5.

The paper tackles a core topic in quantum computing: the complexity class QMA, often described as the "quantum version of NP."

The key question is how far the error probability of a QMA protocol can be reduced and, in particular, whether perfect completeness can be achieved.

Paper address: https://arxiv.org/pdf/2509.21131

Earlier work had already pushed the completeness error down to a doubly exponentially small level. The new paper shows that this "double exponential" error is the limit of black-box methods and cannot be improved further with them.

After getting stuck at a key step of the derivation, the author turned to GPT-5 for help. At first, the AI offered incorrect ideas.

But after about 30 minutes of interaction, it came up with a clever mathematical function that captured the eigenvalue behavior precisely.

That idea became the key technical step in the paper.

In his latest blog post, Scott expressed his amazement, “If any student came up with this idea, I would definitely praise him – it’s absolutely amazing!”

By Scott's estimate, the problem would otherwise have taken 1-2 weeks of human effort.

OpenAI scientist Sebastien Bubeck and product manager Kevin excitedly reposted it, saying that "a major change has begun."

The quantum version of NP-problems: QMA singularity

The paper, submitted to arXiv on September 25, studies the limits of black-box amplification for the quantum complexity class QMA.

So, what is QMA?

QMA, or Quantum Merlin Arthur, can be seen as a typical quantum version of NP.

It involves a class of decision problems:

If the answer is "yes", Merlin can send Arthur a quantum witness state that Arthur accepts with probability at least 2/3 after a polynomial-time quantum computation;

If the answer is "no", no matter what witness state Merlin sends, the probability that Arthur accepts it is at most 1/3.

Here, as is common in complexity theory, the constants 2/3 and 1/3 are just conventions; through amplification they can be replaced by, say, 1-2⁻ⁿ and 2⁻ⁿ.
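To see why the constants are only conventions, here is a minimal sketch of the arithmetic behind amplification by repetition and majority vote. It assumes independent runs that are each correct with probability 2/3; the function name `majority_error` is ours, not from the paper. In the quantum setting Merlin's witness copies may be entangled, so the real argument needs more care; this covers only the counting.

```python
from math import comb

def majority_error(p_correct: float, k: int) -> float:
    """Probability that a majority vote over k independent runs is wrong,
    when each run is correct with probability p_correct (k should be odd)."""
    q = 1.0 - p_correct
    # The vote fails when more than half of the k runs are wrong.
    return sum(comb(k, j) * q**j * p_correct**(k - j)
               for j in range(k // 2 + 1, k + 1))

if __name__ == "__main__":
    # The error shrinks exponentially in the number of repetitions.
    for k in (1, 11, 51, 101):
        print(k, majority_error(2 / 3, k))
```

A polynomial number of repetitions is enough to reach an error of about 2⁻ⁿ, which is why 2/3 and 1/3 carry no special meaning.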

A long-standing question in this field is:

Is QMA equal to QMA₁, the subclass of QMA whose protocols have "perfect completeness", meaning that in the "yes" case Arthur accepts with probability exactly 1?
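The two classes in this question can be written side by side. The notation here (verifier A, instance sets L_yes and L_no, thresholds c and s) is our own shorthand for the definition given above, not taken from the paper.

```latex
\begin{align*}
\textbf{QMA:}\quad
  & x \in L_{\mathrm{yes}} \;\Rightarrow\; \exists\,|\psi\rangle:\
    \Pr[A(x,|\psi\rangle)\ \text{accepts}] \ge c, \\
  & x \in L_{\mathrm{no}} \;\Rightarrow\; \forall\,|\psi\rangle:\
    \Pr[A(x,|\psi\rangle)\ \text{accepts}] \le s, \\
  & \text{with } c = 2/3,\ s = 1/3. \\[4pt]
\textbf{QMA}_1\textbf{:}\quad
  & \text{the same conditions with } c = 1,
    \text{ so the completeness error } 1 - c \text{ is exactly } 0.
\end{align*}
```

In these terms, amplification makes the completeness error 1 - c smaller and smaller, while QMA = QMA₁ would require driving it all the way to zero.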

In 2008, Scott Aaronson used real analysis to prove that there exists a "quantum oracle" relative to which QMA≠QMA₁.

This means that any proof of QMA=QMA₁ would have to use "quantumly nonrelativizing techniques."

This does not mean that the obstacle is insurmountable, but it does illustrate how hard the problem is.

Breakthrough: Double exponential amplification limitations

It wasn't until June this year that Freek Witteveen and Stacey Jeffery published a major paper proving that any QMA protocol can be amplified in a black-box manner so that its completeness error becomes doubly exponentially small, that is, 1/exp(exp(n)).

Paper address: https://arxiv.org/pdf/2506.15551

They took an approach that Scott had never considered: encoding the acceptance probability in the amplitudes of a quantum state, with those amplitudes decreasing in a geometric series.

It turns out that QMA, an "old friend" of 25 years, can still spring surprises.

In an online meeting in August, Scott asked:

Is this doubly exponential completeness error the limit of black-box techniques, or can it be amplified further to a triply exponential level, that is, 1/exp(exp(exp(n)))?
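To get a feel for the scales in this question, the sketch below prints how many bits of accuracy each level corresponds to. As a simplifying assumption it uses base 2 in place of exp, and the helper name `neg_log2_error` is hypothetical.

```python
def neg_log2_error(n: int, levels: int) -> int:
    """Return -log2(error) for an error of 2^-n (levels=1),
    2^-(2^n) (levels=2) or 2^-(2^(2^n)) (levels=3)."""
    exponent = n
    for _ in range(levels - 1):
        exponent = 2 ** exponent  # climb one level of the exponential tower
    return exponent

if __name__ == "__main__":
    for n in (2, 3, 4):
        single, double, triple = (neg_log2_error(n, k) for k in (1, 2, 3))
        print(f"n={n}: single 2^-{single}, double 2^-{double}, triple 2^-{triple}")
```

Even at n=4 the triply exponential error is already 2 to the power of -65536, which shows how demanding the question is.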

GPT-5 cracks it in 30 minutes and earns high marks

A week later, Scott and Freek teamed up to write a complete proof showing that, with black-box techniques, a doubly exponentially small completeness error is the best possible.

In other words, they made the 2008 “QMA≠QMA₁” oracle separation quantitative, and the resulting “lower bound” exactly matches the protocol from the June paper.

Perhaps the most compelling part of this research isn’t quantum complexity itself, but the role of AI in it.

As mentioned earlier, this is Scott Aaronson's first paper in which a key technical step in the proof of its main results comes from AI.

Specifically, it came from GPT5-Thinking.

The problem the authors faced was to analyze an N×N Hermitian matrix E(θ) (with, say, N=2ⁿ), each entry of which is a trigonometric polynomial of degree poly(n) in the real parameter θ.

They needed to study the largest eigenvalue of E(θ) as θ varies from 0 to 1, and to show that λₘₐₓ(E(θ)) cannot start out close to 0 and then "linger" extremely close to 1 for a long time, for example within 1/exp(exp(exp(n))) of 1.

Given 1-2 weeks to work through the literature, Scott and his co-author could probably have solved this themselves.

Instead, he asked GPT5-Thinking, and five minutes later it gave a confident but plainly wrong answer.

Scott didn't laugh at the AI, but told it where it was wrong. After thinking for a moment, GPT5-Thinking tried again and came up with a better solution.

And so, after several rounds of back-and-forth, much like an exchange with a graduate student or colleague, GPT-5 came up with the following function, as given in Scott's blog post:

Tr[(I − E(θ))⁻¹] = Σᵢ 1/(1 − λᵢ(θ)), where the λᵢ(θ) are the eigenvalues of E(θ).

It correctly pointed out that this is a rational function of θ of controllable degree, and that it encodes exactly the information about how close the largest eigenvalue λₘₐₓ(E(θ)) is to 1.
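One function with the properties described here is the trace of the resolvent, Tr[(I − E)⁻¹], which is the form we understand Aaronson's blog post to give. It equals the sum of 1/(1 − λᵢ) over the eigenvalues λᵢ of E, so it blows up exactly when the largest eigenvalue approaches 1. The NumPy sketch below checks this on a small fixed matrix. The matrix, its spectrum, and the helper names are illustrative assumptions; this is not the paper's E(θ).

```python
import numpy as np

def resolvent_trace(E: np.ndarray) -> float:
    """Tr[(I - E)^{-1}] for a Hermitian E whose eigenvalues are all below 1."""
    n = E.shape[0]
    return float(np.trace(np.linalg.inv(np.eye(n) - E)).real)

def hermitian_with_spectrum(eigs: np.ndarray, seed: int = 0) -> np.ndarray:
    """Build a Hermitian matrix with the given eigenvalues (illustrative helper)."""
    rng = np.random.default_rng(seed)
    n = len(eigs)
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    Q, _ = np.linalg.qr(A)            # Q is unitary
    return Q @ np.diag(eigs) @ Q.conj().T

eigs = np.array([0.1, 0.4, 0.7, 0.999])   # largest eigenvalue is close to 1
E = hermitian_with_spectrum(eigs)

trace_value = resolvent_trace(E)
lam = np.linalg.eigvalsh(E)                # eigenvalues recovered numerically
eigen_sum = float(np.sum(1.0 / (1.0 - lam)))
gap = 1.0 - float(lam.max())               # distance of the top eigenvalue from 1

if __name__ == "__main__":
    print("Tr[(I-E)^-1]       :", trace_value)
    print("sum of 1/(1-lam_i) :", eigen_sum)
    print("1/gap and N/gap    :", 1.0 / gap, len(lam) / gap)
```

Because every term is positive, the trace is sandwiched between 1/gap and N/gap, so a bound on the trace translates directly into a bound on how close λₘₐₓ can get to 1.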

Fortunately, the approach worked, and the authors could easily verify it without the AI's help.

Scott suspects that GPT5 may have seen a similar construction somewhere in its training data, but says that if a student had proposed it, he would not have hesitated to call it "clever."

Finally, he recalled that a year ago, he had tried to solve a similar problem with the GPT reasoning model at the time, and the results were far from satisfactory.

Now, in September 2025, he says plainly that AI has begun to reach what he considers the most quintessentially human of intellectual activities: proving oracle separations between quantum complexity classes.

It cannot yet write an entire research paper on its own, but if you know what you are doing, it can help you get unstuck, which makes for an excellent use case.

Who knows how long this situation will last?

Scott Aaronson joked, "Thinking of this, I can't help but feel fortunate that I still have a stable job: a tenured position."

References:

https://scottaaronson.blog/?p=9183

https://x.com/SebastienBubeck/status/1972368891239375078

https://x.com/kimmonismus/status/1972399015825203463

This article comes from the WeChat public account "Xinzhiyuan", author: Xinzhiyuan, editor: Taozi, and is published by 36Kr with authorization.
