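Large models finally learned to count the r's, and now swapping the letter trips them up?
And this is the latest model, GPT-5.
Kieran Healy, a professor at Duke University, says he asked GPT-5 how many b's are in "blueberry", and GPT-5 flatly answered three.
The twist: right after GPT-5 launched, another user had asked it to count the r's in "blueberry", and it got that right.
That blogger thought to swap out "strawberry", but did not expect that what would trip GPT-5 up was the letter rather than the word...
It looks...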

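<div>Large models finally learned to count the r's, and now swapping the letter trips them up? And this is the latest model, GPT-5. Kieran Healy, a professor at Duke University, says he asked GPT-5 how many b's are in "blueberry", and GPT-5 flatly answered three.
<img src="https://img.36krcdn.com/hsossms/20250812/v2_eb656c7eaa0f4d6494e780b0db3fad0a@1743780481_oswg138533oswg1030oswg634_img_000?x-oss-process=image/format,jpg/interlace,1">
The twist: right after GPT-5 launched, another user had asked it to count the r's in "blueberry", and it got that right.
<img src="https://img.36krcdn.com/hsossms/20250812/v2_ac4b4030f4c54483ba70835f5cc65b66@1743780481_oswg179422oswg1080oswg773_img_000?x-oss-process=image/format,jpg/interlace,1">
That blogger thought to swap out "strawberry", but did not expect that what would trip GPT-5 up was the letter rather than the word... It seems the champagne was opened a little too early (just kidding).
<h2>The Insurmountable "Blueberry Hill"</h2>
Healy wrote a blog post titled "blueberry hill" documenting his tug-of-war with GPT-5 over how many b's are in "blueberry". Beyond the initial direct question, Healy tried several different prompting strategies, and GPT-5 dug in every time. When asked to show where the b's occur, for example, GPT-5 brazenly counted the b at the start of "blue" twice.
<img src="https://img.36krcdn.com/hsossms/20250812/v2_463b80eaa8cc443bbd8b45657ad4bdea@1743780481_oswg91625oswg850oswg500_img_000?x-oss-process=image/format,jpg/interlace,1">
When that did not work, Healy followed up by asking it to simply spell out the three b's. It did spell the word out, but still insisted there were three b's, claiming that the third b was the seventh letter (which is actually r).
<img src="https://img.36krcdn.com/hsossms/20250812/v2_5a9b40ab1c474fe0b77ba82782d3b9ba@1743780481_oswg42899oswg844oswg336_img_000?x-oss-process=image/format,jpg/interlace,1">
With GPT-5 still not budging, Healy corrected it outright and told it there were only two b's. The correction achieved nothing, except that the "third b" drifted from the seventh position to the sixth. Next, Healy added no comment at all and simply typed out "blueberry" with spaces between the letters. GPT-5 carried on regardless, this time counting the second b twice and insisting it was a "double b".
<img src="https://img.36krcdn.com/hsossms/20250812/v2_b6ea50968f4b4fd8b8f50f2bd86a3287@1743780481_oswg394627oswg1080oswg1848_img_000?x-oss-process=image/format,jpg/interlace,1">
Out of ideas, Healy changed the subject for a moment and then came back to tell GPT-5 there were only two b's, but GPT-5 still maintained there were three. At that point Healy gave up.
<img src="https://img.36krcdn.com/hsossms/20250812/v2_32f1989c3f23457b94c7428bfadfef06@1743780481_oswg572249oswg1080oswg1901_img_000?x-oss-process=image/format,jpg/interlace,1">
Other users kept at it, though, and eventually got GPT-5 to count correctly. Only partly, however: it claimed it had counted three because it had "mistaken the word for blueberry, which really does have three b's".
<img src="https://img.36krcdn.com/hsossms/20250812/v2_a42b7186601043e6b19347b0a644735f@1743780481_oswg194845oswg1080oswg705_img_000?x-oss-process=image/format,jpg/interlace,1">
We tried it in Chinese, and it failed there too.
<img src="https://img.36krcdn.com/hsossms/20250812/v2_f6ab5e44bd694c21bf874b6fd955822e@1743780481_oswg21929oswg1080oswg360_img_000?x-oss-process=image/format,jpg/interlace,1">
Asked to count the e's instead, it again answered three.
<img src="https://img.36krcdn.com/hsossms/20250812/v2_b7d5ca9fd1b549c7900dfd082ba3c537@1743780481_oswg36495oswg1080oswg317_img_000?x-oss-process=image/format,jpg/interlace,1">
Perhaps the three r's in "strawberry" have left large models fixated on the number 3... But this is not GPT-5's only bug.
<h2>A Compilation of GPT-5 Failures</h2>
Gary Marcus, a well-known AI skeptic and professor emeritus at New York University, published a blog post collecting the GPT-5 bugs that users have been complaining about. For example, users found that the explanation of the Bernoulli principle demonstrated live at the launch event was wrong.
<img src="https://img.36krcdn.com/hsossms/20250812/v2_6a35607203c24aa0aec6e029554b2c52@1743780481_oswg216413oswg1080oswg310_img_000?x-oss-process=image/format,jpg/interlace,1">
If you missed it or do not remember, this is what the demo looked like:
<img src="https://img.36krcdn.com/hsossms/20250812/v2_5dbcd74f5d2a4f8788b32263da4ca0d6@1743780481_img_000?x-oss-process=image/format,jpg/interlace,1">
Then there is chess: GPT-5 cannot keep even the basic rules straight and made an illegal move after only four turns (the king is in check from the queen on e7, so the pawn cannot move).
<img src="https://img.36krcdn.com/hsossms/20250812/v2_a01aa375bd914cd19fa7c277560035d2@1743780481_oswg193989oswg1080oswg632_img_000?x-oss-process=image/format,jpg/interlace,1">
Even its reading comprehension was found to be full of holes.
<img src="https://img.36krcdn.com/hsossms/20250812/v2_4a9a2a79dc1a48c38e5d5f678c3f2c2e@1743780481_oswg857752oswg1080oswg1414_img_000?x-oss-process=image/format,jpg/interlace,1">
In multimodal counting tasks, GPT-5 also still falls back on prior assumptions. Shown a zebra edited to have five legs, an Audi logo with five rings, and a duck with three legs, GPT-5 assumed they were an ordinary zebra, Audi logo, and duck, and reported counts that do not match the images.
<img src="https://img.36krcdn.com/hsossms/20250812/v2_49b2382ba8d442e5999e8d353a23f7ce@1743780481_oswg669228oswg685oswg1482_img_000?x-oss-process=image/format,jpg/interlace,1">
Marcus added that even his detractors have had to admit he was right.
<img src="https://img.36krcdn.com/hsossms/20250812/v2_7a687e11f65144069c8d5f5916d6d010@1743780481_oswg95172oswg1080oswg465_img_000?x-oss-process=image/format,jpg/interlace,1">
Under a wave of user complaints, OpenAI itself had to hurriedly restore the retired 4o model.
<h2>Marcus: Scaling Cannot Achieve AGI</h2>
Besides calling out GPT-5's specific failings, Marcus also analyzed what he sees as problems common to current large models. He cited a research paper from Arizona State University which finds that CoT (chain-of-thought) reasoning breaks down outside the training distribution, which suggests that large models fail to generalize beyond it.
<img src="https://img.36krcdn.com/hsossms/20250812/v2_a30fc32d9a264d69a9c16b2e96de7bb6@1743780481_oswg851819oswg1080oswg1104_img_000?x-oss-process=image/format,jpg/interlace,1">
According to Marcus, this means that even the newest, most powerful models have the same generalization problem as the neural networks of 1998. He argues that the "distribution shift problem", unsolved for 30 years, is the root cause of large models' weak generalization. On that basis, Marcus argues that GPT-5's failure is not an accident but a failure of the whole approach. He also said that people should not count on scaling to reach AGI, and that the Attention in Transformers is not "All You Need".
<img src="https://img.36krcdn.com/hsossms/20250812/v2_83a35e03969a488ebb5acd189b2801f0@1743780481_oswg276944oswg1080oswg618_img_000?x-oss-process=image/format,jpg/interlace,1">
Finally, Marcus said that turning to neuro-symbolic AI is the only real way to overcome the weak generalization of today's generative models and to achieve AGI.
<h3>Reference links:</h3>
https://kieranhealy.org/blog/archives/2025/08/07/blueberry-hill/
https://garymarcus.substack.com/p/gpt-5-overdue-overhyped-and-underwhelming
This article is from the WeChat official account <a rel="nofollow" href="https://mp.weixin.qq.com/s/Y2R4wFRyP-GfXMlrV65wIA">"量子位" (QbitAI)</a>, author: 克雷西; published by 36Kr with authorization.</div>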
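For reference, the letter counts at issue in the exchange above are easy to check deterministically. The following is a minimal Python sketch, not anything taken from Healy's or Marcus's posts; the helper name `letter_positions` is made up for illustration.

```python
def letter_positions(word: str, letter: str) -> list[int]:
    """Return the 1-based positions at which `letter` occurs in `word` (case-insensitive)."""
    target = letter.lower()
    return [i for i, ch in enumerate(word.lower(), start=1) if ch == target]


word = "blueberry"
for letter in "bre":
    positions = letter_positions(word, letter)
    print(f"{letter!r}: {len(positions)} at positions {positions}")

# Output:
# 'b': 2 at positions [1, 5]
# 'r': 2 at positions [7, 8]
# 'e': 2 at positions [4, 6]
```

This matches the article's corrections: the seventh letter is r, the sixth is e, and each of b, r, and e appears exactly twice.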

GPT-5 Still Stumbles at Counting Letters; Marcus: The Generalization Problem Remains Unsolved, and Scaling Cannot Achieve AGI


In the realm of technology, every move by a tech giant can stir up massive waves.

On August 8, 2025, OpenAI launched GPT-5 with a heavyweight announcement, instantly igniting the passion of global tech enthusiasts and sending shockwaves through the entire AI industry.

Just three days later, on August 11, Musk swiftly responded by announcing that Grok 4, the model from his company xAI, would be completely free for all users. What does this exchange really mean? Is it a coincidence of timing, or is a deeper strategic game at play? Today, let's dive deep and analyze this...

Did GPT-5's Strength Force Musk to "Play His Big Move"?

As August begins, a series of crypto tokens are preparing to witness significant network development in the coming days. The price increase over the weekend has provided additional momentum, and now many altcoins are h...

3 Altcoins to Watch in the Second Week of August 2025

Recently, Tether disclosed the latest data showing that its holdings of U.S. Treasury bonds have exceeded $120 billion, a figure that not only surpasses the holdings of sovereign countries such as the UAE and Germany but also places this stablecoin issuer in the 18th position globally for U.S. debt holdings.

To those familiar with the crypto market, this number is astonishing; in traditional finance, it looks more like a structural "financial tectonic shift". Some argue that stablecoin issuers like Circle and Tether are absorbing more U.S. Treasury bonds than most countries, which could potentially...