OpenAI today announced an improved version of its most capable artificial intelligence model to date—one that takes even more time to deliberate over questions—just a day after Google announced its fi...

<a href="https://www.wired.com/tag/openai/" rel="nofollow">OpenAI</a> today announced an improved version of its most capable <a href="https://www.wired.com/tag/artificial-intelligence/" rel="nofollow">artificial intelligence</a> model to date—one that takes even more time to deliberate over questions—just a day after <a href="https://www.wired.com/tag/google/" rel="nofollow">Google</a> announced its first model of this type.OpenAI’s new model, called o3, replaces o1, which the company <a href="https://www.wired.com/story/openai-o1-strawberry-problem-reasoning/%5D" rel="nofollow">introduced in September</a>. Like o1, the new model spends time ruminating over a problem in order to deliver better answers to questions that require step-by-step logical reasoning. (OpenAI chose to skip the “o2” moniker because it's already the name of a mobile carrier in the UK.)“We view this as the beginning of the next phase of AI,” said OpenAI CEO Sam Altman on a livestream Friday. “Where you can use these models to do increasingly complex tasks that require a lot of reasoning.”The o3 model scores much higher on several measures than its predecessor, OpenAI says, including ones that measure complex coding-related skills and advanced math and science competency. It is three times better than o1 at answering questions posed by <a href="https://arcprize.org/" rel="nofollow">ARC-AGI</a>, a benchmark designed to test an AI models’ ability to reason over extremely difficult mathematical and logic problems they’re encountering for the first time.Google is pursuing a similar line of research. Noam Shazeer, a Google researcher, yesterday <a href="https://x.com/NoamShazeer/status/1869789881637200228" rel="nofollow">revealed in a post on X</a> that the company has developed its own reasoning model, called Gemini 2.0 Flash Thinking. Google’s CEO, Sundar Pichai, called it “our most thoughtful model yet” in his <a href="https://x.com/sundarpichai/status/1869792088356991253" rel="nofollow">own post</a>.The two dueling models show competition between OpenAI and Google to be fiercer than ever. It is crucial for OpenAI to demonstrate that it can keep making advances as it seeks to attract more investment and build a profitable business. Google is meanwhile desperate to show that it remains at the forefront of AI research.The new models also show how AI companies are increasingly looking beyond simply scaling up AI models in order to wring greater intelligence out of them.OpenAI says there are two versions of the new model, o3 and o3-mini. The company is not making the models publicly available yet but says it will invite outsiders to apply to perform testing of them. OpenAI today also revealed more details of techniques used to align o1. This involves having the model reason about the nature of the request it is given to interrogate whether it may contravene its guardrails.Large language models can answer many questions remarkably well, but they often stumble when asked to solve puzzles that require basic math or logic. OpenAI’s o1 incorporates training on step-by-step problem-solving that makes an AI model better able to tackle these types of problems.Models that reason over problems will also be important as companies seek to deploy so-called AI agents that can reliably figure out how to solve complex problems on a users’ behalf. The o3 model is 20 percent better than o1 at a SWE-Bench, a test that measures a models’ agentic abilities.“This really signifies that we are really climbing the frontier of utility,” Mark Chen, senior vice president of research at OpenAI said on today’s livestream.“This model is incredible at programming,” Atlman added.While a true breakthrough moment has eluded tech giants at the end of the year, the pace of AI announcements has been dizzying of late.Early this month <a href="https://www.wired.com/story/google-gemini-2-ai-assistant-release/" rel="nofollow">Google announced</a> a new version of its flagship model, called Gemini 2.0, and demonstrated it as a web browsing helper and as an assistant that sees the world through a smartphone or a pair of smart glasses.OpenAI has made numerous announcements in the run up to Christmas, including a new version of its video-generating model, a free version of its ChatGPT-powered search engine, and a way to access ChatGPT over the phone by calling <a href="https://help.openai.com/en/articles/10193193-1-800-chatgpt-calling-and-messaging-chatgpt-with-your-phone" rel="nofollow">1-800-ChatGPT</a>.Update 12/20/24 1:16pm ET: This story has been updated with further comment and detail from OpenAI.

OpenAI Upgrades Its Smartest AI Model With Improved Reasoning Skills

OpenAI今天宣布了其迄今最强大的人工智能模型的改进版本——这个模型在回答问题时需要更多时间进行深思熟虑,这是在谷歌宣布其fi...的一天之后。

<a href="https://www.wired.com/tag/openai/">OpenAI</a>今天宣布了其迄今最强大的<a href="https://www.wired.com/tag/artificial-intelligence/">人工智能</a>模型的改进版本 - 一个在回答问题时需要更多时间进行推理的模型,这是在<a href="https://www.wired.com/tag/google/">谷歌</a>宣布了其首个此类模型的一天之后。OpenAI的新模型称为o3,取代了此前推出的o1模型。与o1一样,新模型也会花时间思考问题,以提供更好的需要逐步逻辑推理的答案。(OpenAI选择跳过"o2"这个名称,因为它已经是英国一家移动运营商的名称。)"我们认为这是AI下一阶段的开始,"OpenAI首席执行官Sam Altman在周五的直播中说。"你可以使用这些模型来完成需要大量推理的越来越复杂的任务。"OpenAI表示,o3模型在几个指标上的得分都远高于其前身,包括测量复杂编码相关技能和高级数学和科学能力的指标。它在<a href="https://arcprize.org/">ARC-AGI</a>基准测试中的得分是o1的3倍,该基准旨在测试AI模型解决极其困难的数学和逻辑问题的能力。谷歌也在进行类似的研究。谷歌研究员Noam Shazeer昨天在X上发帖透露,该公司开发了自己的推理模型Gemini 2.0 Flash Thinking。谷歌首席执行官Sundar Pichai在他自己的帖子中称其为"我们迄今为止最周到的模型"。这两个对抗的模型显示,OpenAI和谷歌之间的竞争比以往任何时候都更加激烈。对于OpenAI来说,展示它能够不断取得进步至关重要,因为这将有助于吸引更多投资并建立一个盈利的业务。与此同时,谷歌也迫切需要证明自己仍然处于人工智能研究的前沿。这些新模型也表明,人工智能公司正越来越多地寻求超越简单地扩大AI模型规模,以从中挖掘出更高的智能。OpenAI表示,新模型有o3和o3-mini两个版本。该公司目前还没有公开这些模型,但表示将邀请外部人员进行测试。OpenAI今天还透露了更多关于对o1进行对齐的技术细节。这涉及让模型思考请求的性质,以检查是否可能违反其防护措施。大型语言模型可以很好地回答许多问题,但当被要求解决需要基本数学或逻辑的难题时,它们通常会失败。OpenAI的o1通过对逐步问题解决的训练,使AI模型能够更好地处理这类问题。能够推理问题的模型在公司寻求部署所谓的AI代理以可靠地解决复杂问题的过程中也将很重要。o3模型在SWE-Bench测试中的得分比o1高20%,该测试衡量模型的代理能力。"这真的标志着我们正在不断攀登效用的前沿,"OpenAI研究高级副总裁Mark Chen在今天的直播中说。"这个模型在编程方面非常出色,"Atlman补充道。尽管科技巨头在年底还没有取得真正的突破性时刻,但人工智能的发布速度一直令人眩晕。本月初,<a href="https://www.wired.com/story/google-gemini-2-ai-assistant-release/">谷歌宣布</a>了其旗舰模型Gemini 2.0的新版本,并展示了它作为网页浏览助手和通过智能手机或智能眼镜观察世界的助手的功能。OpenAI在圣诞节前夕也做出了多项公告,包括其视频生成模型的新版本、一个基于ChatGPT的免费搜索引擎,以及一种通过拨打<a href="https://help.openai.com/en/articles/10193193-1-800-chatgpt-calling-and-messaging-chatgpt-with-your-phone">1-800-ChatGPT</a>访问ChatGPT的方式。更新于2024年12月20日下午1:16:本文已更新,增加了来自OpenAI的更多评论和细节。

OpenAI 升级其最智能的 AI 模型，提升推理能力

原文作者：何浩，华尔街见闻
周四，美股大跌，纳指跌超 2%，部分交易员抛售贵金属以弥补股票市场的亏损，黄金、白银、铜、铂金、钯金大幅下挫。美元指数小幅上涨。
在外界再次担忧巨额人工智能投资能否真正大规模落地之际，美国科技股走低。金属价格在疑似算法交易抛售下突然下跌，一些投资者不得不退出包括金属在内的大宗商品仓位以获取流动性，也有部分资金转向美国国债避险。
现货黄金一度下跌 4.1%，白银暴跌 11...

黄金一度跌超4%、白银暴跌11%，美股大跌引爆算法交易贵金属卖盘？

美国领先的加密货币交易所Coinbase 刚刚公布了其 2025 年第四季度财报，财报显示，背景加密市场在年底大幅下滑，但该公司仍做出了重大调整。具体情况如下……

Coinbase报告的财报表现平平，导致其股价下挫。

世界自由金融公司（World Liberty Financial）的目标是在全球数兆美元的外汇市场——全球最大、流动性最强的金融领域——分一杯羹。

这家与川普政府有关的加密货币公司…