OpenAI today announced an improved version of its most capable artificial intelligence model to date—one that takes even more time to deliberate over questions—just a day after Google announced its fi...

<a href="https://www.wired.com/tag/openai/" rel="nofollow">OpenAI</a> today announced an improved version of its most capable <a href="https://www.wired.com/tag/artificial-intelligence/" rel="nofollow">artificial intelligence</a> model to date—one that takes even more time to deliberate over questions—just a day after <a href="https://www.wired.com/tag/google/" rel="nofollow">Google</a> announced its first model of this type.OpenAI’s new model, called o3, replaces o1, which the company <a href="https://www.wired.com/story/openai-o1-strawberry-problem-reasoning/%5D" rel="nofollow">introduced in September</a>. Like o1, the new model spends time ruminating over a problem in order to deliver better answers to questions that require step-by-step logical reasoning. (OpenAI chose to skip the “o2” moniker because it's already the name of a mobile carrier in the UK.)“We view this as the beginning of the next phase of AI,” said OpenAI CEO Sam Altman on a livestream Friday. “Where you can use these models to do increasingly complex tasks that require a lot of reasoning.”The o3 model scores much higher on several measures than its predecessor, OpenAI says, including ones that measure complex coding-related skills and advanced math and science competency. It is three times better than o1 at answering questions posed by <a href="https://arcprize.org/" rel="nofollow">ARC-AGI</a>, a benchmark designed to test an AI models’ ability to reason over extremely difficult mathematical and logic problems they’re encountering for the first time.Google is pursuing a similar line of research. Noam Shazeer, a Google researcher, yesterday <a href="https://x.com/NoamShazeer/status/1869789881637200228" rel="nofollow">revealed in a post on X</a> that the company has developed its own reasoning model, called Gemini 2.0 Flash Thinking. Google’s CEO, Sundar Pichai, called it “our most thoughtful model yet” in his <a href="https://x.com/sundarpichai/status/1869792088356991253" rel="nofollow">own post</a>.The two dueling models show competition between OpenAI and Google to be fiercer than ever. It is crucial for OpenAI to demonstrate that it can keep making advances as it seeks to attract more investment and build a profitable business. Google is meanwhile desperate to show that it remains at the forefront of AI research.The new models also show how AI companies are increasingly looking beyond simply scaling up AI models in order to wring greater intelligence out of them.OpenAI says there are two versions of the new model, o3 and o3-mini. The company is not making the models publicly available yet but says it will invite outsiders to apply to perform testing of them. OpenAI today also revealed more details of techniques used to align o1. This involves having the model reason about the nature of the request it is given to interrogate whether it may contravene its guardrails.Large language models can answer many questions remarkably well, but they often stumble when asked to solve puzzles that require basic math or logic. OpenAI’s o1 incorporates training on step-by-step problem-solving that makes an AI model better able to tackle these types of problems.Models that reason over problems will also be important as companies seek to deploy so-called AI agents that can reliably figure out how to solve complex problems on a users’ behalf. The o3 model is 20 percent better than o1 at a SWE-Bench, a test that measures a models’ agentic abilities.“This really signifies that we are really climbing the frontier of utility,” Mark Chen, senior vice president of research at OpenAI said on today’s livestream.“This model is incredible at programming,” Atlman added.While a true breakthrough moment has eluded tech giants at the end of the year, the pace of AI announcements has been dizzying of late.Early this month <a href="https://www.wired.com/story/google-gemini-2-ai-assistant-release/" rel="nofollow">Google announced</a> a new version of its flagship model, called Gemini 2.0, and demonstrated it as a web browsing helper and as an assistant that sees the world through a smartphone or a pair of smart glasses.OpenAI has made numerous announcements in the run up to Christmas, including a new version of its video-generating model, a free version of its ChatGPT-powered search engine, and a way to access ChatGPT over the phone by calling <a href="https://help.openai.com/en/articles/10193193-1-800-chatgpt-calling-and-messaging-chatgpt-with-your-phone" rel="nofollow">1-800-ChatGPT</a>.Update 12/20/24 1:16pm ET: This story has been updated with further comment and detail from OpenAI.

OpenAI Upgrades Its Smartest AI Model With Improved Reasoning Skills

OpenAI今天宣佈了其迄今最強大的人工智慧模型的改進版本——這個模型在回答問題時需要更多時間進行深思熟慮,這是在谷歌宣佈其fi...的一天之後。

<a href="https://www.wired.com/tag/openai/">OpenAI</a>今天宣佈了其迄今最強大的<a href="https://www.wired.com/tag/artificial-intelligence/">人工智慧</a>模型的改進版本 - 一個在回答問題時需要更多時間進行推理的模型,這是在<a href="https://www.wired.com/tag/google/">谷歌</a>宣佈了其首個此類模型的一天之後。OpenAI的新模型稱為o3,取代了此前推出的o1模型。與o1一樣,新模型也會花時間思考問題,以提供更好的需要逐步邏輯推理的答案。(OpenAI選擇跳過"o2"這個名稱,因為它已經是英國一家移動運營商的名稱。)"我們認為這是AI下一階段的開始,"OpenAI執行長Sam Altman在週五的直播中說。"你可以使用這些模型來完成需要大量推理的越來越複雜的任務。"OpenAI表示,o3模型在幾個指標上的得分都遠高於其前身,包括測量複雜編碼相關技能和高階數學和科學能力的指標。它在<a href="https://arcprize.org/">ARC-AGI</a>基準測試中的得分是o1的3倍,該基準旨在測試AI模型解決極其困難的數學和邏輯問題的能力。谷歌也在進行類似的研究。谷歌研究員Noam Shazeer昨天在X上發帖透露,該公司開發了自己的推理模型Gemini 2.0 Flash Thinking。谷歌執行長Sundar Pichai在他自己的帖子中稱其為"我們迄今為止最周到的模型"。這兩個對抗的模型顯示,OpenAI和谷歌之間的競爭比以往任何時候都更加激烈。對於OpenAI來說,展示它能夠不斷取得進步至關重要,因為這將有助於吸引更多投資並建立一個盈利的業務。與此同時,谷歌也迫切需要證明自己仍然處於人工智慧研究的前沿。這些新模型也表明,人工智慧公司正越來越多地尋求超越簡單地擴大AI模型規模,以從中挖掘出更高的智慧。OpenAI表示,新模型有o3和o3-mini兩個版本。該公司目前還沒有公開這些模型,但表示將邀請外部人員進行測試。OpenAI今天還透露了更多關於對o1進行對齊的技術細節。這涉及讓模型思考請求的性質,以檢查是否可能違反其防護措施。大型語言模型可以很好地回答許多問題,但當被要求解決需要基本數學或邏輯的難題時,它們通常會失敗。OpenAI的o1透過對逐步問題解決的訓練,使AI模型能夠更好地處理這類問題。能夠推理問題的模型在公司尋求部署所謂的AI代理以可靠地解決複雜問題的過程中也將很重要。o3模型在SWE-Bench測試中的得分比o1高20%,該測試衡量模型的代理能力。"這真的標誌著我們正在不斷攀登效用的前沿,"OpenAI研究高階副總裁Mark Chen在今天的直播中說。"這個模型在程式設計方面非常出色,"Atlman補充道。儘管科技巨頭在年底還沒有取得真正的突破性時刻,但人工智慧的釋出速度一直令人眩暈。本月初,<a href="https://www.wired.com/story/google-gemini-2-ai-assistant-release/">谷歌宣佈</a>了其旗艦模型Gemini 2.0的新版本,並展示了它作為網頁瀏覽助手和透過智慧手機或智慧眼鏡觀察世界的助手的功能。OpenAI在聖誕節前夕也做出了多項公告,包括其影片生成模型的新版本、一個基於ChatGPT的免費搜尋引擎,以及一種透過撥打<a href="https://help.openai.com/en/articles/10193193-1-800-chatgpt-calling-and-messaging-chatgpt-with-your-phone">1-800-ChatGPT</a>訪問ChatGPT的方式。更新於2024年12月20日下午1:16:本文已更新,增加了來自OpenAI的更多評論和細節。

OpenAI 升級其最智能的 AI 模型，提升推理能力

該交易員總計 48 次預測 15 分鐘市場漲跌，1 天淨賺 8 萬美元。

1 天賺 8 萬，這位頂級玩家把 Polymarket 當提款機

撰文：馬赫，Foresight News
最成功的公鏈之一 Solana 也迎來寒冬時刻。
自 2 月 5 日市場暴跌之際，Solana 代幣 SOL 一度跌至 67 美元，創下自 2023 年 12 月以來的新低，而截至目前，SOL 最新報價為 80 美元，24 小時跌幅為 3.57%。從月線上看，SOL 自 2025 年 10 月高位已連續下跌 5 個月，期間最大跌幅超 71%。該生態最出名的...

Meme退潮，敘事趨冷：失守80美元，Solana的週期紅利結束了？

原文作者：何浩，華爾街見聞
週四，美股大跌，納指跌超 2%，部分交易員拋售貴金屬以彌補股票市場的虧損，黃金、白銀、銅、鉑金、鈀金大幅下挫。美元指數小幅上漲。
在外界再次擔憂鉅額人工智能投資能否真正大規模落地之際，美國科技股走低。金屬價格在疑似算法交易拋售下突然下跌，一些投資者不得不退出包括金屬在內的大宗商品倉位以獲取流動性，也有部分資金轉向美國國債避險。
現貨黃金一度下跌 4.1%，白銀暴跌 11...