OpenAI launches o3 model! Reasoning capabilities are further improved, paving the way for the next generation of AI

12-21

This article is machine translated

Show original

Here is the English translation of the text, with the specified terms preserved:

Table of Contents

The developer behind ChatGPT, OpenAI, has concluded a 12-day series of new product launches, with the grand finale being the introduction of the new reasoning models "o3" and "o3-mini". These AI models possess stronger reasoning capabilities, aimed at solving complex tasks that require step-by-step logical reasoning.

Model Highlights

1) Achieving SoTA Performance in Reasoning Ability

OpenAI stated that the o3 model has performed exceptionally well in various benchmark tests, including complex programming, mathematics, and scientific problems, demonstrating its strong logical reasoning capabilities.

In the "ARC-AGI" evaluation developed by the Alignment Research Center (ARC) to test the general artificial intelligence (AGI) capabilities of AI systems, o3 achieved a breakthrough score of 75.7% in some non-public tests, setting a new State of the Art (SoTA) record.

Furthermore, a high-compute configuration version of o3 achieved an even higher score of 87.5% in the same test, but may not have qualified for the ARC-AGI-Pub (publicly verifiable ARC-AGI test results) due to the resource requirements of this version exceeding the standard.

2) Multiple Version Options

OpenAI is offering two versions, o3 and o3-mini, with the latter expected to be released by the end of January 2025, and the full o3 model to be released later (without a specific timeline).

This new model utilizes OpenAI's recently introduced Adaptive Thinking Time API, providing low, medium, and high reasoning modes. This feature allows users to adjust the "thinking" time the model takes to answer questions based on their needs. As shown in the image below, the o3-mini is able to match the reasoning performance of the current o1 model, while significantly reducing the computational cost.

3) Enhanced Security

OpenAI has adopted a new "Deliberative Alignment" training method, directly teaching large language models (LLMs) to understand human-written, interpretable safety rules, and ensuring compliance with these rules before generating responses. OpenAI stated in the announcement:

Through this approach, we have successfully optimized the OpenAI o-series models to utilize Chain-of-Thought (CoT) reasoning techniques, reflecting on the user's query, identifying relevant policy texts within OpenAI, and generating safer responses.

Origin of the Name

It is worth noting that OpenAI skipped the "o2" naming and went directly to "o3". CEO Sam Altman explained that this was to avoid confusion with the British telecommunications provider O2, while also showcasing OpenAI's unique sense of humor. He stated in the live stream:

"Out of respect for Telefónica (the parent company of O2), and to continue OpenAI's excellent tradition of being terrible at naming things, we're calling it o3."

Invitation for Researchers to Participate in Security Testing

Currently, o3 and o3-mini are in the internal security testing phase, and OpenAI has opened applications to invite external researchers to participate in the security testing. The application will close on January 10, 2025.

Regarding the release of this model, Sam Altman boldly stated that it marks the official entry of AI development into the "next stage".

Recalling the AI hierarchy chart leaked by Bloomberg earlier this year, the next stage after chatbots and reasoning models is Agents - advanced AI systems that can take actions on behalf of users. This is the current focus of exploration and development, not only in the cryptocurrency market but also in the Web2 domain.

TOD18FoIAd | <ABMedia>-The Most Influential Blockchain News Media — OpenAI's AI hierarchy system. Source: Bloomberg

Source

Disclaimer: The content above is only the author's opinion which does not represent any position of Followin, and is not intended as, and shall not be understood or construed as, investment advice from Followin.

Add to Favorites

Comments

Relevant content

Blockbeats

$55,000 will be the lifeline for Bitcoin.

BTC

0.5%

All-in station

The Vietnamese government is pushing for the operation of a gold and cryptocurrency exchange by February 28, 2026.

BlockTempo

Machi Big Brother lost so much money he couldn't sleep all night? Ethereum fell below 2000, causing panic. He long positions in ETH and HYPE, then sold at a loss, losing all 120,000 in magnesium.

ETH

0.97%