OpenAI's GPT-5.3-Codex-Spark is now available: Pro users get early access and faster responses.


OpenAI recently announced a collaboration with AI chip startup Cerebras to launch GPT-5.3-Codex-Spark, a smaller variant of GPT-5.3-Codex and OpenAI's first model designed specifically for "real-time programming." It is initially available to ChatGPT Pro users, letting developers try it firsthand.

What is Cerebras, and what is motivating the two companies' collaboration?

OpenAI faces the dual pressures of rapid user growth and limited compute, and urgently needs ultra-low-latency AI inference capacity for real-time interactive scenarios, in order to improve responsiveness across products such as ChatGPT, code generation, and AI agents.

Cerebras's wafer-scale chips eliminate the communication bottlenecks of traditional GPU clusters, delivering faster and more efficient inference. OpenAI and Cerebras have therefore entered a multi-year agreement worth over US$10 billion, procuring up to 750 MW of low-latency compute. The goal is to accelerate complex queries, code generation, and real-time interactive experiences, while also reducing reliance on NVIDIA and strengthening supply-chain resilience.

The collaboration will roll out in phases, with infrastructure build-out starting in 2026 and full deployment by 2028. Cerebras will host the data-center capacity and provide OpenAI with dedicated ultra-low-latency compute, which is already serving inference for the first collaborative model, GPT-5.3-Codex-Spark.

Codex-Spark is designed for real-time collaborative programming, with a dual-track automation mechanism.

OpenAI states that its recent frontier models can autonomously execute complex tasks for extended periods, running for hours, days, or even weeks without human intervention. Codex-Spark, by contrast, is the first model designed specifically for "real-time collaborative programming with Codex": developers can request code changes, logic adjustments, and interface tweaks and see the results immediately. Together, these represent the two automation modes Codex currently offers:

"One type is long-term, long-task automated execution, and the other type is real-time interaction, rapid modification, and instant feedback."

OpenAI said it will gradually expand functionality and access based on feedback from real-world developer usage.

Low-latency resources are limited, and traffic throttling may occur during peak hours.

During the research preview, Codex-Spark offers a 128k context window, supports text input only, and has independent traffic and rate limits that do not consume the standard models' quota. OpenAI also cautions that, because it runs on specialized low-latency compute, queuing or temporary access restrictions may occur during peak periods to preserve overall service stability.
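As a rough illustration only, below is a minimal sketch of what calling such a model through the OpenAI Python SDK could look like. The model identifier "gpt-5.3-codex-spark" is an assumption derived from the product name, not a confirmed API id; during the research preview, API access is limited to design partners.

```python
# Minimal sketch (not an official example): requesting a quick code edit
# from a low-latency Codex-style model via the OpenAI Python SDK.
# The model id "gpt-5.3-codex-spark" is a guess based on the product name;
# during the research preview, API access is limited to design partners.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-5.3-codex-spark",  # hypothetical identifier
    messages=[
        {
            "role": "user",
            "content": "Rename the variable tmp to buffer in this function and show a diff.",
        }
    ],
)
print(response.choices[0].message.content)
```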

Codex-Spark optimizes interactive programming, balancing speed and capability.

Codex-Spark is optimized for interactive programming scenarios, on the premise that speed matters as much as capability. Users can interrupt the model mid-run, redirect it in real time, and iterate on changes quickly.
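That interaction pattern maps naturally onto token streaming: the client renders output as it arrives and can abandon the stream the moment the user interrupts. Here is a minimal sketch, again with a hypothetical model id and a placeholder interrupt check:

```python
# Sketch of stream-and-interrupt interaction. The model id is hypothetical,
# and user_interrupted() is a placeholder for a real client's UI event check.
from openai import OpenAI

client = OpenAI()

def user_interrupted() -> bool:
    # A real editor client would poll for a keypress or UI event here.
    return False

stream = client.chat.completions.create(
    model="gpt-5.3-codex-spark",  # hypothetical identifier
    messages=[{"role": "user", "content": "Refactor this loop into a list comprehension."}],
    stream=True,
)

for chunk in stream:
    # Render each token fragment as it arrives (content can be None on some chunks).
    print(chunk.choices[0].delta.content or "", end="", flush=True)
    if user_interrupted():
        break  # stop consuming tokens; send a follow-up request with new instructions
```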

To keep responses fast, the system defaults to a lightweight workflow: it makes only the minimum necessary modifications and does not run tests automatically unless the user explicitly asks. Official examples include building a Snake game, planning projects, and translating files. One official example emphasizes:

"When making games, GPT-5.3-Codex-Spark has surpassed its previous model, GPT-5.3-Codex, in terms of coding capabilities and speed."

Performance-oriented design: software optimization paired with low-latency chips.

OpenAI says Codex-Spark significantly reduces overall task-completion time by optimizing the whole path from request submission to response: client-server round-trip overhead is down roughly 80%, per-token processing cost is down roughly 30%, and the time for the first response text to appear after a user submits a request is down roughly 50%, markedly improving interaction smoothness.
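To see how such reductions might compound, here is a back-of-the-envelope calculation using a deliberately simplified additive latency model. All absolute baseline numbers are invented for illustration; OpenAI published only the relative percentages.

```python
# Back-of-the-envelope: how the quoted reductions compound on an assumed
# baseline. All absolute numbers are illustrative; OpenAI reported only
# the relative improvements (~80%, ~30%, ~50%). This treats the three
# components as independent additive terms, which is a simplification.
round_trip_ms = 200.0     # assumed baseline client-server round-trip overhead
per_token_ms = 10.0       # assumed baseline processing cost per token
first_token_ms = 1000.0   # assumed baseline time to first response text
tokens = 500              # assumed response length

before = round_trip_ms + first_token_ms + tokens * per_token_ms
after = round_trip_ms * 0.2 + first_token_ms * 0.5 + tokens * per_token_ms * 0.7

print(f"before: {before:.0f} ms, after: {after:.0f} ms "
      f"({(1 - after / before) * 100:.0f}% faster end to end)")
# before: 6200 ms, after: 4040 ms (35% faster end to end)
```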

On the hardware side, Codex-Spark runs on Cerebras's Wafer Scale Engine 3 low-latency inference platform, integrated into OpenAI's existing production stack. OpenAI explains that GPUs remain the workhorse for training and large-scale, cost-effective inference, while Cerebras covers the ultra-low-latency scenarios; the two can be combined within the same workflow.
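As an illustration of that division of labor, the sketch below routes latency-sensitive interactive turns to a low-latency tier and long-running tasks to GPU capacity. The backend labels and the routing rule are invented for this example; OpenAI has not described how requests are actually dispatched.

```python
# Illustrative routing sketch: one workflow mixing two inference tiers.
# "cerebras-low-latency" and "gpu-batch" are made-up backend labels;
# OpenAI has not published how requests are actually routed.
from dataclasses import dataclass

@dataclass
class Request:
    prompt: str
    interactive: bool  # True for real-time editing turns, False for long tasks

def pick_backend(req: Request) -> str:
    # Latency-sensitive interactive turns go to the ultra-low-latency tier;
    # long-running autonomous tasks stay on cost-effective GPU capacity.
    return "cerebras-low-latency" if req.interactive else "gpu-batch"

print(pick_backend(Request("rename this variable", interactive=True)))    # cerebras-low-latency
print(pick_backend(Request("migrate the whole repo", interactive=False)))  # gpu-batch
```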

Codex-Spark is currently available to ChatGPT Pro users as a research preview, while API access is limited to a small group of design partners for testing. On safety, it has passed OpenAI's standard evaluations and does not reach the internal high-risk capability thresholds. Going forward, OpenAI plans to evolve toward a dual-mode approach that gradually integrates real-time interaction with long-running tasks.


This article, "OpenAI GPT-5.3-Codex-Spark Launched: Pro Users Get Early Access and Faster Responses," first appeared on ABMedia .

Source
Disclaimer: The content above is only the author's opinion which does not represent any position of Followin, and is not intended as, and shall not be understood or construed as, investment advice from Followin.
Like
Add to Favorites
Comments