Sora has fallen from grace. Is the release of GPT-5 a redemption or a subversion?

07-25

In today's fast-paced life, short videos have become one of the main ways people fill their spare moments and relieve stress. As audiences look for ever shorter and faster experiences, "short dramas" have become a very popular form of content.

In 2023 alone, the market size of China's online short dramas reached 37.39 billion yuan, a year-on-year increase of 267.65%. In addition, public data from Douyin in 2024 showed that its short dramas had more than 100 million daily unique users.

Lei Jun, chairman and CEO of Xiaomi, also said recently: "Short dramas seem to have opened up a new world. They are faster, more satisfying, and more watchable than feel-good web novels."

As short dramas gain popularity, some creators have also discovered the value of AI in producing them. The first AIGC original fantasy short drama, "Shan Hai Qi Jing", has quickly become popular on major video platforms since its launch on July 13, and has been viewed more than 10 million times on Kuaishou. Through the clever use of AI technology, the mythological characters and strange creatures described in "Shan Hai Jing" have been transformed from text into vivid images on the screen. Its realistic, fluid visuals have overturned earlier stereotypes about the quality of AI video production.

In addition, "Sanxingdui: Future Revelation", produced by Bona Film Group's AIGMS Production Center, has also achieved remarkable results and received a positive response since its launch. Bona Film Group CEO Jiang Defu said that Bona adopted an industrialized film production process and used AI to produce this short drama. The aim is to draw on the company's mature film experience to raise the technical quality of AI short dramas and to tell Chinese stories well in the AI short drama format.

It can be said that AI short dramas broke into the mainstream with "the right time, the right place, and the right people" on their side. From production tools to platforms to audiences, a complete ecosystem has created fertile soil for their development.

The success of these works is not only a technological breakthrough but also a microcosm of how multimodal large models are being applied in artistic creation. These works demonstrate AI's capabilities in processing images and sound, and they also achieve a deep understanding and innovative expression of cultural elements through deep learning and natural language processing.

01 Expectations lowered: what can OpenAI do to save itself?

In this thriving scene, one can’t help but recall the former “concept god” - Sora.

As a new generative video model released by OpenAI, Sora did cause an unprecedented sensation when it first appeared. When OpenAI officially unveiled Sora in February, the global Internet and social media were instantly stunned by its capabilities, as if the glorious moment of the GPT-3.5 release were being recreated.

Upon release, Sora quickly became the focus of the technology community thanks to three core advantages. First, its ability to generate videos up to 60 seconds long, breaking through the 4-second continuity bottleneck of earlier AI video generation models, amazed both the industry and the public. Second, Sora supports multi-angle shots and can also produce smooth single-take sequences. Third, the generated images convincingly render the light and shadow relationships, physical occlusion, and collision effects in a scene, making the video content more vivid and realistic.

At that time, Sora was also regarded by OpenAI as a "world simulator", not just a video generation model, but also an intelligent tool that can understand and simulate the physical laws of the real world.

In the early days of its release, people were amazed at the technological innovation and convenience brought by Sora. Many professionals predicted that Sora would become a revolution in the field of video production and completely change the traditional way of video production.

However, to this day, Sora is still preparing for its official launch, including adversarial testing, in which it is rigorously tested by a red team composed of experts in various fields to identify and mitigate potential risks such as misinformation, hateful content, and bias.

At the same time, OpenAI is also allowing visual artists, designers, and filmmakers to access Sora in advance to collect feedback and improve the model, especially for the needs of creative professionals. To increase transparency and security, OpenAI is developing tools that can detect misleading content generated by Sora and plans to include C2PA metadata in the model. In addition, the company is working with policymakers, educators, and artists around the world to understand their concerns and identify positive use cases for Sora. These activities led to the delayed release of Sora.

As time went by, Sora's application did not advance as quickly as expected. Although OpenAI has made great breakthroughs in technology, it has not been able to transform this technology into a practical product and bring it to market.

For the majority of users, this contrast is undoubtedly a source of disappointment and anxiety. On the one hand, Sora could quickly change the landscape of video production, lower the threshold for creation, and allow more people to easily produce high-quality video content. On the other hand, its rollout has been slow, a reality far "skinnier" than the ideal.

Sora's dilemma is not just a matter of delayed or inadequate implementation. More deeply, it reflects the general challenges that AI technology faces in the process of commercialization. From algorithm optimization to data processing, from cultivating user habits to improving market acceptance, each step requires careful polishing and time to mature. In this fast-paced era, the mismatch between users' desire for instant gratification and the maturity curve of AI technology often leads to a huge gap between expectations and reality.

02 It is easy to build an empire, but difficult to maintain it. GPT-5 has gone from technology worship to trust crisis

While Sora remains in seclusion, the sudden release of GPT-4o mini has stirred up public discussion again. Some netizens joked, "GPT-3.5 has been laid off, will GPT-5 be far behind? Altman: Yes!" Although the release of GPT-5 still looks like a mirage, most people continue to believe in OpenAI's technical strength.

However, competition and change in the field of AI are also becoming increasingly fierce. More and more companies and research institutions are joining the research, development, and application of AI technology, and many AI products in vertical fields are emerging as well, winning the favor of users with more precise positioning and more personalized services.

In contrast, OpenAI's appeal in the industry seems to have weakened, and its "dominant" situation has become increasingly difficult to maintain.

Take OpenAI's decision to officially stop providing API services to China and other regions on the 9th of this month. It was originally thought to signal a new technological monopoly, but contrary to expectations it caused little stir in the country.

Facing the "supply cut" from OpenAI, domestic enterprises responded quite proactively. As soon as the news broke, large model companies such as Zhipu AI, Baidu, Alibaba, and Tencent launched "relocation plans" for API services and began to absorb customers who had previously used OpenAI's API by reducing prices and simplifying processes.

We do not need to work out why OpenAI chose to give up the Chinese market. The performance of domestic large model vendors is enough to show that, given the market environment and the conditions for deploying large models, domestic models could well become users' first choice.

In the so-called "first year of big models", the conversation was about model scale and model capabilities. After just one year of rapid technological growth, companies have begun to think about how to put the technology into practice and commercialize it. The recent burst of products such as Kuaishou's Kling (Keling) and SenseTime's Vimi is a microcosm of this shift toward implementation. Continuous innovation has become the cornerstone of corporate survival and development.

Big Model Home believes that for OpenAI, continuous innovation means constantly exploring new areas of artificial intelligence, pushing the boundaries of technology, and creating products that can truly solve real-world problems. The launch of GPT-5 should not be a simple upgrade of the previous generation of products, but a qualitative leap that maintains OpenAI's leadership in the field of artificial intelligence.

03 Postscript: Can multimodality become a new opportunity to overtake rivals?

The popularity of AI short dramas is undoubtedly a striking phenomenon, but it is only the tip of the iceberg of the development of the domestic multimodal field. This phenomenon is far from an isolated display of technological progress, but a comprehensive manifestation of the deep integration of technological innovation and local culture, the precise capture of market demand, and the coordinated development of the entire industrial chain.

We should look beyond the specific phenomenon of AI short dramas. The deep integration of this technological innovation with local culture, market demand, and the industrial ecosystem is precisely China's key advantage in the field of multimodal artificial intelligence. Whether in accurate diagnosis in healthcare, the intelligent transformation of education, or the rapid development of intelligent manufacturing and Industry 4.0, multimodal artificial intelligence plays a vital role in creating new quality productive forces.

Domestic large model vendors have shown flexibility and innovation in responding to the market. Their launch of trendy, high-quality content products not only consolidates their competitive advantage but also injects strong momentum into the sustainable development of the entire multimodal field.

Multimodal artificial intelligence is like a new starting point for the big model competition. It will not only become the core driving force for innovation and upgrading in all walks of life, but also a key factor in shaping the new global economic landscape.

This article comes from the WeChat public account "Big Model Home", author: Wang Haoda, and is authorized to be published by 36Kr.

Disclaimer: The content above is only the author's opinion which does not represent any position of Followin, and is not intended as, and shall not be understood or construed as, investment advice from Followin.
