Humanoid robots await their ChatGPT moment

The AI wave continues to surge.

The robotics sector is a prime example. At a recent press conference for the 2025 China Robotics Industry Development Conference, hosted by the China Machinery Industry Federation and other organizations, newly released data showed that revenue in China's robotics industry grew from 106.1 billion yuan in 2020 to 237.89 billion yuan in 2024. In the first three quarters of 2025, industry revenue rose 29.5% year-on-year, with industrial robot production reaching 595,000 units and service robot production reaching 13.5 million units. Both production figures already exceed the full-year totals for 2024.
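To put the revenue figures in perspective, the short sketch below derives the average annual growth rate they imply. It assumes simple year-on-year compounding over the four years from 2020 to 2024; the function name `cagr` is chosen here for illustration and does not come from the conference data.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1 / years) - 1


# Revenue figures quoted above, in billions of yuan.
revenue_2020 = 106.1
revenue_2024 = 237.89

growth = cagr(revenue_2020, revenue_2024, years=4)
print(f"Implied average annual growth, 2020-2024: {growth:.1%}")
```

Under that assumption, the implied rate comes out at roughly 22% per year, somewhat below the 29.5% year-on-year growth reported for the first three quarters of 2025.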

As a key carrier and core application area of AI, robots are injecting new momentum into industrial transformation. This trend is more commonly described as "embodied intelligence," which refers to intelligent agents with physical bodies that perform tasks in the real world through perception, decision-making, and interaction, and that keep evolving through interaction with their environment. Embodied intelligence is already carrying AI from algorithmic models into the physical world, expanding the boundaries of AI applications and opening more possible routes toward general AI.

By definition, embodied intelligence encompasses not only humanoid and other forms of robots, but also drones and smart cars equipped with AI models. Within embodied intelligence, the humanoid robot sector is particularly noteworthy. From overseas giants like Figure AI and Tesla to domestic companies like Unitree Robotics and Zhiyuan Robotics, players around the world are driving the industry forward at a rapid pace.

On October 29, 2025, Norwegian technology company 1X released its home humanoid robot NEO, opening pre-orders for approximately $20,000 (approximately RMB 142,000) or a monthly subscription fee of $499 (approximately RMB 3,500), with delivery planned for 2026. Meanwhile, Unitree Robotics brought its humanoid robot to the "Double Eleven" shopping festival, selling it on JD.com for RMB 29,900.
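As a rough illustration of how the two NEO pricing options compare, the sketch below works out when cumulative subscription fees would pass the purchase price. It uses only the two dollar figures quoted above and ignores deposits, financing, resale value, and any future price changes; the names `breakeven_month`, `PURCHASE_USD`, and `SUBSCRIPTION_USD` are illustrative.

```python
import math

PURCHASE_USD = 20_000   # approximate pre-order price quoted above
SUBSCRIPTION_USD = 499  # monthly subscription fee quoted above


def breakeven_month(purchase: float, monthly: float) -> int:
    """First month in which cumulative subscription fees exceed the purchase price."""
    return math.floor(purchase / monthly) + 1


month = breakeven_month(PURCHASE_USD, SUBSCRIPTION_USD)
print(f"Subscription overtakes purchase in month {month} "
      f"({month * SUBSCRIPTION_USD:,} USD paid)")
```

On these numbers the subscription becomes the more expensive option in month 41, a little under three and a half years in.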

Since ChatGPT ignited the global AI craze in November 2022, AI has rapidly entered the public eye, transforming from an unattainable high-end technology into a tool that everyone can use. According to the "Generative Artificial Intelligence Application Development Report (2025)" released by CNNIC (China Internet Network Information Center), as of June 2025, the number of generative AI users in China had reached 515 million, with a penetration rate of 36.5%.

The development of generative artificial intelligence, also known as generative AI or AIGC, has spurred the growth of related fields, particularly the embodied intelligence industry, bringing the human-robot coexistence depicted in science fiction works like "I, Robot" and "WALL-E" closer to reality. As a result, tech giants are investing heavily and startups are vying to enter the market. In this race for the future of technology, players are striving to build durable competitive advantages and to be the first to create a "ChatGPT moment" for humanoid robots.

01 Solving motion problems

The evolution of humanoid robots is happening at an unprecedented pace.

Robot stage performances have been especially eye-catching. During the CCTV Spring Festival Gala in January 2025, Unitree Robotics' H1 robots performed "Yang BOT," twisting their bodies and twirling handkerchiefs to the rhythm, all relatively simple movements. By October 2025, in the curtain call of the dance drama "Tiangong Kaiwu," Unitree's robots were accurately replicating the dancers' postures and performing fluid side flips and backflips to complete a "human-robot dance."

Videos of Unitree Robotics' robots performing have spread rapidly on platforms such as Douyin and Kuaishou, accumulating over 1.3 million likes. One user commented that the robots, whose movements seemed somewhat uncoordinated at the beginning of the year, had become so well synchronized within a few months that it was as if they had "learned from a martial arts manual."

The breakthrough in robotics is the result of decades of continuous technological development.

Alan Turing, one of the founders of AI, proposed in his 1950 paper that intelligence must be formed through dynamic interaction between a physical entity and the outside world. Because of technological limitations, however, robots remained far from true embodied intelligence for more than half a century.

During the 2011 Fukushima nuclear power plant accident, no mature robots with practical operational capabilities could be found at the rescue site. The limited equipment frequently became trapped in the complex radiation environment and even tripped over scattered cables, making it difficult to perform critical tasks. Following this, DARPA (Defense Advanced Research Projects Agency) announced a robotics challenge aimed at promoting the development of disaster relief robotics technology.

The first DARPA Robotics Challenge was launched in October 2012, and the winner was not determined until June 2015. The final round required robots to perform tasks such as driving to the mission area, getting out of the vehicle unaided, opening doors, closing valves, and using tools to cut openings. Most of the participating robots were clumsy and fell frequently, and many were unable to complete all the tasks. The champion was HUBO, a robot developed by KAIST of South Korea, which could move on wheels rather than on two legs to maintain speed and balance. The runner-up was an Atlas, a robot developed by Boston Dynamics.

At the time, video of the finals sparked heated public discussion: the robots were slow and made many mistakes, a far cry from the responsive, intelligent assistants the public had expected.

Boston Dynamics, founded in 1992, was a pioneer of the humanoid robot industry and long a leading global player. As early as 2017, its Atlas robot demonstrated a backflip. However, Atlas initially used a hydraulic drive system, which offered high strength and precision but also suffered from high energy consumption, noise, and cost, making commercialization difficult. Boston Dynamics was acquired by Google in 2013, transferred to SoftBank in 2017, and then acquired by Hyundai in 2021. During its time with SoftBank, Boston Dynamics launched its robot dog Spot at approximately $75,000 (approximately RMB 530,000) and sold only about 400 units.

The somersault is considered a key milestone in the development of robotics technology because it systematically integrates and promotes progress in several core areas, including robot hardware design, dynamic control, and real-time decision-making.

According to Haike Finance, completing a somersault requires the robot's drive system to deliver very high power density in an instant, with the peak load lasting only a very short time. The control system must solve the six-degree-of-freedom equations of motion in real time, covering translation forward and back, left and right, and up and down, plus rotation about three axes. An angular deviation of more than 0.5 degrees may cause the robot to lose balance on landing. The ankle, knee, and hip joints must absorb the impact, which requires the foot force sensors to detect the ground reaction force and respond within 0.01 seconds.
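The two numbers in this description can be read as simple engineering budgets. The sketch below is a toy illustration under that reading, not a real controller: it represents a six-degree-of-freedom pose, checks a landing pose against the 0.5-degree figure, and converts the 0.01-second response time into a minimum loop rate. The class `Pose6DoF`, the helper names, and the sample poses are all hypothetical, and treating 0.5 degrees as a hard pass/fail threshold is a simplification of the article's "may cause."

```python
from dataclasses import dataclass

MAX_ANGLE_ERROR_DEG = 0.5  # deviation figure quoted in the article
RESPONSE_BUDGET_S = 0.01   # force-sensor response time quoted in the article


@dataclass
class Pose6DoF:
    """Six degrees of freedom: three translations (m), three rotations (deg)."""
    x: float
    y: float
    z: float
    roll: float
    pitch: float
    yaw: float


def max_angle_error(target: Pose6DoF, actual: Pose6DoF) -> float:
    """Largest absolute error over the three rotational axes (no wrap-around handling)."""
    return max(
        abs(actual.roll - target.roll),
        abs(actual.pitch - target.pitch),
        abs(actual.yaw - target.yaw),
    )


def landing_ok(target: Pose6DoF, actual: Pose6DoF,
               tol: float = MAX_ANGLE_ERROR_DEG) -> bool:
    """True if every rotational error is within the tolerance."""
    return max_angle_error(target, actual) <= tol


def min_loop_rate_hz(budget_s: float = RESPONSE_BUDGET_S) -> float:
    """Lowest sense-and-respond loop rate that fits one cycle inside the budget."""
    return 1.0 / budget_s


# Hypothetical poses: an upright target and two measured landings.
upright = Pose6DoF(0.0, 0.0, 0.9, 0.0, 0.0, 0.0)
good = Pose6DoF(0.02, 0.0, 0.9, 0.1, -0.3, 0.2)
bad = Pose6DoF(0.02, 0.0, 0.9, 0.1, 0.8, 0.2)

print(landing_ok(upright, good), landing_ok(upright, bad))
print(f"Minimum loop rate: {min_loop_rate_hz():.0f} Hz")
```

Read this way, a 0.01-second response budget implies a sensing and control loop running at 100 Hz or faster.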

New players like Unitree Robotics have abandoned hydraulic drive in favor of purely electric drive. Self-developed high-torque motors and lightweight structural design overcome the earlier problem of electric drive delivering less power than hydraulic drive, and strike a balance between performance and cost. For example, the Unitree G1 uses 23 to 43 joint motors, depending on configuration, with a maximum joint torque of 120 N·m, which allows it to stay stable even in maneuvers like side somersaults that demand very precise control of lateral inertia.

Boston Dynamics also launched an electric version of Atlas in April 2024, marking wider acceptance of electric drive technology. In February 2025, the Chinese company EngineAI (Zhongqing Robotics) completed what it described as the world's first robotic front flip, a significant technological breakthrough. Compared with the backflip, which was more common in the previous stage, the front flip places higher demands on the robot's dynamic balance, instantaneous explosive force, and precise landing control.

02 Where does intelligence come from?

Breakthroughs in difficult moves such as somersaults matter far beyond their value as technical demonstrations.

These actions systematically verify the maturity of the overall control system and key components, laying the foundation for applying robots in complex real-world environments. In a public demonstration in September 2025, the Unitree G1 reacted quickly to repeated pushing and kicking and returned to a standing position each time, demonstrating considerable motor intelligence.

This marks another acceleration in the process of robots moving from the laboratory to the complex real world.

Since the breakthrough in AIGC in 2022 and Tesla's unveiling of the Optimus robot prototype, the global humanoid robot industry has entered a period of rapid development. A research report released by Guotai Haitong Securities in November 2025, citing multiple data sources, shows that 104 humanoid robot companies were registered in China in 2024, a year-on-year increase of 104%. Humanoid robots are also a hot spot for investment and financing. From January to July 2025, the domestic humanoid robot industry saw 101 financing deals raising over 26 billion yuan, exceeding the total for all of 2024. Before 2024, the industry was in an experimental testing phase, with products existing as prototypes, mostly in runs of fewer than 10 units. From 2024 to 2025, the industry entered a trial production phase, with some leading companies beginning pilot deliveries of dozens to hundreds of units. After 2025, the industry is expected to enter the mass production phase.

Players in the robotics industry can be broadly divided into two development paths based on their business focus: hardware-oriented and software-oriented. Hardware-oriented companies take the robot body itself as their entry point, concentrating on in-house development of key components such as joint modules, motors, reducers, and controllers, with particular emphasis on breakthroughs in motion control algorithms. Motion control plays a role similar to that of the human cerebellum, and these companies' products are measured primarily by load capacity, speed, and motion performance. Examples include Boston Dynamics and Unitree Robotics.

Software-oriented companies focus on embodied intelligence technologies, taking cutting-edge vision-language models, world models, and simulated synthetic data as the starting points for their research and development. They typically build their robots by integrating components sourced from external suppliers and emphasize the robots' cognitive and decision-making intelligence. Galaxy General is one example. Meanwhile, automakers like Tesla, with their large-scale manufacturing capabilities, can draw on a deep hardware manufacturing background and software accumulated from autonomous driving to demonstrate full-stack capabilities in robotics, integrating hardware and software.

In the early stages of the technology's development, robots relied entirely on precise trajectory code written by engineers to perform tasks, making them essentially no different from traditional production equipment. The backflip demonstrated by Boston Dynamics' Atlas in 2017, for instance, was essentially the precise execution of a pre-programmed sequence.

Subsequently, robot learning entered a data-driven phase in which robots acquire skills through observation, imitation, and repeated trial and error. Further on, intelligent systems and autonomous learning became deeply integrated, enabling robots to understand abstract instructions, proactively attempt solutions in unfamiliar environments, and gradually evolve into autonomous agents capable of coping with complex realities. As a result, players around the world have been showcasing their strengths in algorithms.

After announcing the end of its collaboration with OpenAI in February 2025, leading international player Figure AI shifted its focus to developing its own end-to-end AI models. The company claims that its large model, Helix, has achieved significant technological breakthroughs and is the first VLA (Vision-Language-Action) model to adopt a dual-system approach. System 1 handles real-time action control, processing visual information at very high response speed; System 2 provides scene understanding and language parsing, and is responsible for interpreting complex instructions, identifying elements of the environment, and formulating action plans. The dual-system architecture also makes modular iteration easier, since each system can be optimized independently without overhauling the whole model.
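The division of labor between a slow planning system and a fast control system can be sketched in a few lines. The code below is a schematic illustration only and makes no claim about how Helix is actually implemented: `system2_plan` and `system1_act` are hypothetical stand-ins, the replanning interval of 10 ticks is arbitrary, and the "control law" is a toy proportional correction.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    goal: str
    made_at: int  # control tick at which the plan was produced


def system2_plan(instruction: str, scene: str, tick: int) -> Plan:
    """Slow path: stand-in for scene understanding and language parsing."""
    return Plan(goal=f"{instruction} (scene: {scene})", made_at=tick)


def system1_act(plan: Plan, observation: float) -> float:
    """Fast path: stand-in for reactive control, here a proportional correction."""
    return -0.5 * observation


def run(instruction: str, observations: list[float], replan_every: int = 10):
    """Run System 1 on every tick and refresh the System 2 plan every `replan_every` ticks."""
    plan = None
    log = []
    for tick, obs in enumerate(observations):
        if tick % replan_every == 0:
            plan = system2_plan(instruction, scene="kitchen", tick=tick)
        action = system1_act(plan, obs)
        log.append((tick, plan.made_at, action))
    return log


log = run("pick up the cup", [1.0] * 25, replan_every=10)
print(log[9], log[10])
```

Because the two functions only meet through the `Plan` object, either one could be swapped out without touching the other, which is the modularity argument made above.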

In September 2025, Chinese player Zhiyuan Robotics (AgiBot) announced that it was fully open-sourcing its general embodied base model, GO-1. The model adopts the ViLLA architecture, and the company describes it as the world's first open-source general embodied intelligence model to do so. ViLLA stands for Vision-Language-Latent-Action: by introducing latent action tokens, the architecture bridges the semantic gap between image and text input and the robot's final action execution, enabling the robot to understand human instructions more accurately and translate them into fine movements.

In addition, emerging players such as Physical Intelligence and Skild AI in the United States are exploring the cutting-edge field of world modeling, aiming to enable robots to build an internal physical world model so that they can predict the consequences of their actions.

03 Many challenges remain

The robotics industry has begun to build a systematic technology development framework.

Analogous to the L1 to L5 autonomous driving classification system, Zhiyuan Robotics has proposed a G1 to G5 embodied intelligence technology roadmap. According to Haike Finance, G1 is customized for specific scenarios and lacks cross-scenario transfer capabilities; G2 can understand multi-scenario tasks and achieve limited generalization by combining large language models; G3 shifts to end-to-end data-driven operation, achieving a paradigm shift at the architectural level; G4 introduces generalized operational models and simulation data, significantly improving performance in complex tasks; and G5, as a long-term goal, will achieve fully end-to-end autonomous operation from perception to execution.
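For readers who think in data structures, the roadmap can be written down as an ordered enumeration. The sketch below simply restates the five levels as summarized above; the names `EmbodiedLevel`, `DESCRIPTIONS`, and `describe` are illustrative and are not part of any published specification.

```python
from enum import IntEnum


class EmbodiedLevel(IntEnum):
    """G1 to G5 levels as summarized in the article (ordering is meaningful)."""
    G1 = 1
    G2 = 2
    G3 = 3
    G4 = 4
    G5 = 5


DESCRIPTIONS = {
    EmbodiedLevel.G1: "customized for specific scenarios; no cross-scenario transfer",
    EmbodiedLevel.G2: "multi-scenario task understanding; limited generalization via LLMs",
    EmbodiedLevel.G3: "end-to-end data-driven operation",
    EmbodiedLevel.G4: "generalized operational models plus simulation data",
    EmbodiedLevel.G5: "fully end-to-end autonomy from perception to execution (long-term goal)",
}


def describe(level: EmbodiedLevel) -> str:
    """One-line summary of a roadmap level."""
    return f"{level.name}: {DESCRIPTIONS[level]}"


for level in EmbodiedLevel:
    print(describe(level))
```

Using an integer-backed enumeration keeps the levels comparable, so a check such as `EmbodiedLevel.G3 > EmbodiedLevel.G2` reads naturally.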

Generalization is a crucial challenge as robotics technology advances to higher levels.

Generalization refers to a robot's ability to flexibly perform multiple tasks in different scenarios without being retrained or adjusted for each new one. Currently, skills that robots master in specific environments are difficult to transfer to new scenarios, tasks, or objects. A robot might grasp a cup accurately in a laboratory environment, yet perform significantly worse or fail completely if the cup is swapped for one of a different shape or the lighting conditions change.

The root of this challenge lies in the infinite complexity of the real world. The real environment is open and dynamically changing; the combinations of object shapes, materials, placement angles, and factors such as lighting and background interference are virtually endless. Humans cannot pre-program for all possibilities, nor can they collect training data covering all edge cases. Faced with a highly reflective tabletop, a suddenly appearing pet, or an oddly shaped everyday object, the accuracy of a robot's operation will decrease significantly. This means that true autonomous intelligence is still a long way off.

The Beijing Yizhuang Robot Marathon, ridiculed by many netizens, is a case in point. In April 2025, this humanoid robot marathon attracted over 300 well-known robotics and intelligent manufacturing companies. Participating robots required engineering teams to accompany them throughout the event, responsible for changing batteries and handling unexpected situations such as loss of balance. Videos of the competition showed some robots tripping and falling on flat ground; some wobbling and unable to walk in a straight line; and some even losing their heads.

Videos of 1X's NEO robot show the current state of so-called home robots. NEO took over a minute to complete a basic task like fetching water from a refrigerator 3 meters away, something a human could do in a few seconds. Nor can it yet understand and carry out a complex task from a natural-language command such as "Please tidy my room." 1X's frank acknowledgment that a remote operator was involved in the demonstration indicates that current robots have not yet overcome the core technological bottleneck of coping autonomously with open environments.

The acquisition and use of robot training data is also a key issue that urgently needs to be addressed.

If training data is generated in a virtual simulation environment, the robot must confront the sim-to-real gap. The physical parameters, sensor noise, and environmental interactions of the virtual world cannot fully replicate the complexity of real scenes, so algorithms that perform well in simulation suffer significant performance degradation when transferred to physical robots.

Relying entirely on data collected in real-world environments, on the other hand, is held back by high time costs and hardware wear and tear, which hinder large-scale deployment. Tesla, for one, chose to bring in its Dojo training center, allowing the Optimus humanoid robot team to abandon traditional motion capture in favor of learning purely from video. With this method, robots watch video recordings of humans performing tasks, extract behavioral patterns on their own, and generate operational strategies.

Despite the numerous challenges still facing robotics technology, embodied intelligence, as a core direction at the forefront of science and technology, is experiencing an unstoppable surge in development. In this emerging field, domestic companies have actively deployed their resources and made significant progress, demonstrating a rapid pace of advancement.

Policymakers have also sent clear signals of support. In March 2025, the State Council's Government Work Report explicitly stated that the country will prioritize cultivating and expanding emerging and future industries. For the first time, the report listed embodied intelligence alongside bio-manufacturing, quantum technology, and 6G within the scope of future industries, elevating its development to the level of national strategy. Beijing, Hangzhou, and other cities have also released targeted policy documents on embodied intelligence and the robotics industry, aiming to accelerate technological breakthroughs and industrial clustering.

The evolution of robotics today closely resembles the development path of the smartphone industry. In the early stages, each manufacturer worked independently, exploring different technical routes and gradually moving toward key breakthroughs. Just as the "iPhone moment" redefined the form and ecosystem of mobile devices, the robotics field will reach its own tipping point: when a technology or product delivers an experience that exceeds user expectations, it will rapidly drive the unification of industry standards and the formation of an ecosystem.

This breakthrough will not be merely an improvement in technical parameters, but a fundamental transformation of the user experience. The closest parallel in AI is the emergence of ChatGPT, which moved AI out of the laboratory and into everyday life. Mass production is only the first step in a long journey, but the acceleration of the technology is already evident, and the day when the intelligent robots of science fiction enter ordinary households may not be far off.

This article is from the WeChat official account "Haike Finance", author: Xu Junhao, and is published with authorization from 36Kr.
