Surging Orders Propel Humanoid Robot Stocks: Is a Breakthrough Imminent? Insights from Expert Yan Weixin on the Road Ahead

Concept stocks related to humanoid robots are surging! Are humanoid robots about to experience a breakthrough? Yan Weixin states that achieving large-scale deployment is still a long way off.

As of October 4, 2025, large orders are being signed and billion-yuan funds are accelerating their entry into the market. A steady stream of financing rounds and IPOs is sending related concept stocks soaring, and the humanoid robot industry as a whole is drawing unprecedented attention.

On September 29, UBTECH signed another large order for humanoid robots worth 30 million yuan, bringing the total order amount close to 430 million yuan. Earlier, on July 11, the humanoid biped robot outsourcing service procurement project by China Mobile’s subsidiary, China Mobile (Hangzhou) Information Technology Co., Ltd., garnered attention with a budget of 124 million yuan, marking the largest publicly tendered order in China to date.

Enthusiasm in the capital market and across the industry chain is mutually reinforcing. Genuine large-scale deployment, however, still faces numerous challenges. Yan Weixin, a doctoral advisor at Shanghai Jiao Tong University and chief scientist at the Shanghai Artificial Intelligence Research Institute, said candidly in an interview that leading humanoid robot companies may deliver batches of hundreds or thousands of units in 2025, mainly for education, interactive services, and data collection, but that true large-scale mass production remains a distant goal.

Yan Weixin has extensive academic and practical experience in the humanoid robot field, having led and participated in several national-level major projects and received numerous domestic and international research awards. He acknowledged that the complexity of humanoid robots far exceeds that of any previous intelligent devices. These robots require the integration of mechanical design, sensor technology, power systems, control algorithms, and artificial intelligence.

However, currently, the hardware interfaces of different companies are incompatible, their software platforms operate independently, and data formats vary widely. This not only leads to significant redundancy and resource waste but also greatly increases the costs of system integration and industrial collaboration, thereby slowing down technological innovation and product iteration.

In Yan Weixin’s view, humanoid robots are a guiding light for the future and will drive the development of multiple industries. “We need to innovate, implement, and launch simultaneously. The key technologies developed during this process can certainly be transferred to other industries,” he said.

Need for Massive Data

Interviewer: The biggest problem facing embodied intelligence training is the lack of real data. Where do you think the breakthrough lies? Is it more dependent on collecting data from physical environments or compensating through virtual simulations and world models?

Yan Weixin: The data issue of embodied intelligence is indeed one of the biggest bottlenecks. Currently, the entire industry is severely lacking in data, with only a few million interaction data points available, while the actual need may be in the tens of millions or even billions.

Unlike large language models (LLMs), whose training data comes primarily from online text, books, and images accumulated over decades, robots require “dynamic interaction data.” This includes the force feedback when fingers grasp an object and the fine adjustments the body makes while walking. Such data is not only scarce but also difficult to define. What exactly should be collected? Is it the robot’s movement trajectory, human operational actions, or variations in vision and force? Currently, there is no unified standard in the industry, which leads to fragmentation and a lack of data interoperability.

Moreover, different robot configurations exhibit significant differences in parameters and movement methods, making it challenging to reuse general datasets. Real data inherently contains sampling bias and may not cover all possible scenarios. Additionally, data formats are not standardized, and data collected by different companies cannot be shared, leading to redundant efforts.
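To make the interoperability problem concrete, here is a minimal sketch of what a shared interaction record could look like. It is purely illustrative: the class name `InteractionSample` and every field in it are assumptions made for this example, not an existing industry format or anything proposed in the interview.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class InteractionSample:
    """One hypothetical time step of robot interaction data (all fields assumed)."""
    timestamp_s: float                # time since episode start, in seconds
    robot_model: str                  # configurations differ, so the model is recorded
    source: str                       # "teleoperation", "human_demo", or "simulation"
    joint_positions_rad: List[float]  # movement trajectory
    fingertip_forces_n: List[float]   # force feedback while grasping
    camera_frame_id: str              # reference to the matching camera frame

    def to_json(self) -> str:
        # A common serialized form is what would let companies exchange data.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "InteractionSample":
        return cls(**json.loads(text))


sample = InteractionSample(0.02, "biped-demo", "teleoperation",
                           [0.1, -0.4, 1.2], [0.8, 0.7], "frame_000001")
restored = InteractionSample.from_json(sample.to_json())
```

Recording `robot_model` and `source` alongside the signals is one way a reader of the data could tell which samples are reusable for a different body or need remapping.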

In terms of data collection methods, teleoperation is currently popular, but data quality varies. Some teams are trying to collect human operational data and map it onto robots, which is a valid direction, but the challenge lies in accurately reproducing force: how much effort does a human use to lift a cup, and how can a robot replicate that precisely? This is a core problem.

I believe that simulation data provides a potential solution, but it also has significant limitations. No matter how advanced the physics engine becomes, it cannot fully replicate the complexities of the real world, like intricate friction, material deformation, light scattering, and sensor noise, as well as unpredictable human behavior.

Integrating real data and simulated data is currently a viable way forward. The key lies in breakthroughs in new data collection technologies, which can change the cost structure and efficiency of real data acquisition. The industry is converging on the view that there is no one-size-fits-all ratio of real to simulated data; the mix needs to be adjusted flexibly based on specific application scenarios and requirements.
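As an illustration of treating the real-to-simulated ratio as a tunable setting rather than a fixed constant, the following sketch draws one training batch with a chosen share of real samples. The function name `mix_batch` and its parameters are hypothetical and chosen only for this example.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def mix_batch(real: Sequence[T], sim: Sequence[T], batch_size: int,
              real_fraction: float, rng: random.Random) -> List[T]:
    """Draw one batch in which about `real_fraction` of the samples are real."""
    if not 0.0 <= real_fraction <= 1.0:
        raise ValueError("real_fraction must be between 0 and 1")
    n_real = round(batch_size * real_fraction)
    n_sim = batch_size - n_real
    # Sample with replacement so a small real dataset can still fill its share.
    batch = rng.choices(real, k=n_real) + rng.choices(sim, k=n_sim)
    rng.shuffle(batch)
    return batch


rng = random.Random(0)
real_data = [("real", i) for i in range(3)]    # scarce real interactions
sim_data = [("sim", i) for i in range(50)]     # plentiful simulated ones
batch = mix_batch(real_data, sim_data, batch_size=8, real_fraction=0.25, rng=rng)
```

Because `real_fraction` is an argument, the same pipeline can use a different mix for each application scenario.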

Interviewer: There are many startups in the humanoid robot space, with the underlying hardware and software systems being quite fragmented. Do you think it is necessary to promote unified standards? Are there any related attempts in the industry?

Yan Weixin: While humanoid robot technology is evolving rapidly, premature or excessive standardization could introduce a series of risks. The most significant is technology lock-in: once a certain technology is established as a standard, it becomes difficult to replace it with better solutions that emerge later, especially when many foundational technologies are still iterating quickly.

In this situation, a tiered and categorized standardization strategy becomes a balanced approach. This strategy adopts different standardization rhythms and methods according to varying levels of technological maturity and application areas. For relatively mature fields, standards can be actively promoted, especially in areas such as data formats, communication protocols, and safety requirements for humanoid robots. For core areas where technology is still rapidly evolving, a more flexible standard strategy should be adopted. Initial releases of technical guidelines or best practices can provide industry references without mandating uniformity.

Interviewer: Both world models and VLA models are considered key technologies. Which route do you think has better prospects? Can they complement each other?

Yan Weixin: Artificial intelligence is undergoing a significant shift from perceptual intelligence to decision-making intelligence, with world models and vision-language-action (VLA) models being two focal technological pathways.

The world model is based on visual and motion data and uses generative modeling techniques to predict environmental changes and behavior outcomes. It possesses strong temporal and spatial predictive capabilities, accurately forecasting changes in the environment and vehicle movements. The world model excels in constructing challenging scenarios, managing rare yet critical extreme situations, such as emergency obstacle avoidance or driving in extreme weather conditions. While its response speed is extremely fast, it also faces challenges, such as high computational demands, with hardware costs exceeding those of VLA models by over 40%.

The VLA model merges visual input with natural language commands to directly generate executable physical actions. It abstracts and categorizes concrete scenarios and images through language, rather than merely memorizing previously observed data, thereby enhancing its generalization abilities.

Though the world model and VLA model have different technological pathways, they exhibit significant complementary potential. The world model excels in predicting environmental dynamics and understanding physical laws, while the VLA model shines in multimodal integration and semantic reasoning. The combination of the two can create a more robust and comprehensive intelligent system.

The fusion of world models and VLA models centers around “scene-specific tailoring + functional complementarity.” First, rather than creating an all-encompassing “large and comprehensive” world model, we should focus on developing “model packages” tailored to specific application scenarios. For instance, in an automobile assembly context, only relevant physics engine modules—such as those for “screws, wrenches, and car bodies”—should be retained, while irrelevant modules like “cloth simulation” should be eliminated, reducing computational demands by 70%.

Second, the world model can handle “predictions,” while the VLA model can manage “operations.” For instance, when a robot needs to screw in a bolt, the world model first predicts the “torque and angle needed,” and then the VLA model locates the bolt’s position based on visual imagery. This collaborative approach ensures accuracy in operations while reducing computational costs.
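The division of labor in the bolt example can be sketched as a small pipeline. Everything here is a placeholder: the stub classes, method names, torque and angle values, and coordinates are assumptions made up for illustration, not real models or real tightening specifications.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class BoltPlan:
    torque_nm: float
    angle_deg: float


@dataclass
class BoltAction:
    position_xyz: Tuple[float, float, float]
    torque_nm: float
    angle_deg: float


class StubWorldModel:
    """Stands in for the 'prediction' side; returns made-up illustrative values."""

    def predict_bolt_plan(self, bolt_type: str) -> BoltPlan:
        table = {"M6": BoltPlan(10.0, 90.0), "M8": BoltPlan(25.0, 120.0)}
        return table[bolt_type]


class StubVLA:
    """Stands in for the 'operation' side; pretends to locate the bolt in an image."""

    def locate(self, image: bytes, instruction: str) -> Tuple[float, float, float]:
        return (0.42, -0.10, 0.85)  # fixed placeholder position in metres


def tighten_bolt(world_model: StubWorldModel, vla: StubVLA,
                 image: bytes, bolt_type: str) -> BoltAction:
    plan = world_model.predict_bolt_plan(bolt_type)               # step 1: predict
    position = vla.locate(image, f"find the {bolt_type} bolt")    # step 2: locate
    return BoltAction(position, plan.torque_nm, plan.angle_deg)


action = tighten_bolt(StubWorldModel(), StubVLA(), image=b"", bolt_type="M8")
```

Keeping the two roles behind separate interfaces means either side could be swapped for a scene-specific model without touching the other.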

Interviewer: It is often said that humanoid robots must complete feedback within 100-300 milliseconds, but large model inference delays are often in the second range. Do you think solving the delay problem is more likely to come from optimizing computational architecture or from model-side improvements?

Yan Weixin: I believe the delay issue in humanoid robots stems from a complex technological chain: environmental perception, data processing, decision-making, and motion control. Each segment can contribute to delays, and large model inference is just one part of the chain, albeit currently the most prominent bottleneck.
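A simple way to reason about this chain is as a latency budget: add up the stages and compare the total with the response window. In the sketch below the per-stage figures are invented for illustration; only the four stage names and the 300-millisecond upper bound come from the discussion above.

```python
from typing import Dict, Tuple


def check_latency_budget(stage_ms: Dict[str, float],
                         budget_ms: float) -> Tuple[float, bool, str]:
    """Return (total latency, whether it fits the budget, slowest stage)."""
    total = sum(stage_ms.values())
    bottleneck = max(stage_ms, key=lambda name: stage_ms[name])
    return total, total <= budget_ms, bottleneck


# Assumed example numbers, not measurements.
stages = {
    "environmental_perception": 30.0,
    "data_processing": 20.0,
    "decision_making": 1200.0,   # large model inference in the second range
    "motion_control": 5.0,
}
total, fits, bottleneck = check_latency_budget(stages, budget_ms=300.0)
print(f"total={total} ms, fits budget={fits}, bottleneck={bottleneck}")
```

With numbers like these, shaving the smaller stages cannot close the gap on its own, which is why inference stands out as the most prominent bottleneck even though every stage counts toward the total.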

Currently, cooperative computing based on “cloud-edge-end” frameworks is emerging as the solution for real-time response. Future AI systems will not be purely edge-based or cloud-based but will function as a layered, collaborative, and dynamically optimized intelligent framework. The cloud will handle the training of complex large-scale models, massive data integration, model version management, and distribution. Edge nodes will serve as regional hubs, processing data aggregated from multiple end devices and running models that are larger and more capable than those on the end devices but lighter and faster to respond than those in the cloud. The end devices will focus on ultra-low-latency real-time inference and tasks with high privacy requirements.

This trend will lead to the emergence of large models on the edge, deploying trimmed and optimized models directly at the terminal. This will enable completely offline intelligent control, interactive dialogue, text summarization, and content generation, achieving excellent privacy and instant response.
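The layered idea can be sketched as a routing rule that sends each task to the most capable tier that still meets its deadline, and keeps private tasks on the device. The tier latencies, capability ranks, and the `choose_tier` function are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Tier:
    name: str
    typical_latency_ms: float  # assumed round-trip figure, not a measurement
    capability: int            # relative model size: higher means larger model


TIERS: List[Tier] = [
    Tier("end", 20.0, 1),
    Tier("edge", 80.0, 2),
    Tier("cloud", 1000.0, 3),
]


def choose_tier(deadline_ms: float, private: bool,
                min_capability: int = 1) -> Optional[Tier]:
    """Pick the most capable tier that meets the deadline; None if none can."""
    candidates = [t for t in TIERS
                  if t.typical_latency_ms <= deadline_ms
                  and t.capability >= min_capability]
    if private:
        # Privacy-sensitive work never leaves the end device.
        candidates = [t for t in candidates if t.name == "end"]
    if not candidates:
        return None
    return max(candidates, key=lambda t: t.capability)


reflex = choose_tier(deadline_ms=100.0, private=False)
dialogue = choose_tier(deadline_ms=100.0, private=True)
```

Returning `None` makes the failure case explicit: a task that needs a cloud-sized model inside a reflex-sized deadline has no valid tier under these assumptions.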

Lack of Commercial Appeal

Interviewer: For robots to truly enter industrial and service scenarios on a large scale, what key elements do you think are still missing in the “brain-cerebellum” collaborative system?

Yan Weixin: The coordination between the “brain” for decision-making and the “cerebellum” for control determines whether robots can efficiently and reliably complete tasks in complex and uncertain environments.

I believe the first missing piece for widespread industrial and service application of robots is a unified world model and physical reasoning. The human brain can construct a consistent and continuously updated mental model of the environment and use it for physical reasoning to predict the consequences of actions. However, existing robotic systems often lack this capability, leading to poor performance in new scenarios or tasks requiring physical intuition.

Secondly, adaptive motion planning and control is a core function of the cerebellum, but it has not yet achieved true adaptability. The human cerebellum can automatically adjust control strategies based on task demands, environmental changes, and physical states, achieving smooth transitions from gross to fine motor actions. Existing robotic systems often struggle to balance precision, speed, and robustness, making it difficult to adapt to dynamically changing environments.

Furthermore, understanding human intent and multimodal interaction is crucial to natural interaction between robots and humans, especially in service scenarios, yet current systems still fall short on both. The human brain can infer others’ intentions from vague commands, gestures, eye contact, and even context, while existing robotic systems often require explicit, structured commands. The ability to comprehend non-verbal instructions is a critical missing link.

Finally, energy efficiency and real-time performance are essential. The human brain operates at approximately 20 watts yet can perform complex cognitive and motor control functions, whereas existing robotic systems often require high energy consumption and computational resources to accomplish relatively simple tasks. Optimizing the allocation of computational resources is a key challenge.

Interviewer: Will insufficient battery life become a major bottleneck for the commercialization of humanoid robots? What current explorations are underway to enhance battery life and reduce overall energy consumption in the industry?

Yan Weixin: Currently, most humanoid robots can only operate for 1-2 hours on a single charge, while actual industrial applications typically require a minimum of 4-8 hours of continuous operation. More critically, humanoid robots can demand instantaneous power as high as 30 kW for high-load tasks, placing immense demands on battery discharge capabilities. The gap between energy demand and supply directly affects the practicality and economic viability of humanoid robots.
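A back-of-the-envelope calculation shows how the two demands pull apart. The 0.5 kW average draw below is an assumed figure chosen only to make the arithmetic concrete; the 30 kW peak and the hour figures come from the text. The discharge rate is approximated as peak power divided by pack energy.

```python
def required_energy_kwh(average_power_kw: float, hours: float) -> float:
    """Pack energy needed to sustain an average draw for a given duration."""
    return average_power_kw * hours


def discharge_c_rate(peak_power_kw: float, pack_energy_kwh: float) -> float:
    """Approximate C-rate: peak power relative to pack energy (units of 1/h)."""
    if pack_energy_kwh <= 0:
        raise ValueError("pack_energy_kwh must be positive")
    return peak_power_kw / pack_energy_kwh


ASSUMED_AVERAGE_POWER_KW = 0.5   # hypothetical average draw
PEAK_POWER_KW = 30.0             # instantaneous peak cited above

for hours in (2, 8):
    pack = required_energy_kwh(ASSUMED_AVERAGE_POWER_KW, hours)
    c_rate = discharge_c_rate(PEAK_POWER_KW, pack)
    print(f"{hours} h of runtime -> {pack:.1f} kWh pack, peak draw about {c_rate:.1f}C")
```

Under these assumed numbers, quadrupling runtime means quadrupling stored energy, while a pack sized for only two hours would have to survive a peak of roughly 30 times its one-hour discharge rate.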

The core issue is solving the contradiction between “high power density” and “high energy density”: high power density requires instantaneous burst power (like bipedal jumping), while high energy density necessitates long endurance (such as 8 hours of continuous work). Existing lithium iron phosphate and ternary lithium batteries cannot meet both requirements simultaneously. The future direction is “heterogeneous battery systems,” which involve pairing different types of batteries and utilizing battery management systems (BMS) for intelligent switching.
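One very simplified way to picture a heterogeneous system is a rule that lets an energy-dense pack carry the steady load and calls on a power-dense pack only for bursts. The function and the pack limits below are hypothetical; a real BMS would also account for state of charge, temperature, and cell aging.

```python
from typing import Tuple


def split_load(demand_kw: float, energy_pack_max_kw: float,
               power_pack_max_kw: float) -> Tuple[float, float, float]:
    """Return (kW from energy pack, kW from power pack, unmet kW)."""
    if demand_kw < 0:
        raise ValueError("demand_kw must be non-negative")
    # The energy-dense pack serves as much of the load as it safely can.
    from_energy = min(demand_kw, energy_pack_max_kw)
    # The power-dense pack covers the burst above that limit.
    from_power = min(demand_kw - from_energy, power_pack_max_kw)
    unmet = demand_kw - from_energy - from_power
    return from_energy, from_power, unmet


# Assumed limits: 3 kW continuous from the energy pack, 30 kW burst from the power pack.
walking = split_load(1.0, energy_pack_max_kw=3.0, power_pack_max_kw=30.0)
jumping = split_load(30.0, energy_pack_max_kw=3.0, power_pack_max_kw=30.0)
```

In this sketch the steady task never touches the power pack, while the jump draws most of its power from it.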

Interviewer: Currently, humanoid robots are mostly used in performance and guiding scenarios. What key pieces are still needed for them to transition to large-scale industrial applications?

Yan Weixin: In my view, for humanoid robots to achieve genuine industrial application, they must overcome multiple obstacles related to technology, cost, ecosystem, and policy.

Technologically, many of the dance moves demonstrated are pre-set and trained in advance, lacking true decision-making capabilities in real scenarios. This “pseudo-intelligence” dilemma severely limits the applicability of robots in complex industrial environments. Humanoid robots need to manage highly complex and dynamically changing scenarios, yet existing systems often require data to be re-collected and retrained, a process that can take days and fails to meet the high real-time demands of production environments.

Regarding cost and commercialization barriers, the current unit cost of high-end humanoid robots ranges from 200,000 to 400,000 yuan, and payback periods stretch from 15 to 30 months, which gives them little commercial appeal. The lack of a comprehensive testing and validation system is another obstacle to industrialization: industrial applications demand high reliability and safety, yet there are no authoritative testing platforms or evaluation standards to validate robot performance in various scenarios.
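The payback figures can be related to unit cost with simple undiscounted arithmetic. The sketch below ignores financing, maintenance, and downtime, and the pairing of costs with payback periods is an assumption, since the text gives only the two ranges.

```python
def payback_months(unit_cost_yuan: float, monthly_net_saving_yuan: float) -> float:
    """Months needed for cumulative savings to cover the purchase price."""
    if monthly_net_saving_yuan <= 0:
        raise ValueError("monthly_net_saving_yuan must be positive")
    return unit_cost_yuan / monthly_net_saving_yuan


def implied_monthly_saving(unit_cost_yuan: float, months: float) -> float:
    """Monthly net saving implied by a given cost and payback period."""
    if months <= 0:
        raise ValueError("months must be positive")
    return unit_cost_yuan / months


# Assumed pairings of the ranges quoted above.
for cost, months in ((200_000, 15), (400_000, 30), (400_000, 15), (200_000, 30)):
    saving = implied_monthly_saving(cost, months)
    print(f"{cost} yuan over {months} months -> about {saving:,.0f} yuan saved per month")
```

Seen this way, the question for a buyer is whether one robot reliably displaces that much cost every month over the whole period.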

Original article by NenPower. If reposted, please credit the source: https://nenpower.com/blog/surging-orders-propel-humanoid-robot-stocks-is-a-breakthrough-imminent-insights-from-expert-yan-weixin-on-the-road-ahead/
