Humanoid Robots Take Center Stage Again at Spring Festival Gala: Investors Demand Practical Applications Beyond Dance

Humanoid Robots Make a Splash at the Spring Festival Gala! Investors: “We Won’t Invest in Just Dance.” Three Competing Technical Paths for Humanoid Robots to Achieve Real-World Applications.

One of the highlights of this year’s Spring Festival Gala is the appearance of humanoid robots. Last year, a humanoid robot from Yushujia performed a traditional dance on stage, sparking significant interest in the humanoid robotics industry. According to projections, domestic humanoid robot shipments are expected to have grown by more than 650% in 2025. But can robots that only dance still win recognition from investors and the industry?

In recent interviews, Daily Economic News reporters spoke with investors and key product leaders in the humanoid robotics sector. The industry’s focus has shifted towards practical applications. One investor stated candidly, “Robots that only dance won’t sell; they need to be integrated into real scenarios to survive.” The investor was also skeptical of small companies that merely seek valuations and funding without substantive products.

The transition from “performing robots” to “practical laborers” raises a question: can the technical capabilities of these robots support such a shift? The interviewees’ discussions addressed the broader landscape of intelligent robotics rather than specific humanoid designs. Competition among three technical paths for robot development has reached a fever pitch. Can the “general intelligence” VLA models backed by humanoid robot startup Figure AI and by ZhiYuan handle factory assembly lines? How will the “world model” advocated by Tesla use simulation data to lower costs? And how does Boston Dynamics’ layered decision-making ensure accuracy over long-term operation?

Key factors such as endurance, stability, and cost are now critical challenges facing these three technical paths as they approach mass production—robots must learn to “work.” An investor shared their perspective: “We don’t engage with companies that simply ask for valuations and funding without a solid product. Companies that are cobbled together by just a handful of people often fail quickly.” The industry has moved past the phase where “robots that can dance sell well.”

Qiu Dicong, founder of Yakobi Robotics, remarked, “Regardless of how advanced the technology or how good the design is, it ultimately needs to result in a tradable product to create economic value.” The industry previously focused heavily on comparing various technical routes, believing that having a technical advantage could dominate the market. “But in the end, technology is just part of the equation. Sometimes, it isn’t even the most important part in the later stages of development.” Despite his background in AI robotics research, he believes that technological advancement alone does not guarantee commercial success. “In the coming period, the competition in embodied intelligence will focus on application—implementation, implementation, and implementation.” The pressing question remains: “How can we ensure that there are products with sufficient capability that are recognized by customers and can be scaled for sales?”

Qiu emphasized that regardless of the size of funding, the focus must ultimately return to the essence of business, which is the sales figures. “Otherwise, if the valuation is too high compared to sales, it will lead to a situation where a lot of funds are raised with minimal business.” Which type of robot will truly “survive” and expand its application scenarios? The three technical paths that different enterprises are betting on are starting to provide distinctly different answers.

On February 3, Tian Feng, head of the Fast and Slow Thinking Research Institute, stated: “By 2026, the bottleneck for intelligent robots in long-term operations will shift from ‘can they move?’ to ‘how long can they work?’ and ‘are they stable enough?’” He noted that 2026 is widely regarded as a critical year for robots to enter service industry applications. The current technical emphasis is shifting from motion control to enhancing the “robot brain” to improve understanding and execution in complex environments and tasks.

Dr. Lv Tong, product director at the Chengdu Humanoid Robot Innovation Center, mentioned that the industry needs to rethink how to ensure robots can execute tasks like screwing, packaging, and transporting with precision and stability similar to humans as they transition from “laboratory” to “production line.” The challenges of this transformation are particularly severe, with endurance, real-time response capabilities, and maintenance costs becoming the litmus test for all technical paths. Under these threefold challenges, the seemingly diverse technical routes are beginning to reveal their respective strengths and weaknesses.

The three technical paths are progressing simultaneously, each with its own advantages and bottlenecks

The first path is the VLA (Vision-Language-Action) model route, which aims for “general intelligence.” This approach seeks to let robots perceive through vision, understand language, and ultimately execute actions as humans do. Companies such as Figure AI and ZhiYuan Robotics are betting on this path. Tian summarized the core characteristics of this route: “It relies on massive data training to cope with unknown environments and tasks, aiming for an end-to-end single model.” Its advantage is strong semantic understanding: it can interpret vague commands such as “clean the table.” However, Tian also pointed out its drawbacks: “The computational cost of end-to-end models is high, requiring robust hardware for endurance and heat dissipation.”

Since last year, companies such as ZhiYuan Robotics and UBTech have showcased humanoid robots capable of performing operational tasks such as screwing. At this year’s CES, some non-humanoid robot companies also entered this space, such as SuTeng JuChuang, which demonstrated a highly stable robotic operating system.

On February 3, an AI expert from SuTeng JuChuang explained that “VLA is a technology paradigm that utilizes the emergent capabilities of large language models to achieve operational intelligence.” However, he noted the hidden challenges of this route: “Simply providing a robot with an image does not allow it to assess how far an object is from its mechanical arm. The output of VLA is a series of real-number coordinates and orientations in 3D space, which means an end-to-end VLA still needs to implicitly devote a large number of parameters to solving spatial perception.” Additionally, in the “final centimeter” as the robot’s hand approaches the object, much of the contact surface is obscured by the dexterous hand itself, which makes tactile and force feedback important.

SuTeng JuChuang’s solutions focus on two main aspects. First, they combine 3D point cloud data and tactile information with traditional vision-only VLA. “By effectively utilizing the point cloud, we significantly reduce our data requirements because this method bypasses the reliance on massive data for implicit learning of spatial perception.” Second, they treat touch as another input modality for VLA. The expert emphasized that tactile technology still faces three major industry challenges: first, at the hardware level, high-quality tactile sensors with a high signal-to-noise ratio are still scarce; second, at the algorithm level, there is no mature method to use tactile data efficiently; and third, on the data side, there is a lack of large-scale public or private tactile datasets.
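To illustrate why point cloud data can sidestep implicit learning of spatial perception, here is a minimal Python sketch. It is not SuTeng JuChuang’s method: the function name `nearest_point_distance` and the sample coordinates are hypothetical, chosen only to show that with 3D points the gripper-to-object distance is an explicit computation rather than something a model must infer from pixels.

```python
import math


def nearest_point_distance(gripper_xyz, object_points):
    """Return the Euclidean distance from the gripper to the closest cloud point.

    Units follow the inputs (meters in this example).
    """
    if not object_points:
        raise ValueError("object_points must contain at least one point")
    return min(math.dist(gripper_xyz, point) for point in object_points)


# Hypothetical data: a gripper at the origin and three points sampled
# from an object's surface, all in meters.
gripper = (0.0, 0.0, 0.0)
cloud = [(0.3, 0.4, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 2.0)]

distance_m = nearest_point_distance(gripper, cloud)
print(f"nearest point is {distance_m:.2f} m away")
```

A real system would work on thousands of points and would need to segment the object from the scene first; the sketch only shows the geometric step.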

The second path is the world model route, with Tesla being a representative. This route focuses on constructing a “digital world” by building a physical world simulator within the AI system, enabling robots to predict the consequences of their actions. Tian summarizes it as “instilling robots with an intuitive understanding of physical laws, allowing them to reason and plan to predict the outcomes of their actions.” This path heavily relies on high-quality simulation data, but once the simulator is established, it significantly reduces the dependency on expensive real-world data.

The third path is layered decision-making and hardware-software synergy, representing a pragmatic approach, with Boston Dynamics and ZhiYuan Robotics as key players. This method breaks down complex tasks, with large models responsible for semantic understanding and subtask decomposition, while traditional algorithms handle positioning, navigation, and precision control functions. Tian points out that the modular architecture’s advantage lies in easy fault isolation, decoupling complex reasoning tasks from high-frequency real-time control, which ensures the response speed of control loops and proves beneficial in real-world assembly lines.
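The split described above, with a slow semantic layer decomposing tasks while a fast loop handles control, can be sketched in a few lines of Python. This is not any company’s actual architecture: `plan`, `control_step`, `run`, the lookup table, and the one-dimensional proportional controller are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Subtask:
    name: str
    target: float  # hypothetical 1-D joint target, arbitrary units


def plan(instruction: str) -> list[Subtask]:
    """Slow layer: stand-in for a large model that decomposes an instruction.

    A fixed lookup table replaces the model in this sketch.
    """
    table = {
        "tighten screw": [Subtask("approach", 1.0), Subtask("rotate", 3.0)],
    }
    if instruction not in table:
        raise KeyError(f"no plan for instruction: {instruction!r}")
    return table[instruction]


def control_step(position: float, target: float, gain: float = 0.5) -> float:
    """Fast layer: one tick of a simple proportional controller."""
    return position + gain * (target - position)


def run(instruction: str, ticks_per_subtask: int = 20, tolerance: float = 1e-3):
    """Call the planner once, then run many control ticks per subtask."""
    position = 0.0
    log = []
    for subtask in plan(instruction):
        for _ in range(ticks_per_subtask):
            position = control_step(position, subtask.target)
        reached = abs(position - subtask.target) < tolerance
        log.append((subtask.name, reached))
    return position, log


final_position, log = run("tighten screw")
print(log)
```

Because the planner is called once per instruction and the controller many times per subtask, a planner failure surfaces as a single exception without touching the control loop, which is the kind of fault isolation Tian describes.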

However, Lv believes that the various technical routes are not mutually exclusive. Layered structures, 3D scene mapping, and world models are advancing concurrently. He argues that VLA end-to-end models and world models are not opposing forces: “They need to develop in synergy.” The robotics field is fundamentally a systems engineering challenge. The choice of technology must consider deployment environments, network conditions, computational support, and more, and cannot detach performance discussions from practical conditions.

Generalization and stability are core goals, while hardware form factors are not the primary focus

Different companies provide diverse technical solutions based on their unique identities, but all routes must confront a central challenge: improving robots’ “generalization ability” so they can adapt to various scenarios. Qiu explained that the core pursuit of robot control is solving the generalization problem. The earliest methods relied on model predictive control, which let robots move beyond fixed trajectories. Like solving an equation such as X + 1 = Y, this method dynamically maps environmental perception (X) to actions (Y), so the robot can adapt to changes within a predetermined range. Its limitation is that it cannot cope with unforeseen situations.

To overcome this limitation, the VLA model was developed, aiming to let robots interpret natural language commands (e.g., “put the apple on the shelf”) and complete tasks autonomously using visual perception. VLA models are typically built on large vision-language foundation models and trained with human operational data, which gives them robust understanding and generalization capabilities, but they also face the challenges of expensive data, high computational consumption, and slow execution.

The current technical routes fall mainly into two types: model-driven methods (such as model predictive control, which is stable but limited in generalization) and data-driven methods (including reinforcement learning and imitation learning). The VLA model can be seen as a combination of the latter two (reinforcement learning and imitation learning) and represents a significant direction towards general-purpose robots.

The aforementioned expert from SuTeng JuChuang stated, “The essence of generalization is interpolation.” As long as the model is exposed to a sufficiently diverse range of scenarios, such as dim or bright lighting, tables of varying heights, and different distances, it can make reasonable judgments in unknown situations. However, this is not enough; “data must be sufficiently clean. The cleaner the dataset, the easier it is for the model to achieve generalization.” He candidly noted that both the autonomous driving and robotics fields suffer from “dirty data,” which can severely undermine a model’s generalization capabilities. Data diversity and data cleanliness are two different matters, and confusing them is a common pitfall for practitioners. He also emphasized that raising the “lower limit” of AI operating systems is technically more challenging and holds greater industry value than showcasing their “upper limit.” “Even if you allow the model to attempt 100 times, only a few will showcase peak performance; but raising the lower limit means enabling the robot to work continuously for 10 hours in a factory without error, which creates true value.”

Lv mentioned that industry demand is shifting from sheer data volume to “data diversification” and easier collection methods, such as video-based capture. Additionally, the industry is exploring how to incorporate the physical and natural knowledge accumulated by human society into world models, which may become a key focus in the future.

Beyond data, computational deployment is also a critical issue. The industry widely believes that high-frequency local inference is essential for ensuring the stability of robots. If a system can achieve an inference frequency of 10 Hz, it can respond to minute disturbances within 0.1 seconds. “If the system’s reasoning frequency is only 2 to 3 Hz, it will take 0.4 to 0.5 seconds to wait, along with execution delays and reasoning desynchronization, significantly affecting task success rates.”

The next three to five years will be a crucial period for the implementation of robots in specific scenarios

Xie Tiandi, director of marketing at SuTeng JuChuang, said that the next three to five years will be critical for deploying robots in specific applications. The value of robots lies in their ability to complement human labor. Human practical experience is invaluable; robots can learn and replicate the expertise and skills of seasoned workers, and customers are willing to pay for robotic solutions that can do so. Although current embodied robots may complete only half or less of the work a human can accomplish in the same time, they can work at night and during holidays. In another example, at last year’s robotics conference, many leaders of manufacturing companies from Jiangsu and Zhejiang provinces attended specifically to inquire about purchasing robots to set up production lines.

While market demand is urgent, there remains a gap between technology and commercialization. Xie acknowledged that currently only entertainment robots that sing and dance can achieve stable revenue, while the robotics industry as a whole is still in the “transition from R&D to engineering” phase. However, the excitement generated by entertainment scenarios has significantly accelerated the development of robots’ “working” capabilities.
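As a rough check on the inference-frequency figures quoted earlier in this section, the time between consecutive decisions is simply the reciprocal of the frequency. The Python sketch below is illustrative only: the function name `inference_period_s` is hypothetical, and the calculation ignores the execution delays and desynchronization that the quote also mentions.

```python
def inference_period_s(frequency_hz: float) -> float:
    """Return the time between consecutive inferences, in seconds."""
    if frequency_hz <= 0:
        raise ValueError("frequency_hz must be positive")
    return 1.0 / frequency_hz


# Frequencies mentioned in the article: 10 Hz, and the 2 to 3 Hz range.
for hz in (10, 3, 2):
    print(f"{hz} Hz -> {inference_period_s(hz):.2f} s between decisions")
```

By this simple reciprocal, 2 to 3 Hz corresponds to roughly 0.33 to 0.5 seconds per decision, which is in the same range as the 0.4 to 0.5 seconds quoted.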

At present, the market demand for robots is evolving towards pragmatism. “Users wish to select specific scenarios for highly closed-loop systems,” Lv noted, adding that user demands focus on three key aspects: reducing production costs, liberating humans from repetitive and tedious or hazardous tasks, and providing emotional value in areas such as culture, commerce, and tourism. “The emergence of robots fundamentally aims to solve practical problems at various levels.” Currently, cutting-edge embodied intelligent technology is still in the research and development phase, and stability has not yet reached industrial-grade levels. Truly reliable technologies (like industrial assembly lines and household refrigerators) have become so stable that they no longer attract special attention.

Qiu concluded that factory scenarios are relatively simple, with fixed objects (like specific screws) and environments. While operations are precise, they are highly repetitive. In contrast, supermarket scenarios are more complex, requiring the identification of hundreds of thousands of products, with high demands for understanding items, but operations mainly involve “picking, placing, and arranging.” Home scenarios pose the ultimate challenge for robots: the variability of spaces and items, along with complex tasks involving dozens of procedures, demand high generalization capabilities. From an ROI perspective, home scenarios are currently not economically viable: a robot may cost tens or even hundreds of thousands of yuan, which does not match the limited services it can provide. Commercial applications are emerging as a breakthrough point. For example, in retail warehouse picking scenarios, if robots can overcome item generalization issues, they could boost operational efficiency by 30% to 90%, providing clear commercial value. However, Qiu cautioned that the current cutting-edge embodied intelligent technology is still in the R&D phase, with stability generally not meeting industrial-grade standards.

The success or failure of specific scenarios will determine the paths taken by the industry as it embraces convergence and competition in domestic technology routes

Tian said, “Long-term stable operation is the ‘key to commercialization’: different technical paths dictate the cost-effectiveness and survival rates of robots in various scenarios.” He continued: “In relatively structured factory and logistics environments, extremely high VLA semantic understanding is not necessary; however, high mean time between failures (MTBF) and low power consumption are critical, making the ‘layered decision + hardware-software synergy’ path more suitable.” Tian added, “Modular actuator solutions have absolute advantages in production costs and later maintenance.”

In complex and variable construction site environments, a world model combined with a hybrid wheeled-legged architecture proves more adaptable. He cited ZhiJi Dynamics as an example: “By predicting terrain through a world model, it automatically switches movement modes to complete tasks, achieving energy efficiency 3 to 5 times higher than purely legged robots, significantly reducing endurance pressures for long-term operations.” In cultural tourism and home service scenarios, where the service industry places high demands on human-robot interaction, the VLA architecture can enable robots to interpret the varied, fuzzy instructions of human users.

Xie Tiandi believes the commercial model of the robotics industry is gradually becoming clearer: focusing on B-end (business) clients and collaborating with manufacturers and scenario providers to co-create solutions. “We need to find partners with real production scenarios, such as logistics packing and automotive parts assembly, to jointly promote solution implementation and validation.” He said that the core value of robots lies in their ability to “work alongside humans in the same environment without requiring infrastructure modifications, for example, in factories where humans work during the day and robots take over at night.”

Judging from the current competition in the industry, several clear development trends are emerging in the field of humanoid robots.

From the perspective of technological advancement over time, Lv believes that robot technology is evolving rapidly, iterating on a monthly basis, while the industry continues to maintain high-speed progress in both capital and technology. However, the integration of cutting-edge technology with practical applications is still in the familiarization and trial-and-error stage. “The entire industry is generally still in the familiarization process on the application side, which will inevitably be accompanied by trial and error.” He also noted that the boundaries between academia and industry are becoming increasingly blurred, with many new technologies emerging from feedback and pressure from frontline practices. Tian predicts that technical routes will gradually converge: “Drawing from the hardware development history of PCs and mobile phones, the hardware architecture of intelligent robots will gradually unify.” In the software architecture realm, “it may no longer pursue purely end-to-end but form a three-layer decoupled architecture of ‘semantic analysis layer—environment mapping layer—motion execution layer’.” On the enterprise route selection front, deep collaboration between hardware and software will become a priority direction. “Core components must deeply match algorithms; those companies that solely assemble components may be eliminated from the industry,” Tian pointed out.

A critical judgment is: “By 2026, the hardware gap among enterprises will rapidly narrow, and the true core barrier will be the non-standard environment operational data accumulated by robots during long-term operations.” The data-looping capabilities formed by robotic companies that have achieved extensive deployment will become their core competitive advantage. Another significant trend is localization. “By 2026, domestic planetary roller screw and high-power density servo motors will gradually replace imported products, and intelligent robots will trend towards self-research and development transformation and integration optimization using domestic components,” Tian summarized.

In Xie’s view, the ultimate value of robots is not to replace humans but to inherit their expertise, functioning during times when humans rest or cannot adapt to harsh environments. This involves converting the skills of veteran workers and the experience of seasoned experts into data models, enabling a fleet of robots to supplement human labor. “This is truly the future of industrial intelligence.” Qiu concluded that while robot technology is undoubtedly important—driving innovations in productivity, efficiency improvements, and enhanced experiences—it must be placed in a reasonable context. Technology is a means to achieve exceptional products, not the end goal itself. The successful implementation of robots hinges on the perfect alignment of technology with commercial scenarios. “If you can solve 90% of the problems but cannot address the remaining 10%, the entire scenario becomes unusable, rendering the previous 90% meaningless.” This underscores the necessity for enterprises to comprehensively consider whether the sophistication of technology aligns with scene demands, the stability and reliability of the robots, the design aesthetics and user interaction experience, and whether the overall solution can form a closed loop within acceptable ROI parameters for customers. Every detail that impacts the final experience constitutes a decisive factor in product capability. Whether in entrepreneurship or emerging technologies, it ultimately boils down to a simple question: Is this thing useful? And are you willing to pay for it? If yes, then it is a success.

Original article by NenPower. If reposted, please credit the source: https://nenpower.com/blog/humanoid-robots-take-center-stage-again-at-spring-festival-gala-investors-demand-practical-applications-beyond-dance/
