Evaluating Robotics: The Sole Criterion for Success According to Fan Haoqiang of Yuanli Lingji

Dialogue with Fan Haoqiang of Yuanli Lingji: There is Only One Metric for Judging Robots

Author: Liu Xin

Date: February 27, 2026

Introduction: Embodiment does not have a unified path; it diverges at the foot of the mountain and will eventually converge at the summit.

Fan Haoqiang is a humorous person. When we asked him about the challenges that embodied intelligence needs to address in the next decade, he jokingly replied that perhaps robots would awaken and wipe out humanity, thus eliminating the need for a next decade. While this was a jest, discussing the industry’s development inevitably invites some science fiction-like speculation. In his view, AI should be a lifelong endeavor. After all, he won a gold medal in the International Olympiad in Informatics in high school and has been a genuine AI researcher since his second year.

The name Yuanli Lingji may sound unfamiliar, but it has significant roots, as it is an embodied intelligence company spun off from Megvii. Its founder is one of Megvii’s co-founders and among its earliest employees. Shortly after its establishment, the company secured close to 1 billion yuan in two rounds of financing.

Our curiosity about Yuanli Lingji primarily focuses on a few points: Do they aim to develop models or entities? What is their business model? As a new company originating from Megvii, what advantages do they possess? How can they stand out in the current competitive landscape?

Having interacted with numerous clients during his time at Megvii, Fan Haoqiang’s understanding of embodied intelligence is heavily influenced by his experiences there. From day one of his startup journey, he has felt the strong demand from clients for embodied intelligence. Yet he regrets that many of these needs cannot currently be met. The capabilities of robots remain extremely limited, while the precision and efficiency requirements of production lines are astonishingly high. Previously, Megvii could achieve an accuracy rate of 99.999999999% in facial recognition, but now even the simplest robot grasping task may have a success rate of less than 50%.

Throughout our conversation, the most notable impression of Yuanli Lingji is that the company does not focus on capital stories or AI gimmicks. Fan Haoqiang discussed DFOL (Yuanli Lingji’s native production workflow for embodiment), emphasizing the importance of identifying where to start for the first step in realizing embodied intelligence. He raised questions about where to discover its native applications and how to progress from the exceedingly rare to the increasingly common, ultimately aiming for infinite generalization.

During the DFOL launch event, Fan introduced the concept of a general-purpose robot. However, he questioned how to achieve generality. When technology does not reach perfection, what should be done? Will less general embodied intelligence be criticized for failing to meet expectations and become obsolete? Fan provided an answer: historically, most technological developments follow the pattern of being just good enough to be useful. It may seem unreasonable at first, but if it works, it will eventually gain acceptance. In this context, “usable” equates to “reasonable.” Those with engineering experience know that it’s essential not to set expectations too high for generality.

This interview illustrates how a company focused on finding real-world scenarios and solving genuine problems in embodied intelligence operates. It allows readers to understand the dilemmas and efforts of an entrepreneur while presenting his most authentic reflections on the application of models, entities, and business.

01 Achieving SOTA (State of the Art) is Our Self-Positioning and Confidence

AI Technology Commentary: I first heard about you when someone told me that when he joined Megvii, he sat next to a genius high school student who scared him. That student was you, Fan Haoqiang.

Fan Haoqiang: Yes, I joined Megvii in 2012 as its sixth employee. At the time, Tang Wenbin was my coach for the Olympiad in Informatics. He asked me if I wanted to join a project that guaranteed a spot in college while also providing a salary. I asked what it was about, and he told me it was facial recognition. I was still in high school and knew nothing when I joined.

AI Technology Commentary: When did you start thinking about robotics?

Fan Haoqiang: Specifically, that was in 2016. During my senior year, I visited a Stanford lab for two months. Who was there? Su Hao, Lu Cewu, Wang He, Yi Li, and Shao Lin. Su Hao, the most senior of us, led our work on 3D generation. I asked him why he was working on 3D. He told me that once you do 3D generation, you can do 3D discrimination, then move on to robot simulation, and eventually, in ten years, we can work on robotics!

AI Technology Commentary: Indeed, ten years later, you’re all working on robotics! Can you talk about how Yuanli Lingji was established?

Fan Haoqiang: I wanted to put AI into robotics; it was the biggest challenge. However, we lacked a crucial ingredient: the AI itself. Although large models had emerged, we didn’t know how to integrate them with robotics. Thankfully, in 2024, several key works emerged from the U.S., like Action Transformer and Diffusion Policy, culminating in a comprehensive work called Pi. This clarified the roadmap for applying Transformer technology to robotics. I felt that the conditions were ripe to form a team. My first thought was to find a CEO, and luckily, Tang Wenbin was available, so I brought him on board. After he saw the technological advancements, he messaged me at 2 AM saying, “This is a once-in-a-lifetime opportunity to work on general robotics.” I replied, “Why are you more excited than I am?” We then brought in Zhou Erjin and Wang Tiancai as hardware partners, and with Tang Wenbin focusing on customer scenarios and me on AI, we formed a team that is unique in the country in combining these three aspects.

AI Technology Commentary: When was the company officially founded?

Fan Haoqiang: Around March 2025, after securing the first round of financing, the company was established.

AI Technology Commentary: How do you feel about entering the embodied intelligence industry this past year?

Fan Haoqiang: I feel that the pace of development is completely out of control; the entire industry is evolving at an incredible rate. We used to write papers referring to “traditional methods,” but now methods from two months ago are already considered classics. When our company had just started in 2025, we were somewhat in the dark, not because we didn’t know what to do, but because we had too many ideas. There was interest in humanoid robots, decentralized collection concepts, and tactile sensors; so many possibilities and concepts floated before us. It took a full year, all of 2025, for us to realize that models are the main focus. The quality of the model determines which scenarios are viable, those scenarios define the hardware, and the hardware in turn dictates how the data should be handled. The model’s capability is the independent variable driving the whole development. Once we recognized this, the core became about creating excellent models through the best algorithms, optimal data, and top-tier engineering. From there, it was clear what the company should pursue. This process allowed me to gradually find the main axis and storyline of my entrepreneurial journey.

AI Technology Commentary: Building the best model is inherently challenging.

Fan Haoqiang: It certainly is, but many in our team take pride in our work. Back when we were focused on vision, we never settled for second best. It’s hard to fathom that after putting in effort and clarifying our goals, the final score or outcome wouldn’t be SOTA! We believe that research has a scientific nature; once you understand it, good results should follow. This is a form of self-positioning and confidence.

AI Technology Commentary: Is the model our most powerful asset right now?

Fan Haoqiang: We have two open-source resources. One is a training codebase called Dexbotic, or DB for short. The second is a testing framework we built, called RoboChallenge, internally referred to as RC. These are technically challenging to create, and by open-sourcing them, we showcase our entire team’s capabilities and technical prowess. Plenty of companies have open-sourced codebases, but many offer only “README-style open source,” which is essentially an empty file. Once we open-sourced ours, they sparked a lot of interest, and now five or six companies are applying to conduct tests. Currently, we have two key assets: foundational capabilities and the model, both of which display our team’s strength and align with the direction we want to pursue.

AI Technology Commentary: With so many strong players in the model field, are you concerned?

Fan Haoqiang: The strongest are still focused on LLMs.

02 Real Scenarios Represent the Toughest Challenges

AI Technology Commentary: Besides models, are we also working on entities?

Fan Haoqiang: Yes, I believe it’s quite clear that we must develop the machine entities ourselves.

AI Technology Commentary: How does this relate to our business model?

Fan Haoqiang: We have been selling software for over a decade and understand that most users or clients want a total solution. In China, there isn’t a prevalent practice among major companies to achieve vertical integration through acquisitions. Therefore, we believe it’s essential to create value for clients end-to-end, ensuring control over every link in the process to achieve the best quality and serviceability. Even if many people haven’t dealt with hardware like motors before, they still have to start from scratch to meet the project’s requirements.

AI Technology Commentary: Doesn’t working on both models and hardware make things even more difficult?

Fan Haoqiang: I’ve mentioned before that many of us are hands-on with the robots. Currently, we still have many DIY robots in the company. Everyone believes that the algorithm engineers should first build a robot themselves to understand the potential issues that might arise, which will help them think through those problems when developing algorithms.

AI Technology Commentary: Isn’t that tough on algorithm engineers?

Fan Haoqiang: If they do well, they feel very satisfied.

AI Technology Commentary: You haven’t built one yourself, have you?

Fan Haoqiang: I probably have the most DIY robots in the company. Initially, I built a robot at home that could fold blankets, which cost around 2000-3000 yuan and was made entirely from parts bought on Taobao. It wasn’t a robotic arm, just a stick with a clamp. The real challenge was figuring out how to use a small clamp to manipulate the blanket into a fold. The design was crucial.

AI Technology Commentary: How has building DIY robots influenced your thoughts on algorithms?

Fan Haoqiang: I realized that hardware often involves trade-offs. You can build something for 200,000 yuan or for 2000 yuan, but whether something is useful ultimately depends on the algorithm, which dictates the movement paths. Understanding this gave me hope; as long as we excel in our algorithms, everything can fall into place.

AI Technology Commentary: When designing the entire loop from software to hardware, what scenarios were you considering?

Fan Haoqiang: Initially, we had simple ideas. At Megvii, we had over 500 quality clients, and many companies had departments focused on forward-thinking technological transformations. Every year, they would ask if we had any new technologies. Facial recognition was once considered AI, and now large models are seen as AI; robotics is the next evolution of AI. From day one, we sensed the strong demand from clients, but regrettably, we are currently unable to meet many of those needs. Even the simplest sorting tasks, involving tens of thousands of SKUs in a warehouse, remain unresolved by current algorithms. Clients ask us annually to report on AI advancements and inquire if we can initiate projects to implement these technologies. Thus, we aren’t overly concerned about application scenarios; we have been involved in AI-driven transformations for a long time and understand how to approach this work.

AI Technology Commentary: What do you see as the biggest challenge ahead?

Fan Haoqiang: The real challenge lies in addressing genuine problems in real scenarios. Throughout the AI 1.0 phase, we observed that companies often touted their capabilities before generating real revenue. However, once a product is sold, it is no longer about what they claim; it is about whether customers can actually use it. This is a highly objective and verifiable metric, a key factor in tempering the industry’s initial excitement into something more sustainable. Today, many boast about their model’s insights, but honestly, such claims are unfalsifiable; there are numerous evaluation metrics, and one can always find a favorable one. Hence, the real usage by real clients is the only definitive metric. During one discussion, someone asked what metrics should be used to measure robots. There were many responses—success rate, stability, and so forth—but I believe the most critical metric is: how long does it take for a robot to pay for itself? This single metric is paramount; everything else is secondary.
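
The payback metric Fan describes reduces to simple arithmetic. The snippet below is a minimal sketch of one way to compute it, not a formula from the interview: the function name, the split into upfront cost, monthly savings, and monthly running cost, and all of the figures are hypothetical assumptions chosen for illustration.

```python
import math

def payback_months(upfront_cost, monthly_savings, monthly_running_cost):
    """Months until cumulative net savings cover the upfront cost.

    Returns math.inf when the robot never pays for itself.
    All inputs are hypothetical and share one currency unit.
    """
    net_per_month = monthly_savings - monthly_running_cost
    if net_per_month <= 0:
        return math.inf
    return math.ceil(upfront_cost / net_per_month)

# Hypothetical figures in yuan, chosen only to show the arithmetic.
print(payback_months(200_000, 12_000, 2_000))  # 20
print(payback_months(200_000, 1_500, 2_000))   # inf
```

In the second call the robot costs more to run than it saves, so no success rate or benchmark score can make it pay for itself, which is the sense in which this one number dominates the others.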

AI Technology Commentary: Are we currently capable of meeting this metric?

Fan Haoqiang: While we are developing models and gradually implementing applications, I’ve noticed a fascinating phenomenon: the toughest tasks to test aren’t those that currently have a 0% success rate in standardized datasets like table30. Instead, we have a specific collection of the simplest problems derived from our clients, and those simplest problems turn out to be even harder than our most challenging tests. There’s a common joke in the computer industry: the hardest test is normal users using the system normally. No matter how much regression testing you do, when it comes to real scenarios, it can all fall apart. Robotics is similar; nearly every valuable task has a small step in the entire process that tests both the robot’s precision and intelligence, truly putting us to the test. Thus, our next breakthroughs must target more authentic and challenging matters. Once everyone has real customers and implementations, the industry’s landscape will become clearer.

AI Technology Commentary: What are some of the simplest problems you’ve encountered in client scenarios?

Fan Haoqiang: We have several gathered samples, one of which left a deep impression on me: it involves flipping two interlocked objects.

AI Technology Commentary: What drives this kind of demand?

Fan Haoqiang: Their process demands it. This is a genuine problem; you don’t ask why. They have been doing this for ten years and insist on that method, so you have to trust them, okay?

AI Technology Commentary: Understood.

Fan Haoqiang: We discovered that getting the robot to perform this task is incredibly challenging; it struggles to grip the objects. Consequently, when we filmed the demo, we designed an entire series of robotic movements: first gripping here, then scooping from there, moving to another location, and then proceeding to the next step. Mechanically, this sequence is feasible, but the complexity of the movements makes it impossible for the model to learn. So, this illustrates a real problem; everything is interlinked. When it comes to implementation, you’ll find that the task becomes a tough nut to crack. Once you take a bite, you realize there’s still another layer to peel. Only by truly reaching the core and tackling all these issues will you understand the depth of the challenges involved. Before final implementation, you can only discover problems you weren’t aware of earlier.

AI Technology Commentary: Does this mean it could take ten years to implement?

Fan Haoqiang: Just because a task is difficult doesn’t mean it’s unsolvable. Difficulties require thinking; the model is the foundation, but product design, business, and customer collaboration must also be creatively integrated to achieve success. The good news is that more peers are gradually emerging with real implementation projects. A client may present 100 demands, and among them there might be one where the timing, location, and conditions happen to align perfectly, so a solution can be found and the robot can be put to use. This is just the first step. Tasks that once required a rare combination of factors can potentially be expanded in the future, allowing models to improve from one in a hundred to one in ten, ultimately achieving the goal of completing any task that comes along.

AI Technology Commentary: Recently, I’ve seen news about robots being pushed out of factories. What do you think about this?

Fan Haoqiang: That’s quite normal. There is a significant gap between proof of concept (POC) and actual business implementation. We have had a profound experience in non-standard visual intelligence, where clients usually welcome us for POC, but if we make a mistake that affects their core operations, they will definitely put us through the wringer before going live.

AI Technology Commentary: You have quite a lot of experience being put through that wringer.

Fan Haoqiang: Ultimately, this will drive the definition of technology. For instance, when we worked on face recognition, the industry had no idea at the outset how high the accuracy would ultimately need to be; it reached 11 nines, or 99.999999999%. That is why, when you use facial recognition, you hardly notice any errors; this is the result of algorithms developed under that pressure. If a robot works continuously for a year, given the number of frames it processes, the potential for error is massive because this involves real-time video processing. If it makes a mistake in a production environment, the consequences can be catastrophic. Hence, robotic algorithms will also go through this process before truly being implemented. Embodiment is akin to another form of autonomous driving; the perception and decision-making processes involved in autonomous driving are no secret. Gather 100 million kilometers of data and conduct thorough model training, and the model will ultimately provide reliable output. Robotics is similar, except that the robot’s body is somewhat smaller, and the tasks are more diverse.
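
To give a rough sense of the scale behind the remark about a robot running for a year, here is a back-of-the-envelope sketch. It assumes a hypothetical camera rate of 30 frames per second and treats every frame as an independent decision with a fixed error rate; neither assumption comes from the interview, and the error rates are illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds

def expected_errors_per_year(frames_per_second, per_frame_error_rate):
    """Expected number of wrong decisions in a year of continuous operation.

    Assumes every frame is an independent decision (a simplification).
    """
    frames = frames_per_second * SECONDS_PER_YEAR
    return frames * per_frame_error_rate

fps = 30  # hypothetical camera rate
for label, rate in [("one error in 10,000 frames", 1e-4),
                    ("eleven nines of accuracy", 1e-11)]:
    print(f"{label}: {expected_errors_per_year(fps, rate):.4f} errors per year")
```

Under these assumptions a year of operation covers about 946 million frames, so an error rate of one in 10,000 would mean roughly 94,600 mistakes per year, while eleven nines would bring the expectation down to about 0.0095.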

AI Technology Commentary: Embodied tasks could be infinitely varied, making them more complex than autonomous driving, right?

Fan Haoqiang: In my personal view, I don’t think this wave of embodiment will solve all of robotics’ issues. Ten years ago, we already told every good story about AI, and when it comes to specific implementation directions, they boil down to just a few. Now, the implementation directions for LLMs are similar; coding is one, chatting is another, and there are actually limited options. Therefore, we believe AI is a long-term endeavor. In this wave, we can push robotics to perform significantly better than before; this may be the answer for this decade.

AI Technology Commentary: What about the next decade?

Fan Haoqiang: There may not even be a next decade; robots could awaken and eliminate humanity, and we wouldn’t have to consider it.

AI Technology Commentary: Let’s hope that situation doesn’t arise.

Fan Haoqiang: I believe technology will eventually progress to a state that, while not perfect, is just good enough to be useful. Historically, most technological advancements follow this pattern: they may seem unreasonable at first, but if they prove to be functional, they will eventually be adopted widely.

03 Finding Suitable Application Scenarios Based on Robot Attributes

AI Technology Commentary: I see that Yuanli Lingji is currently working on DFOL, the world’s first embodied intelligent application production workflow. Can you explain what an embodied intelligent application is?

Fan Haoqiang: For instance, an industrial six-axis robot is an application, but you wouldn’t typically refer to it as embodied intelligence, right?

AI Technology Commentary: It’s just an automated hardware device.

Fan Haoqiang: Exactly. I believe embodied intelligent applications comprise an entire system. For example, current embodied hardware is typically designed to resemble a quasi-humanoid form, providing a degree of generality. Additionally, you would expect its movements to be dexterous and complex, rather than just point-to-point moves between XYZ coordinates. Embodied hardware, paired with suitable sensors and a robust model, forms a system distinctly different from traditional industrial automation. The initial motivation for establishing this system was the belief that it could ultimately achieve full generality. However, now that we are only halfway along this path, we must identify what it excels at during this stage.

AI Technology Commentary: So, are we still in the phase of searching for its native applications, or have we already found them?

Fan Haoqiang: We do have some client cases now, but we prefer not to publicize them widely, as we fear competitors might jump in. This is a genuinely profitable venture, so we won’t open source it (laughs).

AI Technology Commentary: What has been the most challenging aspect in the process of pursuing DFOL?

Fan Haoqiang: The biggest challenge is genuinely finding the right scenarios and customers.

AI Technology Commentary: Are you the one approaching clients?

Fan Haoqiang: I worked with Wenbin to approach many clients. Recently, we made a concentrated effort to visit various factories and returned with insights about where the opportunities lie. Additionally, we really need to understand what these models excel at. Some actions that seem very challenging can actually be quickly learned by the model, while seemingly simple actions can take ages to master. Ultimately, it is the design of the model’s actions that is crucial.

AI Technology Commentary: Can you share a case regarding your action design?

Fan Haoqiang: For instance, when getting a robot to fold clothes, pinching the fabric from the top is less successful than pinching it from the side. We need to clarify these nuances, which requires a group of skilled professionals, the talented data collectors who may be the seeds of our future.

AI Technology Commentary: They are essentially the interpreters translating human actions into robotic actions.

Fan Haoqiang: Exactly. They need to adopt the robot’s perspective to understand what kind of actions are effective and easy to learn. They cannot just think like humans; they need to think like robots to tackle this issue. Ultimately, a dedicated team must focus on the DFOL domain. Many industries recognize the concept of FAE (Field Application Engineering), which is essential for supporting custom software.

AI Technology Commentary: So, this data is collected during the action and then fed back to create a closed loop?

Fan Haoqiang: This is also a core algorithm mentioned in Pi 0.6, called RECAP. There are various other names for it, like DAgger, and Shanghai Zhiyuan refers to it as SOP. The principle remains the same: if the robot is about to make an error, a human quickly corrects it, and the corrective signal is recorded so the network can learn from it. It’s amazing how it adapts; after only a few corrections, it begins to avoid making the same mistakes. This showcases the impressive nature of neural networks: they are quite responsive.
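
The correct-and-record principle can be sketched with a toy loop. This is not RECAP or any company’s implementation: the lookup-table policy, the `human_action` stand-in, and the integer states are all hypothetical, and a real system would retrain a neural network that generalizes across states rather than memorizing them.

```python
def human_action(state):
    """Stand-in for the human operator: the action that is correct here."""
    return state % 3

class TablePolicy:
    """Toy policy that remembers the corrected action for each seen state."""

    def __init__(self):
        self.table = {}

    def act(self, state):
        return self.table.get(state, 0)  # falls back to action 0 if unseen

    def fit(self, dataset):
        self.table = dict(dataset)  # "retraining" on all corrections so far

def run_episode(policy, states, dataset):
    """Run once; the human overrides and records every would-be error."""
    interventions = 0
    for state in states:
        proposed = policy.act(state)
        correct = human_action(state)
        if proposed != correct:
            dataset.append((state, correct))  # record the corrective signal
            interventions += 1
    return interventions

policy, dataset, history = TablePolicy(), [], []
for _ in range(3):
    history.append(run_episode(policy, list(range(9)), dataset))
    policy.fit(dataset)
print(history)  # [6, 0, 0]
```

The intervention count drops from six to zero after a single round of corrections, which mirrors in miniature the behavior described above: once a correction is recorded and learned, the same mistake is not repeated.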

AI Technology Commentary: After this data is returned, we still need to retrain the model?

Fan Haoqiang: Absolutely. During the ramp-up phase, both data collection and training happen in parallel. Eventually, when I monitor and see that the average unassisted time has reached a certain metric, I can cease updates, and it transitions to a passive data collection mode where the model remains unchanged. However, if it encounters a bad case today, the data will still be sent back to inform future model developments.
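
The switch from ramp-up to passive collection can be pictured as a threshold check on the average unassisted time. The sketch below is one hypothetical way to compute that figure from intervention timestamps; the function names, the mode labels, and the 30-minute target are assumptions for illustration, not details from the interview.

```python
def mean_unassisted_seconds(intervention_times, run_start, run_end):
    """Average length of the stretches the robot ran without human help."""
    boundaries = [run_start, *sorted(intervention_times), run_end]
    gaps = [later - earlier for earlier, later in zip(boundaries, boundaries[1:])]
    return sum(gaps) / len(gaps)

def deployment_mode(avg_unassisted, target_seconds):
    """Keep retraining during ramp-up; freeze the model once the target is met."""
    return "passive_collection" if avg_unassisted >= target_seconds else "ramp_up"

TARGET = 1800  # hypothetical target: 30 minutes between interventions

early = mean_unassisted_seconds([600, 1500, 3000], 0, 3600)
later = mean_unassisted_seconds([1800], 0, 7200)
print(early, deployment_mode(early, TARGET))  # 900.0 ramp_up
print(later, deployment_mode(later, TARGET))  # 3600.0 passive_collection
```

In passive collection mode the model stays fixed, but each intervention would still be logged so that bad cases can inform later model versions.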

AI Technology Commentary: So, we have already shipped some of the machine entities?

Fan Haoqiang: Our company was founded in March 2025, and some units paid for out of project funds are already being used in pilot programs with clients. Our goal is to launch a standardized hardware product for clients by 2026.

AI Technology Commentary: Since we are targeting specific factory clients, is there still a need to develop a unified hardware product?

Fan Haoqiang: Components like grippers or end effectors may vary; some clients need rigid ones, while others prefer soft ones. However, stabilizing the overall platform for the robot is crucial for data accumulation and model learning. Our company strategy emphasizes the need to quickly converge on our main model.

AI Technology Commentary: Will we eventually develop robots aimed at more consumer-oriented applications or more generalized ones?

Fan Haoqiang: That is part of our vision; however, it seems we need to wait a bit longer.

AI Technology Commentary: Will you consider providing intelligence for certain entity companies?

Fan Haoqiang: Currently, we will not. That sector is already crowded, and it is not our area of expertise.

AI Technology Commentary: From your perspective, what is Yuanli Lingji’s ecological position in the industry?

Fan Haoqiang: I hope it can be a leader in technology and a pioneer in applications.

04 Diverging at the Foot of the Mountain, Converging at the Summit

AI Technology Commentary: How do models and hardware relate to each other?

Fan Haoqiang: Hardware itself is a science; there are no magic solutions. Issues like reliability, structure, and rigidity all have corresponding methodologies. As long as these issues are addressed and validated during design, the final product will be of good quality. Currently, the challenges in hardware are similar to those in models; while locomotion is generally understood, manipulation in robotics is particularly problematic. For example, consider the wrist; a human can easily reach into a desk drawer, but robots struggle with this. Many clients have requested us to tackle this case, only to find that we fail at the very first step, making it impossible to even discuss further challenges. Thus, we believe that hardware must also be driven by applications to achieve practical implementations. We have a slogan: “Models determine scenarios; scenarios define hardware.”

AI Technology Commentary: Is your base model training considered fast?

Fan Haoqiang: When we run it on GPUs, it takes just a few weeks. However, first, we need to clarify what to run and how to run it, which requires substantial time for iteration and preparation of data.

AI Technology Commentary: What do you need to run, and how do you run it?

Fan Haoqiang: We need to determine the training parameters and data distribution for the base model and figure out how to do this reasonably. These factors truly decide the model’s final capability. We have incorporated thousands of hours of self-collected data, gathered hour by hour and minute by minute.

AI Technology Commentary: Your data collection efforts are quite solid.

Fan Haoqiang: Indeed! Fortunately, we have been collecting data for facial recognition for ten years. Some data collectors are deeply committed to the craft. The most proactive among them often come and ask how the data they collected impacts model performance, and they think about how to improve their data collection methods for the next batch.

AI Technology Commentary: They have transitioned from just a job to a profession, right?

Fan Haoqiang: Yes, it’s quite remarkable, and I consider it one of the joys in this work. Our company has a showcase where we list top contributors to our data set, ensuring that future generations remember these pioneers.

AI Technology Commentary: They should be recognized as the great benefactors of silicon-based life.

Fan Haoqiang: Data collectors also need to achieve a human-machine synergy. The tasks are quite challenging; achieving precision within a few tenths of a millimeter requires intensive practice over days.

AI Technology Commentary: How will the next generation of embodied models differ from this generation? What directions will they take?

Fan Haoqiang: Models generally have four key indicators: generalization, intelligence, dexterity, and efficiency. This generation focuses more on dexterity and a certain degree of generalization, but I believe the next generation will need to achieve an order-of-magnitude improvement in these indicators. Currently, many tasks may only achieve an 80-90% success rate, but in the future, for basic tasks, we must aim for 99% or even 99.9% success rates. Furthermore, while most tasks we measure are completed within ten seconds, the future will require us to tackle longer tasks lasting from minutes to hours.
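
One way to read the jump from 80-90% to 99% or 99.9% is in terms of failure rate, and of what happens when short steps are chained into a long task. The sketch below assumes, purely for illustration, that a long task is a sequence of independent steps with the same per-step success rate; the ten-step figure is hypothetical.

```python
def failure_rate(success_rate):
    """Share of attempts that fail."""
    return 1 - success_rate

def chained_success(per_step_success, steps):
    """Success rate of a task made of independent steps that must all succeed."""
    return per_step_success ** steps

for rate in (0.90, 0.99, 0.999):
    print(f"per-step success {rate:.1%}: "
          f"failure rate {failure_rate(rate):.1%}, "
          f"10-step task success {chained_success(rate, 10):.1%}")
```

Each step from 90% to 99% to 99.9% cuts the failure rate by a factor of ten. Under the independence assumption, a ten-step task succeeds only about 35% of the time at 90% per step, about 90% of the time at 99%, and about 99% of the time at 99.9%, which suggests why longer tasks demand much higher reliability on the basics.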

AI Technology Commentary: There are various training paths for embodied models now—some focus on simulation, some on VLA, and others on world models. Is this a good thing?

Fan Haoqiang: It’s great that everyone is sticking to their own path. If the technical routes become too homogeneous, it wastes the opportunity for trial and error. We will likely continue with a combination of pre-training and real-machine approaches. It’s beneficial for everyone to pursue different methods, allowing us to learn from each other and gain insights about what others are doing. If everyone adopts the same approach, what is there to compete on?

AI Technology Commentary: Ultimately, won’t everyone converge on one path?

Fan Haoqiang: I don’t think so. It’s more likely that we will diverge at the foot of the mountain and converge at the summit. For instance, those focused on simulation are continuously creating 3D assets, while those working on physical data collection research how to augment those assets. In the end, they will find that the issues are the same. No matter your starting point or methods, you will ultimately find corresponding solutions to the larger challenges. I genuinely believe that these divergences in technical routes are not fundamental; what matters is whether you solve the problems during implementation. If you can solve them, you will succeed. This is what we call reductionist thinking, a very Megvii style. For instance, Zhang Xiangyu wrote significant papers, including one on ConvNeXt, arguing that even though everyone else was using Transformers, he could achieve the same results with convolutional methods.

AI Technology Commentary: But didn’t everyone eventually get unified under Transformers?

Fan Haoqiang: Today, Transformers have been modified beyond recognition. The so-called Dswin (sliding attention window) structure is indistinguishable from convolution. I believe there is no real difference. Those working on Transformers have ultimately returned to convolution, while convolution enthusiasts have embraced Transformers. In essence, it’s all about the same goal. I dislike creating conceptual divisions or oppositions among teams; we believe there is only one truth in the world, but many methods to achieve it.

Original article by NenPower, If reposted, please credit the source: https://nenpower.com/blog/evaluating-robotics-the-sole-criterion-for-success-according-to-fan-haoqiang-of-yuanli-lingji/
