
Beijing’s humanoid robots have reached a milestone in global scene perception and dynamic memory. On May 13, 2026, the Beijing Humanoid Robot Innovation Center demonstrated the latest advances in its humanoid robot “brain” in a live broadcast. Built on the general-purpose embodied intelligence platform Wise Thinking and Action, the system is presented as the industry’s first to achieve global scene perception and dynamic memory, meaning it stays aware of its surroundings and keeps track of details. The development lays a foundation for bringing humanoid robots into home, commercial, and industrial settings.
Last year, the Beijing Humanoid Robot Innovation Center launched Wise Thinking and Action, which it describes as the world’s first general-purpose embodied intelligence platform, capable of “one brain, multiple machines” and “one brain, multiple abilities.” The platform departs from traditional robot development, which typically targeted specific tasks within a single scenario, and strengthens robots’ capacity for autonomous decision-making and execution in complex environments.
Agents built on the Wise Thinking and Action platform have now made key advances in spatial memory, enabling robots to move from passive execution to proactive operation. They have progressed from simple, short-horizon tasks to complex, long-horizon ones.
Traditional robots relied on instantaneous visual input, operating on the principle of “what you see is what you get.” Once an object left their field of view, it had effectively “disappeared,” and the robot would “forget” everything when the scene changed, which made it hard to handle complex tasks the way humans can.
During the live demonstration, the Wise Thinking and Action agent smoothly performed tasks such as delivering water and handing over tissues. Throughout, the robot showed a sense of “spatial awareness.” Even when objects were out of sight, it could locate them accurately. Rather than merely reacting to what was in view, it inferred the location, state, and surrounding context of target objects from its spatial memory.
The achievement is attributed to a global scene perception and dynamic spatial memory system developed at the Beijing Humanoid Robot Innovation Center, described as the first of its kind in the industry. Using a “dynamic semantic map,” the system records the types, colors, and locations of visible objects and updates this information in real time, giving the robot a memory that persists across time and viewpoints.
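The idea of a map that records object types, colors, and locations and keeps them after they leave view can be illustrated with a small sketch. This is not the center’s implementation, which the article does not describe; the class names, fields, object IDs, and coordinates below are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ObjectRecord:
    """One remembered object (all field names are illustrative assumptions)."""
    kind: str                             # object type, e.g. "cup"
    color: str
    position: Tuple[float, float, float]  # location in a shared map frame
    last_seen: float                      # timestamp of the latest sighting


class SemanticMap:
    """Toy persistent object memory keyed by a hypothetical object ID."""

    def __init__(self) -> None:
        self._objects: Dict[str, ObjectRecord] = {}

    def observe(self, object_id: str, kind: str, color: str,
                position: Tuple[float, float, float], timestamp: float) -> None:
        # A new sighting overwrites the stored entry, so the map follows moves.
        self._objects[object_id] = ObjectRecord(kind, color, position, timestamp)

    def locate(self, kind: str, color: Optional[str] = None) -> Optional[ObjectRecord]:
        # Return the most recently seen match, even if it is out of view now.
        matches = [
            record for record in self._objects.values()
            if record.kind == kind and (color is None or record.color == color)
        ]
        return max(matches, key=lambda record: record.last_seen, default=None)


memory = SemanticMap()
memory.observe("cup-1", "cup", "red", (1.0, 0.5, 0.8), timestamp=0.0)
memory.observe("tissue-1", "tissue box", "white", (2.0, 1.0, 0.7), timestamp=1.0)
memory.observe("cup-1", "cup", "red", (3.0, 0.2, 0.8), timestamp=5.0)  # the cup was moved
found = memory.locate("cup")
```

Here the lookup returns the cup’s latest recorded position regardless of where the camera is pointing at query time, which is the behavior the article contrasts with “what you see is what you get.”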
According to practical tests, the accuracy of this complete spatial memory system remains stable at 100% for complex multi-step tasks involving movement, perception, and grasping. Even when faced with common disturbances like viewpoint changes and object occlusion, the overall task completion rate stays above 98%.
This means the robot has a working global spatial memory and a degree of common sense, allowing it to carry out tasks such as retrieving items at home, sorting materials, and organizing logistics, regardless of changes in perspective or object occlusion.
Robots have also traditionally struggled to remember individuals and tell their preferences apart, so each interaction with the same person feels like a first encounter. To address this, the Wise Thinking and Action agent uses a user memory system to support personalized, proactive interaction. The robot can recognize people it has met before and remember their preferences in order to serve them individually.
For example, if a user casually mentions being “thirsty,” the robot can identify the user through facial recognition, recall that they prefer cola, and proactively fetch it. It can also carry context across tasks and sessions, so it can act on instructions that refer back to earlier interactions, such as “continue what we did yesterday” or “bring me the document from last time.”
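The “thirsty” example can be sketched as a lookup from a recognized user and a stated need to a remembered preference. This is only an illustration: the article does not describe how the user memory system is built, and the user IDs, the need-to-category table, and the `UserMemory` class are assumptions. Facial recognition is assumed to have already produced a stable user ID.

```python
from typing import Dict, Optional

# Hypothetical mapping from a stated need to a preference category.
NEED_TO_CATEGORY = {"thirsty": "drink", "hungry": "snack"}


class UserMemory:
    """Toy per-user preference store; user IDs are assumed to come from face recognition."""

    def __init__(self) -> None:
        self._prefs: Dict[str, Dict[str, str]] = {}

    def remember(self, user_id: str, category: str, item: str) -> None:
        # Store (or overwrite) what this user prefers within a category.
        self._prefs.setdefault(user_id, {})[category] = item

    def suggest(self, user_id: str, need: str) -> Optional[str]:
        # Translate the need into a category, then look up this user's preference.
        category = NEED_TO_CATEGORY.get(need)
        if category is None:
            return None
        return self._prefs.get(user_id, {}).get(category)


users = UserMemory()
users.remember("user-42", "drink", "cola")
suggestion = users.suggest("user-42", "thirsty")  # a known user with a stored preference
unknown = users.suggest("user-7", "thirsty")      # a user the robot has no record of
```

A robot with no record of the speaker gets nothing back and would have to ask, which corresponds to the “first encounter” behavior described above.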
As a result, robots can perceive their environment on their own and proactively identify needs, showing what might be called “an eye for work that needs doing.” With these advances, robots are no longer just cold execution machines; they can remember, understand, and actively serve as intelligent companions.
Awareness alone is not enough: robots also need to grasp items reliably. In the past, robots could pick up certain objects but often struggled with grip and control. To tackle this industry-wide challenge, the Wise Thinking and Action agent incorporates a “visual + tactile” perception capability that lets it adjust its grip to the characteristics of the target item. This gives the robot finer sensitivity and a better understanding of what it is holding, with the aim of grasping that is safe, precise, and stable.
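One common way to combine the two senses, shown here purely as an illustration, is to let vision choose a starting force and a safety limit for the kind of object, then let touch raise the force while slip is detected. The article does not say the agent works this way; the object names, force values, and the `adjust_grip` function are invented for the sketch.

```python
from typing import Callable

# Hypothetical visual priors: (starting force, maximum safe force) in newtons.
VISUAL_PRIORS = {"paper cup": (1.0, 4.0), "glass bottle": (3.0, 12.0)}


def adjust_grip(object_kind: str,
                slip_detected: Callable[[float], bool],
                step: float = 0.5) -> float:
    """Raise grip force in small steps while the tactile sensor reports slip."""
    force, max_force = VISUAL_PRIORS[object_kind]
    while slip_detected(force) and force < max_force:
        # Never exceed the limit chosen from the visual estimate.
        force = min(force + step, max_force)
    return force


# Simulated tactile sensor: the object slips whenever the force is below 2.5 N.
final_force = adjust_grip("paper cup", lambda force: force < 2.5)

# A sensor that always reports slip is stopped by the visual safety limit.
capped_force = adjust_grip("paper cup", lambda force: True)
```

The cap matters for fragile items: even with a faulty or noisy slip signal, the force stays within the limit associated with what the camera recognized.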
Original article by NenPower. If reposted, please credit the source: https://nenpower.com/blog/breakthrough-in-global-scene-perception-and-dynamic-memory-achieved-by-beijing-humanoid-robot/
