Humanoid robots attract attention, but industrial use depends on measurable performance rather than appearance. Fraunhofer IPA has developed a modular benchmark to assess humanoid robots against application-relevant criteria such as safety, cleanroom suitability, cybersecurity, energy use, and basic capabilities. The aim is to give manufacturers and users an independent basis for comparing systems and planning realistic deployments.
The market for humanoid robots is developing quickly, yet reliable technical comparisons remain difficult. Demonstrations often show selected capabilities under controlled conditions. However, production environments demand repeatability, safe interaction, stable operation, and clear limits. For companies considering investment, that creates a practical problem: a robot may look capable, but its suitability for a specific industrial task can be hard to judge.
Fraunhofer IPA has addressed this gap with a standardized analysis service. Research teams guide humanoid robots through defined tests and evaluate the results scientifically. The benchmark was developed with funding from the Baden-Württemberg Ministry of Economic Affairs, Labour, and Tourism as part of the AI Innovation Center “Learning Systems and Cognitive Robotics.” Its modular structure allows manufacturers, end users, and software providers to select the areas most relevant to their intended application.
Where possible, the benchmark uses established international standards, including ISO 14644 for cleanroom suitability and ISO 10218 and ISO/TS 15066 for functional safety. The evaluation is therefore tied to criteria already familiar in industrial automation.
Measuring capabilities beyond demonstrations
The benchmark starts with technologies and basic capabilities. Fraunhofer IPA examines installed sensors, AI models, gripper types, walking speed, gripping forces, and manageable loads. Measurements are recorded with a 3D tracking system and force sensors, which makes the results less dependent on subjective observation.
A second area focuses on more complex capabilities. These include stair climbing, obstacle navigation, movement and force accuracy, and reaction speed. The tests are deliberately demanding so results can remain useful as future robot generations become more capable. For industrial users, this matters because basic movement alone is not enough. A humanoid robot must handle realistic variations in its environment without creating unacceptable process risks.
The value of this approach lies in comparability. A company can assess whether a robot’s actual performance fits the intended use, rather than relying on general claims about mobility or dexterity. Moreover, it allows humanoids to be compared not only with each other, but also with existing automation components that already have known performance limits.
Safety and cleanroom performance under scrutiny
Functional safety is central when humanoid robots work near people. Fraunhofer IPA’s tests include stability on different surfaces, force limitation during collisions, obstacle detection, and system behavior during failures. Collision tests use the same force sensors that are used for collaborative industrial robots, making the measurements relevant for human-robot collaboration.
Cleanroom suitability is assessed through particle emission according to ISO 14644-14, outgassing behavior, and cleanability. These criteria are important for sectors such as semiconductor manufacturing, pharmaceuticals, and food production, where contamination can directly affect product quality or process reliability.
Fraunhofer IPA applied the full benchmark for the first time to a Unitree G1 EDU-4 with Dex3-1 3-finger hands, delivered in May 2025 with firmware version 1.04. The robot showed good self-stabilization and could be suitable for ISO Class 5 cleanrooms. At the same time, the tests revealed significant limitations. In collision situations, forces above 500 newtons occurred, which is far above the pain thresholds permitted by the standard. For users, such results are essential because they define where a robot can be used safely and where additional measures or different applications are required.
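A comparison of this kind can be sketched in a few lines of Python. This is only an illustration, not Fraunhofer IPA's evaluation procedure: the names `CollisionResult` and `evaluate_collisions` are hypothetical, and the limit of 140 N is a placeholder that must be replaced with the value for the relevant body region and contact type from ISO/TS 15066. The measured value of 500 N is used as a lower bound, since the article reports forces above that figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollisionResult:
    """One measured collision compared against a permitted force limit."""
    body_region: str
    peak_force_n: float   # measured peak force in newtons
    limit_n: float        # permitted force in newtons (user-supplied)

    @property
    def ratio(self) -> float:
        # How many times the measured force exceeds (or undercuts) the limit.
        return self.peak_force_n / self.limit_n

    @property
    def within_limit(self) -> bool:
        return self.peak_force_n <= self.limit_n


def evaluate_collisions(peak_forces_n: dict, limits_n: dict) -> list:
    """Pair each measured peak force with the limit for the same body region."""
    results = []
    for region, peak in peak_forces_n.items():
        if region not in limits_n:
            raise KeyError(f"no limit supplied for body region {region!r}")
        if limits_n[region] <= 0:
            raise ValueError("limits must be positive")
        results.append(CollisionResult(region, peak, limits_n[region]))
    return results


if __name__ == "__main__":
    # Placeholder limit, NOT quoted from the standard; look up the real value.
    limits = {"hand": 140.0}
    # Lower bound taken from the reported result (forces above 500 N).
    measured = {"hand": 500.0}
    for r in evaluate_collisions(measured, limits):
        status = "OK" if r.within_limit else "EXCEEDS LIMIT"
        print(f"{r.body_region}: {r.peak_force_n:.0f} N vs {r.limit_n:.0f} N "
              f"({r.ratio:.1f}x) {status}")
```

Keeping the limits as an input rather than hard-coding them makes the placeholder explicit and lets the same check be reused for different body regions or contact types.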

Cybersecurity and energy use as deployment factors
The benchmark also covers cybersecurity, divided into four modules: vulnerability management, secure lifecycle, network security, and penetration resistance. These aspects are becoming more important as regulatory requirements increase and connected automation systems become part of production infrastructure.
In the Unitree G1 evaluation, Fraunhofer IPA identified a critical Bluetooth security vulnerability in the software version installed at the time. It allowed attackers to take complete remote control of the robot. The issue has reportedly since been resolved. The case illustrates why cybersecurity is not a secondary concern for humanoid robots. A mobile system operating near people, equipment, or sensitive processes must be assessed not only mechanically, but also as a connected device.
Energy efficiency is another practical factor in deployment planning. Fraunhofer IPA measures battery life and power consumption in different scenarios, including standing, walking, walking uphill, and walking with a load. For the tested Unitree G1, the maximum operating time on one battery charge was 2 hours and 49 minutes while standing still. In a typical scenario involving both standing and walking, it was 1 hour and 49 minutes. Such figures help companies estimate charging cycles, availability, and whether a robot can support the intended shift pattern.
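As a rough illustration of that estimate, the following Python sketch converts the two reported runtimes into the number of battery charges needed to cover a shift. The 8-hour shift, the function name `charges_per_shift`, and the `usable_fraction` reserve parameter are assumptions made for this example and do not come from the benchmark. Charging or swap time is ignored.

```python
import math


def to_minutes(hours: int, minutes: int) -> int:
    """Convert an hours/minutes duration to minutes."""
    return hours * 60 + minutes


def charges_per_shift(shift_min: float, runtime_min: float,
                      usable_fraction: float = 1.0) -> int:
    """Number of full battery charges needed to cover one shift.

    usable_fraction is a hypothetical safety reserve: 0.8 means only 80%
    of the measured runtime is planned for before recharging or swapping.
    """
    if shift_min <= 0 or runtime_min <= 0:
        raise ValueError("durations must be positive")
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    return math.ceil(shift_min / (runtime_min * usable_fraction))


# Runtimes reported for the tested Unitree G1.
STANDING_MIN = to_minutes(2, 49)   # 169 minutes, standing still
MIXED_MIN = to_minutes(1, 49)      # 109 minutes, standing and walking

# Assumed shift length of 8 hours (not from the article).
SHIFT_MIN = to_minutes(8, 0)

if __name__ == "__main__":
    print("standing only:", charges_per_shift(SHIFT_MIN, STANDING_MIN))
    print("mixed scenario:", charges_per_shift(SHIFT_MIN, MIXED_MIN))
    print("mixed, 20% reserve:",
          charges_per_shift(SHIFT_MIN, MIXED_MIN, usable_fraction=0.8))
```

Under these assumptions, the mixed scenario requires five charges for an 8-hour shift, which shows how quickly the measured runtime translates into planning for battery swaps or charging breaks.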
A framework for investment decisions
Fraunhofer IPA positions the benchmark as a tool for more transparent decision-making in a market that remains volatile and difficult to assess. Simon Schmidt, Senior Manager of the Business Unit Automated Systems at Fraunhofer IPA, notes that users and manufacturers need to look beyond the facade that marketing sometimes creates. Werner Kraus, head of the research division “Automation and Robotics,” emphasizes that users can interpret the results directly and identify the right humanoid for the right application.
The timing is relevant. Demographic change is increasing pressure to automate work that has often been performed manually. At the same time, major investment decisions require objective evaluation criteria, while specific safety standards for humanoids are not expected until 2028 with ISO 25785-1. Cybersecurity requirements are also increasing, and sensitive production environments need reliable data to prevent contamination.
Fraunhofer IPA plans to test additional humanoid robots and establish a comparative database. Manufacturers and users can now commission individual benchmark modules or comprehensive evaluations, using the institute’s existing infrastructure and expertise to obtain application-specific data before making deployment decisions.