Google DeepMind has released Gemini Robotics On-Device, a vision-language-action model that runs locally on robots and can execute tasks without an internet connection, building on the Gemini Robotics model introduced in March.
Gemini Robotics On-Device directly controls a robot's movements, and developers can use natural language prompts to adapt the model to specific applications. Google reports that the model's benchmark performance comes close to that of the cloud-based Gemini Robotics model. The company also says it outperforms other on-device models in general benchmarks, though it did not name those models.
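To make the prompt-driven control idea concrete, here is a minimal Python sketch of what a local perception-to-action loop might look like. Everything in it is a hypothetical stand-in: the GeminiRoboticsOnDevice class, its predict_action method, the 7-dimensional action vector, and the dummy camera frame are assumptions for illustration, not the actual Gemini Robotics interface.

```python
# Hypothetical sketch of an on-device control loop driven by a natural-language
# instruction. GeminiRoboticsOnDevice is a stand-in, NOT the real Gemini API.
import numpy as np


class GeminiRoboticsOnDevice:
    """Placeholder for a locally running vision-language-action policy."""

    def predict_action(self, camera_image: np.ndarray, instruction: str) -> np.ndarray:
        # A real model would map the image and instruction to joint commands;
        # this stub returns a zero action so the sketch runs as-is.
        return np.zeros(7)


def control_loop(policy: GeminiRoboticsOnDevice, instruction: str, steps: int = 100) -> None:
    for _ in range(steps):
        frame = np.zeros((224, 224, 3), dtype=np.uint8)  # stand-in camera frame
        action = policy.predict_action(frame, instruction)
        # In a real setup, `action` would be sent to the robot's joint controller here.


control_loop(GeminiRoboticsOnDevice(), "fold the shirt on the table")
```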
Video: Google
Demonstrations of the model’s capabilities included robots unzipping bags and folding clothes. Google DeepMind initially trained the model for ALOHA robots.
The model was later adapted to run on the bi-arm Franka FR3 robot and Apptronik's Apollo humanoid robot. According to Google, the Franka FR3 handled scenarios and objects it had not encountered during training, including assembly tasks on an industrial belt.
Video: Google
Google DeepMind is also releasing a Gemini Robotics SDK. It lets developers train robots on new tasks using the MuJoCo physics simulator, with 50 to 100 demonstrations of each desired task.
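As a rough illustration of that workflow, the sketch below uses the open-source MuJoCo Python bindings to roll out and record demonstration trajectories of the kind such fine-tuning would consume. The toy single-joint arm, the scripted stand-in for teleoperation, and the record_demonstration helper are assumptions for illustration; the actual Gemini Robotics SDK interface is not shown.

```python
# Hypothetical demonstration-collection sketch using the MuJoCo Python bindings.
import mujoco
import numpy as np

# Toy single-joint "arm" so the sketch is self-contained; a real setup would
# load the scene and robot description for the target platform instead.
XML = """
<mujoco>
  <worldbody>
    <body name="arm">
      <joint name="hinge" type="hinge" axis="0 0 1"/>
      <geom type="capsule" size="0.02" fromto="0 0 0 0.3 0 0"/>
    </body>
  </worldbody>
  <actuator>
    <motor joint="hinge"/>
  </actuator>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(XML)
data = mujoco.MjData(model)


def record_demonstration(num_steps: int = 200) -> dict:
    """Roll out one trajectory and log states and actions."""
    states, actions = [], []
    mujoco.mj_resetData(model, data)
    for t in range(num_steps):
        # Stand-in for a teleoperated or scripted control signal.
        action = np.array([0.1 * np.sin(0.05 * t)])
        data.ctrl[:] = action
        mujoco.mj_step(model, data)
        states.append(np.concatenate([data.qpos.copy(), data.qvel.copy()]))
        actions.append(action.copy())
    return {"states": np.array(states), "actions": np.array(actions)}


# Google suggests 50 to 100 demonstrations per new task; the resulting dataset
# would then be handed to the SDK's fine-tuning step.
demos = [record_demonstration() for _ in range(50)]
```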