Physical Intelligence shows robot model with LLM-like generalization, flaws included
New robot model π0.7 demonstrates advanced learning capabilities.

US start-up Physical Intelligence has introduced π0.7, a robot foundation model that shows early signs of compositional generalization, akin to language models. Built on Google's Gemma 3, π0.7 pairs a large language model with a smaller action expert, enabling the robot to learn and recombine skills from varied training data, including contextual instructions and metadata. This approach allows π0.7 to perform tasks such as t-shirt folding and espresso making with high success rates, even without direct training on those specific tasks.
Key Takeaways
1. The π0.7 model uses a training recipe that incorporates contextual metadata for improved task performance.
2. It achieves an 80 percent success rate in t-shirt folding using a bimanual UR5e manipulator without prior task-specific training.
3. The model's performance raises questions about the nature of generalization versus recall in robotic tasks.