Paper | Code | Average | Put Spoon on Towel | Put Carrot on Plate | Stack Green Block on Yellow Block | Put Eggplant in Yellow Basket | Model | Release Date |
---|---|---|---|---|---|---|---|---|
SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation | ✓ | 0.583 | 0.583 | 0.667 | 0.708 | 0.375 | SoFar | 2025-02-18 |
SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model | | 0.344 | 0.208 | 0.208 | 0.250 | | SpatialVLA | 2025-01-27 |
Octo: An Open-Source Generalist Robot Policy | | 0.300 | 0.472 | 0.097 | 0.042 | | Octo-Small | 2024-05-20 |
Octo: An Open-Source Generalist Robot Policy | | 0.160 | 0.125 | 0.083 | 0.000 | | Octo-Base | 2024-05-20 |
Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models | ✓ | 0.135 | 0.208 | 0.250 | 0.083 | 0.000 | RoboVLM | 2024-12-18 |
RT-1: Robotics Transformer for Real-World Control at Scale | ✓ | 0.011 | 0.000 | 0.042 | 0.000 | | RT-1-X | 2022-12-13 |
OpenVLA: An Open-Source Vision-Language-Action Model | ✓ | 0.010 | 0.000 | 0.000 | 0.000 | | OpenVLA | 2024-06-13 |
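
For the two rows that list all four task scores (SoFar and RoboVLM), the Average column appears to match the unweighted mean of the per-task success rates. The table does not state this, so treat it as an assumption. The sketch below checks that reading against the numbers copied from those two rows. The names `task_scores`, `reported_average`, and `average_success` are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Per-task success rates copied from the two rows that list all four tasks
# (spoon, carrot, block, eggplant).
task_scores = {
    "SoFar": [0.583, 0.667, 0.708, 0.375],
    "RoboVLM": [0.208, 0.250, 0.083, 0.000],
}

# Values shown in the Average column for the same rows.
reported_average = {"SoFar": 0.583, "RoboVLM": 0.135}


def average_success(scores):
    """Unweighted mean of per-task success rates, rounded to three decimals.

    Assumption: the leaderboard's Average is a plain mean over the four tasks.
    """
    return round(mean(scores), 3)


for model, scores in task_scores.items():
    print(model, average_success(scores), reported_average[model])
```

Under this assumption, both computed means agree with the reported averages to three decimal places.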