OpenCodePapers

Robot Manipulation on SimplerEnv

Leaderboard

Paper	Code	Visual Matching	Visual Matching-Pick Coke Can	Visual Matching-Move Near	Visual Matching-Open/Close Drawer	Variant Aggregation	Variant Aggregation-Pick Coke Can	Variant Aggregation-Move Near	Variant Aggregation-Open/Close Drawer	ModelName	ReleaseDate
SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation	✓ Link	0.749	0.923	0.917	0.403	0.676	0.907	0.740	0.297	SoFar	2025-02-18
SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model		0.719	0.810	0.696	0.593	0.688	0.895	0.717	0.362	SpatialVLA	2025-01-27
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy	✓ Link	0.687	0.837	0.760	0.463	0.652	0.855	0.730	0.370	Dita-300M	2025-03-25
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control	✓ Link	0.606	0.787	0.779	0.250	0.661	0.823	0.792	0.353	RT-2-X	2023-07-28
Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models	✓ Link	0.563	0.727	0.663	0.268	0.463	0.683	0.560	0.085	RoboVLM	2024-12-18
RT-1: Robotics Transformer for Real-World Control at Scale	✓ Link	0.534	0.567	0.317	0.597	0.397	0.490	0.323	0.294	RT-1-X	2022-12-13
TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies		0.460	0.560	0.600	0.240	0.450	0.600	0.564	0.310	TraceVLA	2024-12-13
OpenVLA: An Open-Source Vision-Language-Action Model	✓ Link	0.277	0.163	0.462	0.356	0.411	0.545	0.477	0.177	OpenVLA	2024-06-13
Octo: An Open-Source Generalist Robot Policy		0.168	0.170	0.042	0.227	0.012	0.006	0.031	0.011	Octo-Base	2024-05-20
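
The rows above are ordered by the Visual Matching score, and the order changes if Variant Aggregation is used instead. As a minimal sketch of that re-ranking, the Python snippet below copies the two aggregate columns from the table and sorts them by a chosen metric. The names `ROWS` and `rank_by` are hypothetical helpers introduced here for illustration and are not part of the leaderboard site.

```python
# Aggregate scores copied from the leaderboard table:
# (ModelName, Visual Matching, Variant Aggregation)
ROWS = [
    ("SoFar", 0.749, 0.676),
    ("SpatialVLA", 0.719, 0.688),
    ("Dita-300M", 0.687, 0.652),
    ("RT-2-X", 0.606, 0.661),
    ("RoboVLM", 0.563, 0.463),
    ("RT-1-X", 0.534, 0.397),
    ("TraceVLA", 0.460, 0.450),
    ("OpenVLA", 0.277, 0.411),
    ("Octo-Base", 0.168, 0.012),
]

# Map each metric name to its position in a row tuple.
METRIC_INDEX = {"Visual Matching": 1, "Variant Aggregation": 2}


def rank_by(rows, metric):
    """Return model names sorted from highest to lowest score on `metric`."""
    idx = METRIC_INDEX[metric]
    ordered = sorted(rows, key=lambda row: row[idx], reverse=True)
    return [row[0] for row in ordered]


if __name__ == "__main__":
    for metric in METRIC_INDEX:
        print(metric, rank_by(ROWS, metric))
```

Sorting on Variant Aggregation in this sketch places SpatialVLA ahead of SoFar and moves RT-2-X above Dita-300M, so the choice of metric matters when comparing models.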