OpenCodePapers

Robot Manipulation on SimplerEnv

Robot Manipulation
Leaderboard
| Paper | Code | Visual Matching | Visual Matching - Pick Coke Can | Visual Matching - Move Near | Visual Matching - Open/Close Drawer | Variant Aggregation | Variant Aggregation - Pick Coke Can | Variant Aggregation - Move Near | Variant Aggregation - Open/Close Drawer | Model Name | Release Date |
|---|---|---|---|---|---|---|---|---|---|---|---|
| SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation | ✓ | 0.749 | 0.923 | 0.917 | 0.403 | 0.676 | 0.907 | 0.740 | 0.297 | SoFar | 2025-02-18 |
| SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model | | 0.719 | 0.810 | 0.696 | 0.593 | 0.688 | 0.895 | 0.717 | 0.362 | SpatialVLA | 2025-01-27 |
| Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy | ✓ | 0.687 | 0.837 | 0.760 | 0.463 | 0.652 | 0.855 | 0.730 | 0.370 | Dita-300M | 2025-03-25 |
| RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control | ✓ | 0.606 | 0.787 | 0.779 | 0.250 | 0.661 | 0.823 | 0.792 | 0.353 | RT-2-X | 2023-07-28 |
| Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models | ✓ | 0.563 | 0.727 | 0.663 | 0.268 | 0.463 | 0.683 | 0.560 | 0.085 | RoboVLM | 2024-12-18 |
| RT-1: Robotics Transformer for Real-World Control at Scale | ✓ | 0.534 | 0.567 | 0.317 | 0.597 | 0.397 | 0.490 | 0.323 | 0.294 | RT-1-X | 2022-12-13 |
| TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies | | 0.460 | 0.560 | 0.600 | 0.240 | 0.450 | 0.600 | 0.564 | 0.310 | TraceVLA | 2024-12-13 |
| OpenVLA: An Open-Source Vision-Language-Action Model | ✓ | 0.277 | 0.163 | 0.462 | 0.356 | 0.411 | 0.545 | 0.477 | 0.177 | OpenVLA | 2024-06-13 |
| Octo: An Open-Source Generalist Robot Policy | | 0.168 | 0.170 | 0.042 | 0.227 | 0.012 | 0.006 | 0.031 | 0.011 | Octo-Base | 2024-05-20 |
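
The rows above are ordered by the overall Visual Matching score, and ordering by Variant Aggregation instead gives a slightly different ranking. The sketch below is illustrative only: it copies the two overall scores for each model from the leaderboard and re-sorts them. The names `ROWS` and `rank_by` are hypothetical helpers introduced here, not part of the leaderboard site or of any benchmark tooling.

```python
# Overall scores copied from the leaderboard rows:
# (model name, Visual Matching, Variant Aggregation)
ROWS = [
    ("SoFar", 0.749, 0.676),
    ("SpatialVLA", 0.719, 0.688),
    ("Dita-300M", 0.687, 0.652),
    ("RT-2-X", 0.606, 0.661),
    ("RoboVLM", 0.563, 0.463),
    ("RT-1-X", 0.534, 0.397),
    ("TraceVLA", 0.460, 0.450),
    ("OpenVLA", 0.277, 0.411),
    ("Octo-Base", 0.168, 0.012),
]


def rank_by(rows, column):
    """Return model names sorted by a score column, best first.

    column 1 is Visual Matching, column 2 is Variant Aggregation.
    """
    ordered = sorted(rows, key=lambda row: row[column], reverse=True)
    return [row[0] for row in ordered]


by_visual_matching = rank_by(ROWS, 1)
by_variant_aggregation = rank_by(ROWS, 2)

# Print the Variant Aggregation ranking with 1-based positions.
for position, name in enumerate(by_variant_aggregation, start=1):
    print(position, name)
```

Under this re-sorting, SpatialVLA moves ahead of SoFar and RT-2-X moves ahead of Dita-300M, which shows how much the ordering depends on the chosen metric.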