OpenCodePapers

Spatial Reasoning on EmbSpatial-Bench

Visual Question Answering, Spatial Reasoning
Leaderboard
Paper | Code | Generation | Model Name | Release Date
SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation | ✓ Link | 70.88 | SoFar | 2025-02-18
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond | ✓ Link | 49.11 | Qwen-VL-Max | 2023-08-24
GPT-4 Technical Report | ✓ Link | 36.07 | GPT-4V | 2023-03-15
Visual Instruction Tuning | ✓ Link | 35.19 | LLaVA-1.6 | 2023-04-17
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models | ✓ Link | 23.54 | MiniGPT4 | 2023-04-20