OpenCodePapers

Visual Question Answering (VQA) on AI2D

Visual Question Answering (VQA)
Leaderboard
| Paper | Code | EM | Model Name | Release Date |
|---|---|---|---|---|
| Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts | | 82.5 | SMoLA-PaLI-X Specialist Model | 2023-12-01 |
| Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts | | 81.4 | SMoLA-PaLI-X Generalist Model | 2023-12-01 |
| Gemini: A Family of Highly Capable Multimodal Models | ✓ Link | 79.5 | Gemini Ultra | 2023-12-19 |
| DUBLIN -- Document Understanding By Language-Image Network | | 51.11 | DUBLIN | 2023-05-23 |