OpenCodePapers

Visual Question Answering on CLEVR

Visual Question Answering (VQA)
Leaderboard
Paper | Code | Accuracy (%) | Model Name | Release Date
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding | ✓ | 99.8 | NS-VQA (1K programs) | 2018-10-04
MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding | ✓ | 99.7 | MDETR | 2021-04-26
NeSyCoCo: A Neuro-Symbolic Concept Composer for Compositional Generalization | ✓ | 99.7 | NeSyCoCo | 2024-12-20
Interpretable Visual Reasoning via Induced Symbolic Space | ✓ | 99.4 | OCCAM (ours) | 2020-11-23
Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning | ✓ | 99.1 | TbD + reg + hres | 2018-03-14
The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision | ✓ | 98.9 | NS-CL | 2019-04-26
Compositional Attention Networks for Machine Reasoning | ✓ | 98.9 | MAC | 2018-03-08
Learning Visual Question Answering by Bootstrapping Hard Attention | ✓ | 98.8 | CNN + LSTM + RN + HAN | 2018-08-01
DDRprog: A CLEVR Differentiable Dynamic Reasoning Programmer | | 98.3 | DDRprog* | 2018-03-30
Language-Conditioned Graph Networks for Relational Reasoning | ✓ | 97.9 | single-hop + LCGN (ours) | 2019-05-10
FiLM: Visual Reasoning with a General Conditioning Layer | ✓ | 97.7 | CNN+GRU+FiLM | 2017-09-22
Explainable and Explicit Visual Reasoning over Scene Graphs | ✓ | 97.7 | XNM-Det supervised | 2018-12-05
Inferring and Executing Programs for Visual Reasoning | ✓ | 96.9 | IEP-700K | 2017-05-10
A simple neural network module for relational reasoning | ✓ | 95.50 | CNN + LSTM + RN | 2017-06-05
Question-Guided Hybrid Convolution for Visual Question Answering | | 65.90 | QGHC+Att+Concat | 2018-08-08