OpenCodePapers

visual-question-answering-on-coco-visual-1

Visual Question Answering (VQA)
Leaderboard
Paper | Code | Percentage correct | Model Name | Release Date
--- | --- | --- | --- | ---
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding | ✓ | 70.1 | MCB 7 att. | 2016-06-06
Co-attending Free-form Regions and Detections with Multi-modal Multiplicative Feature Embedding for Visual Question Answering | ✓ | 70.04 | Dual-MFA | 2017-11-18
R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering | ✓ | 69.60 | RelAtt | 2018-05-24
High-Order Attention Models for Visual Question Answering | ✓ | 69.3 | 3-Modalities: Unary + Pairwise + Ternary (ResNet) | 2017-11-12
Training Recurrent Answering Units with Joint Loss Minimization for VQA | | 67.3 | joint-loss | 2016-06-12
Multimodal Residual Learning for Visual QA | ✓ | 66.3 | MRN | 2016-06-05
Hierarchical Question-Image Co-Attention for Visual Question Answering | ✓ | 66.1 | HQI+ResNet | 2016-05-31
A Focused Dynamic Attention Model for Visual Question Answering | | 64.2 | FDA | 2016-04-06
VQA: Visual Question Answering | ✓ | 63.1 | LSTM Q+I | 2015-05-03
Simple Baseline for Visual Question Answering | ✓ | 62.0 | iBOWIMG baseline | 2015-12-07