OpenCodePapers
visual-question-answering-on-coco-visual-4
Visual Question Answering (VQA)
Leaderboard
| Paper | Code | Percentage correct | Model name | Release date |
| --- | --- | --- | --- | --- |
| Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding | ✓ | 66.5 | MCB 7 att. | 2016-06-06 |
| Co-attending Free-form Regions and Detections with Multi-modal Multiplicative Feature Embedding for Visual Question Answering | ✓ | 66.09 | Dual-MFA | 2017-11-18 |
| Question-Guided Hybrid Convolution for Visual Question Answering | | 65.90 | QGHC+Att+Concat | 2018-08-08 |
| R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering | ✓ | 65.69 | RelAtt | 2018-05-24 |
| Training Recurrent Answering Units with Joint Loss Minimization for VQA | | 63.2 | joint-loss | 2016-06-12 |
| Hierarchical Question-Image Co-Attention for Visual Question Answering | ✓ | 62.1 | HQI+ResNet | 2016-05-31 |
| Multimodal Residual Learning for Visual QA | ✓ | 61.8 | MRN + global features | 2016-06-05 |
| Dynamic Memory Networks for Visual and Textual Question Answering | ✓ | 60.4 | DMN+ | 2016-03-04 |
| Image Captioning and Visual Question Answering Based on Attributes and External Knowledge | | 59.5 | CNN-RNN | 2016-03-09 |
| A Focused Dynamic Attention Model for Visual Question Answering | | 59.5 | FDA | 2016-04-06 |
| Stacked Attention Networks for Image Question Answering | ✓ | 58.9 | SAN | 2015-11-07 |
| VQA: Visual Question Answering | ✓ | 58.2 | LSTM Q+I | 2015-05-03 |
| Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering | ✓ | 58.2 | SMem-VQA | 2015-11-17 |
| Simple Baseline for Visual Question Answering | ✓ | 55.9 | iBOWIMG baseline | 2015-12-07 |