OpenCodePapers

visual-question-answering-on-coco-visual-4

Visual Question Answering (VQA)
Leaderboard
| Paper | Code | Percentage correct | Model name | Release date |
|---|---|---|---|---|
| Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding | ✓ | 66.5 | MCB 7 att. | 2016-06-06 |
| Co-attending Free-form Regions and Detections with Multi-modal Multiplicative Feature Embedding for Visual Question Answering | ✓ | 66.09 | Dual-MFA | 2017-11-18 |
| Question-Guided Hybrid Convolution for Visual Question Answering | | 65.90 | QGHC+Att+Concat | 2018-08-08 |
| R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering | ✓ | 65.69 | RelAtt | 2018-05-24 |
| Training Recurrent Answering Units with Joint Loss Minimization for VQA | | 63.2 | joint-loss | 2016-06-12 |
| Hierarchical Question-Image Co-Attention for Visual Question Answering | ✓ | 62.1 | HQI+ResNet | 2016-05-31 |
| Multimodal Residual Learning for Visual QA | ✓ | 61.8 | MRN + global features | 2016-06-05 |
| Dynamic Memory Networks for Visual and Textual Question Answering | ✓ | 60.4 | DMN+ [xiong2016dynamic] | 2016-03-04 |
| Image Captioning and Visual Question Answering Based on Attributes and External Knowledge | | 59.5 | CNN-RNN | 2016-03-09 |
| A Focused Dynamic Attention Model for Visual Question Answering | | 59.5 | FDA | 2016-04-06 |
| Stacked Attention Networks for Image Question Answering | ✓ | 58.9 | SAN | 2015-11-07 |
| VQA: Visual Question Answering | ✓ | 58.2 | LSTM Q+I | 2015-05-03 |
| Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering | ✓ | 58.2 | SMem-VQA | 2015-11-17 |
| Simple Baseline for Visual Question Answering | ✓ | 55.9 | iBOWIMG baseline | 2015-12-07 |