Paper | Code | Accuracy (%) | Model Name | Release Date |
---|---|---|---|---|
ProTo: Program-Guided Transformer for Program-Guided Tasks | ✓ Link | 65.14 | ProTo | 2021-10-02 |
Learning by Abstraction: The Neural State Machine | ✓ Link | 63.17 | NSM | 2019-07-09 |
MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding | ✓ Link | 62.45 | MDETR-ENB5 | 2021-04-26 |
LXMERT: Learning Cross-Modality Encoder Representations from Transformers | ✓ Link | 60.3 | LXMERT | 2019-08-20 |
Language-Conditioned Graph Networks for Relational Reasoning | ✓ Link | 56.1 | single-hop + LCGN (ours) | 2019-05-10 |
GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering | ✓ Link | 54.06 | MAC | 2019-02-25 |
GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering | ✓ Link | 46.55 | CNN+LSTM | 2019-02-25 |