OpenCodePapers
Science Question Answering
Leaderboard
| Paper | Code | Avg. Accuracy | Natural Science | Social Science | Language Science | Text Context | Image Context | No Context | Grades 1-6 | Grades 7-12 | Model Name | Release Date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Boosting the Power of Small Multimodal Reasoning Models to Match Larger Models with Self-Consistency Training | ✓ | 94.88 | 97.47 | 90.44 | 93.18 | 96.97 | 93.75 | 94.49 | 95.3 | 94.13 | MC-CoT F-Large | 2023-11-23 |
| Honeybee: Locality-enhanced Projector for Multimodal LLM | ✓ | 94.39 | 95.20 | 96.29 | 91.18 | 94.48 | 93.75 | 93.17 | 95.04 | 93.21 | Honeybee | 2023-12-11 |
|  |  | 92.53 |  |  |  |  |  |  |  |  | LLaVA (+ GPT-4) |  |
| Multimodal Chain-of-Thought Reasoning in Language Models | ✓ | 91.68 | 95.91 | 82.00 | 90.82 | 95.26 | 88.80 | 92.89 | 92.44 | 90.31 | Multimodal CoT | 2023-02-02 |
| Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding | ✓ | 90.99 | 90.41 | 95.05 | 88.91 | 89.64 | 88.05 | 90.94 | 91.19 | 90.64 | Chat-UniVi-13B | 2023-11-14 |
| Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering | ✓ | 75.17 | 75.44 | 70.87 | 78.09 | 74.68 | 67.43 | 79.93 | 78.23 | 69.68 | GPT-3 - CoT (QCM→ALE, 2-shot) | 2022-09-20 |
| Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering | ✓ | 74.61 | 76.60 | 65.92 | 77.55 | 75.51 | 66.09 | 79.58 | 78.49 | 67.63 | GPT-3 - CoT (QCM→AE, 2-shot) | 2022-09-20 |
| Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering | ✓ | 74.11 | 71.00 | 76.04 | 78.91 | 66.42 | 66.53 | 81.81 | 77.06 | 68.82 | UnifiedQA-BASE - CoT (QCM→ALE) | 2022-09-20 |
| Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering | ✓ | 73.97 | 74.64 | 69.74 | 76.00 | 74.44 | 67.28 | 77.42 | 76.80 | 68.89 | GPT-3 (QCM→A, 2-shot) | 2022-09-20 |
| Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization | ✓ | 70.0 |  |  |  |  |  |  |  |  | Video-LaVIT | 2024-02-05 |