OpenCodePapers
fs-mevqa-on-sme
Explanatory Visual Question Answering
FS-MEVQA
Leaderboard
| Paper | Code | BLEU-4 | METEOR | ROUGE-L | CIDEr | SPICE | Detection | ACC | #Learning Samples (N) | Model Name | Release Date |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Few-Shot Multimodal Explanation for Visual Question Answering | ✓ | 67.91 | 50.55 | 79.41 | 510.44 | 64.09 | 29.09 | 51.45 | 16 | MEAgent | 2024-10-28 |
| GPT-4 Technical Report | ✓ | 45.51 | 35.17 | 52.67 | 269.68 | 37.67 | 7.00 | 42.30 | 16 | GPT-4-1106-Vision-Preview | 2023-03-15 |
| Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context | ✓ | 41.87 | 34.61 | 55.90 | 276.14 | 40.58 | 1.40 | 40.88 | 16 | Gemini-1.5 Pro | 2024-03-08 |
| Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond | ✓ | 24.30 | 23.40 | 34.52 | 201.47 | 26.13 | 1.05 | 40.33 | 16 | Qwen-VL-Max | 2023-08-24 |
| CogVLM: Visual Expert for Pretrained Language Models | ✓ | 14.45 | 17.53 | 24.28 | 127.37 | 17.70 | 0.89 | 34.23 | 16 | GLM-4V | 2023-11-06 |
| Variational Causal Inference Network for Explanatory Visual Question Answering | ✓ | 9.17 | 19.82 | 33.34 | 4.28 | 13.39 | 0.28 | 17.77 | 16 | VCIN | 2023-01-01 |
| REX: Reasoning-aware and Grounded Explanation | ✓ | 0.00 | 4.37 | 23.23 | 0.89 | 0.00 | 0.00 | 17.77 | 16 | REX | 2022-03-11 |