OpenCodePapers

Explanatory Visual Question Answering: FS-MEVQA on SME

Leaderboard

Paper	Code	BLEU-4	METEOR	ROUGE-L	CIDEr	SPICE	Detection	ACC	# Learning Samples (N)	Model Name	Release Date
Few-Shot Multimodal Explanation for Visual Question Answering	✓ Link	67.91	50.55	79.41	510.44	64.09	29.09	51.45	16	MEAgent	2024-10-28
GPT-4 Technical Report	✓ Link	45.51	35.17	52.67	269.68	37.67	7.00	42.30	16	GPT-4-1106-Vision-Preview	2023-03-15
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context	✓ Link	41.87	34.61	55.90	276.14	40.58	1.40	40.88	16	Gemini-1.5 Pro	2024-03-08
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond	✓ Link	24.30	23.40	34.52	201.47	26.13	1.05	40.33	16	Qwen-VL-Max	2023-08-24
CogVLM: Visual Expert for Pretrained Language Models	✓ Link	14.45	17.53	24.28	127.37	17.70	0.89	34.23	16	GLM-4V	2023-11-06
Variational Causal Inference Network for Explanatory Visual Question Answering	✓ Link	9.17	19.82	33.34	4.28	13.39	0.28	17.77	16	VCIN	2023-01-01
REX: Reasoning-aware and Grounded Explanation	✓ Link	0.00	4.37	23.23	0.89	0.00	0.00	17.77	16	REX	2022-03-11