OpenCodePapers

Image Captioning on nocaps-XD out-of-domain

Image Captioning

Leaderboard

Paper	Code	CIDEr	BLEU-1	BLEU-2	BLEU-3	BLEU-4	ROUGE-L	METEOR	SPICE	Model name	Release date
GIT: A Generative Image-to-text Transformer for Vision and Language	✓ Link	122.27	86.28	71.15	52.36	30.15	60.91	30.15	15.62	GIT2	2022-05-27
GIT: A Generative Image-to-text Transformer for Vision and Language	✓ Link	122.04	85.99	71.28	52.66	30.04	60.96	30.45	15.7	GIT	2022-05-27
VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning		95.5	79.44	61.15	41.03	21.79	55.49	26.56	12.66	Microsoft Cognitive Services team	2020-09-28
-		91.62	74.84	53.9	33.51	16.6	51.5	26.83	14.21	Human
-		90.34	79.59	61.04	40.09	19.61	54.86	26.14	13.11	VLAF2
-		85.28	75.59	56.71	35.63	17.72	51.92	23.77	11.28	icp2ssi1_coco_si_0.02_5_test
-		77.94	74.5	53.63	30.91	13.41	49.66	23.47	11.07	test_cbs2
-		66.67	71.57	48.58	25.77	9.68	47.13	20.88	9.74	UpDown + ELMo + CBS
-		58.48	65.98	43.2	21.16	7.5	44.47	19.04	8.77	Neural Baby Talk + CBS
-		48.73	64.45	42.8	21.48	7.92	44.11	18.31	8.2	Neural Baby Talk
-		30.09	66.54	44.28	24.23	10.17	44.84	18.29	8.08	UpDown