OpenCodePapers

image-captioning-on-nocaps-xd-in-domain

Image Captioning


Leaderboard

Paper	Code	CIDEr	BLEU-1	BLEU-2	BLEU-3	BLEU-4	ROUGE-L	METEOR	SPICE	ModelName	ReleaseDate
GIT: A Generative Image-to-text Transformer for Vision and Language	✓ Link	124.18	88.86	75.86	59.94	41.1	63.82	33.83	16.36	GIT2	2022-05-27
GIT: A Generative Image-to-text Transformer for Vision and Language	✓ Link	122.4	88.55	76.1	60.53	41.65	64.02	33.41	16.18	GIT	2022-05-27
-		106.36	85.33	70.44	52.99	34.02	60.67	31.18	15.51	VLAF2
VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning		100.62	82.94	67.56	49.66	32.07	59.43	30.62	14.7	Microsoft Cognitive Services team	2020-09-28
-		90.73	81.84	64.09	44.03	25.66	55.41	28.39	13.5	test_cbs2
-		82.86	79.14	62.18	43.04	25.67	55.37	26.82	11.9	icp2ssi1_coco_si_0.02_5_test
-		80.61	76.89	57.3	37.78	21.49	53.47	28.53	14.99	Human
-		76.02	77.65	59.58	39.86	22.83	53.98	26.35	11.8	UpDown + ELMo + CBS
-		74.27	77.68	60.34	41.5	24.57	54.42	26.04	11.47	UpDown
-		62.96	76.49	56.2	33.73	15.14	50.84	23.68	10.12	Neural Baby Talk + CBS
-		60.89	75.91	56.78	35.58	17.39	51.42	23.8	9.81	Neural Baby Talk