OpenCodePapers

Image Captioning on nocaps-XD (near-domain)

Image Captioning
Leaderboard
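| Paper | Code | CIDEr | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 | ROUGE-L | METEOR | SPICE | Model name | Release date |
|---|---|---|---|---|---|---|---|---|---|---|---|
| GIT: A Generative Image-to-text Transformer for Vision and Language | ✓ | 125.51 | 88.9 | 75.86 | 58.9 | 38.95 | 63.66 | 32.95 | 16.11 | GIT2 | 2022-05-27 |
| GIT: A Generative Image-to-text Transformer for Vision and Language | ✓ | 123.92 | 88.56 | 75.48 | 58.46 | 38.44 | 63.5 | 32.86 | 15.96 | GIT | 2022-05-27 |
| - | - | 104.76 | 84.45 | 69.28 | 51.1 | 31.48 | 59.75 | 30.31 | 14.97 | VLAF2 | - |
| VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning | - | 101.2 | 82.88 | 67.01 | 48.73 | 30.21 | 58.76 | 30.0 | 14.27 | Microsoft Cognitive Services team | 2020-09-28 |
| - | - | 85.81 | 79.88 | 61.31 | 40.26 | 21.84 | 53.98 | 27.0 | 13.01 | test_cbs2 | - |
| - | - | 85.73 | 79.51 | 62.65 | 43.22 | 24.97 | 55.13 | 26.37 | 11.96 | icp2ssi1_coco_si_0.02_5_test | - |
| - | - | 84.58 | 77.05 | 56.97 | 36.84 | 19.85 | 53.06 | 28.42 | 14.72 | Human | - |
| - | - | 74.2 | 77.68 | 58.31 | 37.04 | 19.85 | 52.64 | 24.97 | 11.45 | UpDown + ELMo + CBS | - |
| - | - | 61.98 | 74.77 | 53.67 | 30.66 | 13.85 | 49.45 | 22.55 | 9.83 | Neural Baby Talk + CBS | - |
| - | - | 56.85 | 75.25 | 56.93 | 36.91 | 20.49 | 51.84 | 23.6 | 10.33 | UpDown | - |
| - | - | 53.21 | 73.69 | 54.1 | 32.37 | 15.99 | 49.63 | 21.93 | 9.26 | Neural Baby Talk | - |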