OpenCodePapers

Video Captioning on MSVD

Leaderboard

Paper	Code	CIDEr	BLEU-4	METEOR	ROUGE-L	GS	Model Name	Release Date
MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks	✓ Link	195.6					MaMMUT	2023-03-29
VLAB: Enhancing Video Language Pre-training by Feature Adapting and Blending		179.8	79.3	51.2	87.9		VLAB	2023-05-22
VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset	✓ Link	178.5	80.7	51.0	87.9		VALOR	2023-04-17
COSA: Concatenated Sample Pretrained Vision-Language Foundation Model	✓ Link	178.5	76.5				COSA	2023-06-15
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video	✓ Link	165.8	70.5	48.4	85.3		mPLUG-2	2023-02-01
HowToCaption: Prompting LLMs to Transform Video Annotations at Scale	✓ Link	154.2	70.4	46.4	83.2		HowToCaption	2023-10-07
HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training		146.9	71.0	45.3	81.4		HiTeA	2022-12-30
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning	✓ Link	146.2		45.3			Vid2Seq	2023-02-27
An Empirical Study of End-to-End Video-Language Transformers with Masked Visual Modeling	✓ Link	139.2					VIOLETv2	2022-09-04
RTQ: Rethinking Video-language Understanding Based on Image-text Model	✓ Link	123.4	66.9		82.2		RTQ	2023-12-01
Accurate and Fast Compressed Video Captioning	✓ Link	121.5	60.1	41.4	78.2		CoCap (ViT/L14)	2023-09-22
Diverse Video Captioning by Adaptive Spatio-temporal Attention	✓ Link	119.7	59.2	40.65	76.7		VASTA (Vatex-backbone)	2022-08-19
IcoCap: Improving Video Captioning by Compounding Images		110.3	59.1	39.5	76.5		IcoCap (ViT-B/16)	2023-10-05
SEM-POS: Grammatically and Semantically Correct Video Captioning		108.3	60.1	38.5	76.0	607.1	SEM-POS	2023-03-26
Diverse Video Captioning by Adaptive Spatio-temporal Attention	✓ Link	106.4	56.1	39.1	74.5		VASTA (Kinetics-backbone)	2022-08-19
IcoCap: Improving Video Captioning by Compounding Images		103.8	56.3	38.9	75.0		IcoCap (ViT-B/32)	2023-10-05