OpenCodePapers

Dense Video Captioning on YouCook2

Dense Video Captioning

Leaderboard

Paper	Code	CIDEr	METEOR	SODA	BLEU4	ROUGE-L	F1	Precision	Recall	ModelName	ReleaseDate
HiCM$^2$: Hierarchical Compact Memory Modeling for Dense Video Captioning	✓ Link	71.84	12.80	10.73	6.11		32.51	32.51	32.51	HiCM²	2024-12-19
		67.2	12.3	10.3						Vid2Seq (HowTo100M+VidChapters-7M PT)
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning	✓ Link	47.1	9.3	7.9						Vid2Seq	2023-02-27
Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval	✓ Link	31.66	6.08	5.34	1.63		28.43	33.38	24.76	CM²	2024-04-11
Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos	✓ Link	26.52	5.01	4.91						GVL	2023-03-11
End-to-End Dense Video Captioning with Parallel Decoding	✓ Link	22.71	4.74	4.42	0.8					PDVC (TSN features, no SCST)	2021-08-17
Multimodal Pretraining for Dense Video Captioning	✓ Link					39.03				E2vidD6-MASSalign-BiD	2020-11-10