OpenCodePapers

Dense Video Captioning on YouCook2
Dataset Link
Leaderboard
| Paper | Code | CIDEr | METEOR | SODA | BLEU4 | ROUGE-L | F1 | Precision | Recall | Model Name | Release Date |
|---|---|---|---|---|---|---|---|---|---|---|---|
| HiCM²: Hierarchical Compact Memory Modeling for Dense Video Captioning | ✓ Link | 71.84 | 12.80 | 10.73 | 6.11 | | 32.51 | 32.51 | 32.51 | HiCM² | 2024-12-19 |
| | | 67.2 | 12.3 | 10.3 | | | | | | Vid2Seq (HowTo100M+VidChapters-7M PT) | |
| Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning | ✓ Link | 47.1 | 9.3 | 7.9 | | | | | | Vid2Seq | 2023-02-27 |
| Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval | ✓ Link | 31.66 | 6.08 | 5.34 | 1.63 | | 28.43 | 33.38 | 24.76 | CM² | 2024-04-11 |
| Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos | ✓ Link | 26.52 | 5.01 | 4.91 | | | | | | GVL | 2023-03-11 |
| End-to-End Dense Video Captioning with Parallel Decoding | ✓ Link | 22.71 | 4.74 | 4.42 | 0.8 | | | | | PDVC (TSN features, no SCST) | 2021-08-17 |
| Multimodal Pretraining for Dense Video Captioning | ✓ Link | | | | | 39.03 | | | | E2vidD6-MASSalign-BiD | 2020-11-10 |