Paper | Code | SODA | CIDEr | METEOR | Model Name | Release Date |
---|---|---|---|---|---|---|
N/A | N/A | 0.151 | 50.9 | 9.5 | Vid2Seq (VidChapters-7M PT) | N/A |
HiCM$^2$: Hierarchical Compact Memory Modeling for Dense Video Captioning | ✓ Link | 0.150 | 51.2 | 9.6 | HiCM² | 2024-12-19 |
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning | ✓ Link | 0.135 | 43.5 | 8.5 | Vid2Seq | 2023-02-27 |
End-to-end Dense Video Captioning as Sequence Generation | N/A | N/A | 25.0 | 8.1 | E2ESG | 2022-04-18 |