OpenCodePapers
Dense Video Captioning on ActivityNet
Leaderboard
Paper
Code
METEOR
BLEU-3
BLEU-4
CIDEr
SODA
DIV-1
DIV-2
RE-4
BLEU4
F1
Precision
Recall
Model Name
Release Date
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning
✓ Link
17
28
Vid2Seq
2023-02-27
Global Object Proposals for Improving Multi-Sentence Video Descriptions
✓ Link
16.36
9.45
19.40
0.60
0.78
0.05
ADV-INF + Global
2021-07-18
Team RUC_AIM3 Technical Report at Activitynet 2020 Task 2: Exploring Sequential Events Detection for Dense Video Captioning
11.28
Bi-directional+intra captioning
2020-06-14
Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos
✓ Link
10.03
33.33
7.11
GVL
2023-03-11
Dense-Captioning Events in Videos: SYSU Submission to ActivityNet Challenge 2020
✓ Link
9.71
TSRM-CMG-HRNN+SCST
2020-06-21
End-to-End Dense Video Captioning with Parallel Decoding
✓ Link
9.03
2.17
31.14
6.05
PDVC (TSP features, no SCST)
2021-08-17
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks
✓ Link
8.75
4.16
2.02
TSP
2020-11-23
Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval
✓ Link
8.55
33.01
6.18
2.38
55.21
56.81
53.71
CM²
2024-04-11
A Better Use of Audio-Visual Cues: Dense Video Captioning with Bi-modal Transformer
✓ Link
8.44
3.84
1.88
BMT
2020-05-17
iPerceive: Applying Common-Sense Reasoning to Multi-Modal Dense Video Captioning and Video Question Answering
7.87
2.93
1.29
iPerceive (Chadha et al., 2020)
2020-11-16
Multi-modal Dense Video Captioning
✓ Link
7.31
2.6
1.07
MDVC
2020-03-17
VTimeLLM: Empower LLM to Grasp Video Moments
✓ Link
27.6
5.8
VTimeLLM
2023-11-30