OpenCodePapers
Video Captioning on VATEX
Results over time
Leaderboard
| Paper | Code | BLEU-4 | CIDEr | METEOR | ROUGE-L | Model | Release Date |
|---|---|---|---|---|---|---|---|
| VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset | ✓ | 45.6 | 95.8 | 29.4 | 57.4 | VALOR | 2023-04-17 |
| VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset | ✓ | 45.0 | 99.5 | – | – | VAST | 2023-05-29 |
| COSA: Concatenated Sample Pretrained Vision-Language Foundation Model | ✓ | 43.7 | 96.5 | – | – | COSA | 2023-06-15 |
| VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners | – | 39.7 | 77.8 | – | 54.5 | VideoCoCa | 2022-12-09 |
| IcoCap: Improving Video Captioning by Compounding Images | – | 37.4 | 67.8 | 25.7 | 53.1 | IcoCap (ViT-B/16) | 2023-10-05 |
| IcoCap: Improving Video Captioning by Compounding Images | – | 36.9 | 63.4 | 24.6 | 52.5 | IcoCap (ViT-B/32) | 2023-10-05 |
| Diverse Video Captioning by Adaptive Spatio-temporal Attention | ✓ | 36.25 | 65.07 | 25.32 | 51.88 | VASTA (Kinetics-backbone) | 2022-08-19 |
| Accurate and Fast Compressed Video Captioning | ✓ | 35.8 | 64.8 | 25.3 | 52.0 | CoCap (ViT/L14) | 2023-09-22 |
| Object Relational Graph with Teacher-Recommended Learning for Video Captioning | – | 32.1 | 49.7 | 22.2 | 48.9 | ORG-TRL | 2020-02-26 |
| NITS-VC System for VATEX Video Captioning Challenge 2020 | – | 20.0 | 24.0 | 18.0 | 42.0 | NITS-VC | 2020-06-07 |
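
The BLEU-4, CIDEr, METEOR, and ROUGE-L columns follow the standard COCO caption evaluation protocol, with scores scaled by 100 for reporting. Below is a minimal sketch of how such scores can be computed with the pycocoevalcap toolkit; the video id and captions are made-up placeholders, and captions are assumed to be already lower-cased and tokenized (METEOR additionally requires a Java runtime).

```python
# Minimal sketch: scoring candidate captions against reference captions with
# pycocoevalcap (the COCO caption evaluation toolkit).
from pycocoevalcap.bleu.bleu import Bleu
from pycocoevalcap.cider.cider import Cider
from pycocoevalcap.meteor.meteor import Meteor
from pycocoevalcap.rouge.rouge import Rouge

# Hypothetical data: video id -> list of reference captions / single candidate.
refs = {
    "video_0001": ["a man is slicing a tomato on a cutting board",
                   "someone cuts a tomato in a kitchen"],
}
hyps = {
    "video_0001": ["a person slices a tomato on a board"],
}

scorers = [
    (Bleu(4), ["BLEU-1", "BLEU-2", "BLEU-3", "BLEU-4"]),
    (Meteor(), "METEOR"),
    (Rouge(), "ROUGE-L"),
    (Cider(), "CIDEr"),
]

for scorer, name in scorers:
    score, _ = scorer.compute_score(refs, hyps)
    if isinstance(name, list):  # Bleu returns one score per n-gram order
        for n, s in zip(name, score):
            print(f"{n}: {s:.3f}")
    else:
        print(f"{name}: {score:.3f}")
```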