OpenCodePapers
Zero-Shot Video Retrieval on YouCook2
Leaderboard
| Paper | Code | text-to-video R@1 | text-to-video R@5 | text-to-video R@10 | text-to-video Mean Rank | text-to-video Median Rank | Model Name | Release Date |
|---|---|---|---|---|---|---|---|---|
| OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning | - | 26.1 | 54.1 | 70.8 | - | - | OmniVec2 | 2024-01-01 |
| Multi-granularity Correspondence Learning from Long-term Noisy Videos | ✓ | 24.2 | 51.9 | 64.1 | - | - | Norton | 2024-01-30 |
| VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding | ✓ | 22.7 | 50.4 | 63.1 | - | - | VideoCLIP | 2021-09-28 |
| VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners | - | 20.3 | 43.0 | 53.3 | - | - | VideoCoCa | 2022-12-09 |
| TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment | - | 19.9 | 43.2 | 55.7 | - | 8 | TACo | 2021-08-23 |
| HowToCaption: Prompting LLMs to Transform Video Annotations at Scale | ✓ | 19.7 | 43.6 | 53.9 | - | 8 | VAST, HowToCaption-finetuned | 2023-10-07 |
| End-to-End Learning of Visual Representations from Uncurated Instructional Videos | ✓ | 15.1 | 38.0 | 51.2 | - | 10 | MIL-NCE | 2019-12-13 |
| HowToCaption: Prompting LLMs to Transform Video Annotations at Scale | ✓ | 13.4 | 33.1 | 44.1 | - | 15 | HowToCaption | 2023-10-07 |
| VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text | ✓ | - | - | 45.5 | - | 13 | VATT-MBS | 2021-04-22 |
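
The leaderboard metrics (R@1, R@5, R@10, Mean Rank, Median Rank) are typically derived from a text-to-video similarity matrix. The following is a minimal sketch of that computation, not the evaluation code of any listed paper. It assumes that text query i is paired with video i (ground truth on the diagonal) and that ties are resolved in favor of the correct video. The names `retrieval_metrics`, `sim`, and `scores` are illustrative.

```python
import numpy as np


def retrieval_metrics(sim, ks=(1, 5, 10)):
    """Compute text-to-video retrieval metrics.

    sim[i, j] is the similarity between text query i and video j.
    Assumption: the correct video for query i is video i.
    """
    sim = np.asarray(sim, dtype=float)
    # Score of the correct video for each query, as a column vector.
    correct = np.diag(sim)[:, None]
    # Rank = 1 + number of videos scored strictly higher than the correct one.
    ranks = 1 + (sim > correct).sum(axis=1)
    # R@K: percentage of queries whose correct video is ranked in the top K.
    metrics = {f"R@{k}": 100.0 * float(np.mean(ranks <= k)) for k in ks}
    metrics["Mean Rank"] = float(np.mean(ranks))
    metrics["Median Rank"] = float(np.median(ranks))
    return metrics


# Toy example with 3 queries and 3 videos (made-up scores).
scores = [
    [0.9, 0.1, 0.2],  # correct video ranked 1st
    [0.8, 0.3, 0.5],  # correct video ranked 3rd
    [0.1, 0.2, 0.7],  # correct video ranked 1st
]
m = retrieval_metrics(scores)
print(m)
```

Higher is better for R@K, and lower is better for Mean Rank and Median Rank. Reported numbers can also differ across papers because of evaluation details such as clip sampling and the size of the candidate pool.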