Paper | Code | Recall | Model Name | Release Date |
---|---|---|---|---|
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding | ✓ Link | 47.3 | VideoCLIP | 2021-09-28 |
VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding | ✓ Link | 46.5 | VLM | 2021-05-20 |
TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment | | 42.5 | TACo | 2021-08-23 |
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips | ✓ Link | 33.6 | Text-Video Embedding | 2019-06-07 |
Cross-task weakly supervised learning from instructional videos | ✓ Link | 31.6 | Fully-supervised upper-bound | 2019-03-19 |
Cross-task weakly supervised learning from instructional videos | ✓ Link | 22.4 | Zhukov | 2019-03-19 |
Unsupervised Learning from Narrated Instruction Videos | | 13.3 | Alayrac | 2015-06-30 |