OpenCodePapers

Zero-Shot Video Retrieval on MSVD
Leaderboard
| Paper | Code | text-to-video R@1 | text-to-video R@5 | text-to-video R@10 | text-to-video Median Rank | text-to-video Mean Rank | video-to-text R@1 | video-to-text R@5 | video-to-text R@10 | video-to-text Median Rank | Model Name | Release Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | ✓ | 59.3 | 84.4 | 89.6 | - | - | 83.1 | 94.2 | 97.0 | - | InternVideo2-6B | 2024-03-22 |
| InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | ✓ | 58.1 | 83.0 | 88.4 | - | - | 83.3 | 94.3 | 96.9 | - | InternVideo2-1B | 2024-03-22 |
| HowToCaption: Prompting LLMs to Transform Video Annotations at Scale | ✓ | 54.8 | 80.9 | 87.2 | 1 | - | - | - | - | - | VAST, HowToCaption-finetuned | 2023-10-07 |
| LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment | ✓ | 54.1 | 81.1 | 88.1 | 1.0 | - | 69.7 | 91.8 | 97.9 | 1.0 | LanguageBind(ViT-L/14) | 2023-10-03 |
| LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment | ✓ | 53.9 | 80.4 | 87.8 | 1 | - | 72.0 | 91.4 | 96.3 | 1 | LanguageBind(ViT-H/14) | 2023-10-03 |
| vid-TLDR: Training Free Token merging for Light-weight Video Transformer | ✓ | 50.0 | 77.6 | 85.5 | - | - | 75.7 | 90.0 | 95.1 | - | vid-TLDR (UMT-L) | 2024-03-20 |
| Unmasked Teacher: Towards Training-Efficient Video Foundation Models | ✓ | 49.0 | 76.9 | 84.7 | - | - | 74.5 | 89.7 | 92.8 | - | UMT-L (ViT-L/16) | 2023-03-28 |
| HowToCaption: Prompting LLMs to Transform Video Annotations at Scale | ✓ | 44.5 | 73.3 | 82.1 | 2 | - | - | - | - | - | HowToCaption | 2023-10-07 |
| MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval | ✓ | 44.4 | 76.2 | 87.0 | 2.0 | - | - | - | - | - | MILES | 2022-04-26 |
| Bridging Video-text Retrieval with Multiple Choice Questions | ✓ | 43.6 | 74.9 | 84.9 | 2.0 | - | - | - | - | - | Y. Ge et. al. | 2022-01-13 |
| InternVideo: General Video Foundation Models via Generative and Discriminative Learning | ✓ | 43.4 | - | - | - | - | 67.6 | - | - | - | InternVideo | 2022-12-06 |
| CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval | ✓ | 38.5 | 66.9 | 76.8 | 2 | 17.8 | - | - | - | - | CLIP4Clip | 2021-04-18 |
| LaT: Latent Translation with Cycle-Consistency for Video-Text Retrieval | - | 36.9 | 68.6 | 81.0 | 2 | - | 34.4 | 69.0 | 79.2 | 3 | LaT | 2022-07-11 |
| Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning | ✓ | 13.66 | 35.7 | 47.74 | - | - | - | - | - | - | SSML | 2020-03-06 |
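
The leaderboard columns (R@1, R@5, R@10, Median Rank, Mean Rank) are standard retrieval metrics computed from the rank at which each query's correct item is retrieved. The sketch below illustrates how such numbers are typically derived from a query-by-candidate similarity matrix. It is not taken from any listed paper: the function names (`retrieval_ranks`, `recall_at_k`, `summarize`) and the toy matrix are hypothetical, and it assumes that query i matches candidate i, which simplifies away the several captions per video found in some benchmarks.

```python
from statistics import median


def retrieval_ranks(similarity):
    """Return the 1-based rank of the correct candidate for each query.

    Assumption: row i is query i and column i is its single correct match.
    """
    ranks = []
    for i, row in enumerate(similarity):
        target = row[i]
        # Rank = 1 + number of candidates scoring strictly higher than the match.
        ranks.append(1 + sum(1 for score in row if score > target))
    return ranks


def recall_at_k(ranks, k):
    """Percentage of queries whose correct candidate is ranked in the top k."""
    return 100.0 * sum(1 for r in ranks if r <= k) / len(ranks)


def summarize(similarity, ks=(1, 5, 10)):
    """Compute R@K for each k, plus median and mean rank (lower is better)."""
    ranks = retrieval_ranks(similarity)
    report = {f"R@{k}": recall_at_k(ranks, k) for k in ks}
    report["Median Rank"] = median(ranks)
    report["Mean Rank"] = sum(ranks) / len(ranks)
    return report


# Toy 3x3 text-to-video similarity matrix (made-up values for illustration).
toy = [
    [0.9, 0.1, 0.2],  # query 0: correct video ranked 1st
    [0.8, 0.5, 0.3],  # query 1: correct video ranked 2nd
    [0.2, 0.1, 0.7],  # query 2: correct video ranked 1st
]
print(summarize(toy, ks=(1, 2)))
```

Transposing the similarity matrix gives the video-to-text direction under the same one-to-one assumption. Higher is better for R@K, while lower is better for the median and mean rank.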