Paper | Code | R@1,IoU=0.7 | R@1,IoU=0.5 | Model Name | Release Date |
---|---|---|---|---|---|
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | ✓ Link | 56.45 | 71.42 | InternVideo2-6B | 2024-03-22 |
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | ✓ Link | 54.45 | 70.00 | InternVideo2-1B | 2024-03-22 |
Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval | ✓ Link | 49.94 | 66.73 | LLMEPET | 2024-07-21 |
Query-Dependent Video Representation for Moment Retrieval and Highlight Detection | ✓ Link | 44.98 | 62.40 | QD-DETR | 2023-03-24 |
DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection | | 44.49 | 61.61 | DiffusionVMR | 2023-08-29 |
UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection | ✓ Link | 41.18 | 56.23 | UMT | 2022-03-23 |
Detecting Moments and Highlights in Videos via Natural Language Queries | ✓ Link | 33.02 | 52.89 | Moment-DETR | 2021-12-01 |
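
The two metric columns report R@1 at a temporal IoU threshold: the percentage of queries whose top-ranked predicted moment overlaps the ground-truth moment with IoU at or above 0.5 or 0.7. The sketch below illustrates that computation under stated assumptions; it is not any benchmark's official evaluation script. The function names `temporal_iou` and `recall_at_1` are hypothetical, each query is assumed to have a single ground-truth interval, and the comparison is assumed to be inclusive (`>=`). The sample intervals are made up for illustration and are not taken from the table.

```python
def temporal_iou(pred, gt):
    """Temporal IoU of two (start, end) intervals, e.g. in seconds."""
    intersection = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - intersection
    return intersection / union if union > 0 else 0.0


def recall_at_1(top1_preds, gts, iou_threshold):
    """Percentage of queries whose top-1 prediction reaches the IoU threshold.

    Assumption: one ground-truth interval per query, inclusive threshold.
    """
    hits = sum(
        temporal_iou(pred, gt) >= iou_threshold
        for pred, gt in zip(top1_preds, gts)
    )
    return 100.0 * hits / len(gts)


# Illustrative data only: three queries, one top-1 prediction each.
top1_preds = [(0.0, 10.0), (5.0, 15.0), (20.0, 30.0)]
gts = [(0.0, 10.0), (10.0, 20.0), (24.0, 30.0)]

r1_iou05 = recall_at_1(top1_preds, gts, 0.5)
r1_iou07 = recall_at_1(top1_preds, gts, 0.7)
print(f"R@1,IoU=0.5: {r1_iou05:.2f}")
print(f"R@1,IoU=0.7: {r1_iou07:.2f}")
```

Because the 0.7 threshold is stricter, a model's R@1,IoU=0.7 can never exceed its R@1,IoU=0.5, which matches every row in the table.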