OpenCodePapers
Zero-Shot Video Question Answer on IntentQA
Task: Video Question Answering
Leaderboard
| Paper | Code | Accuracy (%) | Model Name | Release Date |
|---|---|---|---|---|
| ENTER: Event Based Interpretable Reasoning for VideoQA | | 71.5 | ENTER | 2025-01-24 |
| Too Many Frames, Not All Useful: Efficient Strategies for Long-Form Video QA | ✓ | 71.1 | LVNet | 2024-06-13 |
| TS-LLaVA: Constructing Visual Tokens through Thumbnail-and-Sampling for Training-Free Video Large Language Models | ✓ | 67.9 | TS-LLaVA-34B | 2024-11-17 |
| VidCtx: Context-aware Video Question Answering with Image Models | ✓ | 67.1 | VidCtx (7B) | 2024-12-23 |
| VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos | ✓ | 66.9 | VideoTree (GPT4) | 2024-05-29 |
| An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM | ✓ | 65.3 | IG-VLM | 2024-03-27 |
| A Simple LLM Framework for Long-Range Video Question-Answering | ✓ | 64.0 | LLoVi (GPT-4) | 2023-12-28 |
| Self-Chained Image-Language Model for Video Localization and Question Answering | ✓ | 60.9 | SeViLA (4B) | 2023-05-11 |
| SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models | ✓ | 60.1 | SlowFast-LLaVA-34B | 2024-07-22 |
| Language Repository for Long Video Understanding | ✓ | 59.1 | LangRepo (12B) | 2024-03-21 |
| A Simple LLM Framework for Long-Range Video Question-Answering | ✓ | 53.6 | LLoVi (7B) | 2023-12-28 |
| Mistral 7B | ✓ | 50.4 | Mistral (7B) | 2023-10-10 |
| | | 20.0 | Random | |