OpenCodePapers

Video Question Answering on TVQA

Video Question Answering
Leaderboard
| Paper | Code | Accuracy | Model Name | Release Date |
|---|---|---|---|---|
| Large Language Models are Temporal and Causal Reasoners for Video Question Answering | ✓ Link | 82.2 | LLaMA-VQA | 2023-10-24 |
| Zero-Shot Video Question Answering via Frozen Bidirectional Language Models | ✓ Link | 82 | FrozenBiLM | 2022-06-16 |
| VindLU: A Recipe for Effective Video-and-Language Pretraining | ✓ Link | 79.0 | VindLU | 2022-12-09 |
| iPerceive: Applying Common-Sense Reasoning to Multi-Modal Dense Video Captioning and Video Question Answering | | 76.96 | iPerceive (Chadha et al., 2020) | 2020-11-16 |
| HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training | ✓ Link | 74.24 | Hero w/ pre-training | 2020-05-01 |
| TVQA+: Spatio-Temporal Grounding for Video Question Answering | ✓ Link | 70.50 | STAGE (Lei et al., 2019) | 2019-04-25 |