OpenCodePapers

video-question-answering-on-situated

Video Question Answering

Leaderboard

Paper	Code	Average Accuracy	Model Name	Release Date
ViLA: Efficient Video-Language Alignment for Video Question Answering	✓ Link	67.1	VLAP (4 frames)	2023-12-13
Large Language Models are Temporal and Causal Reasoners for Video Question Answering	✓ Link	65.4	LLaMA-VQA	2023-10-24
Self-Chained Image-Language Model for Video Localization and Question Answering	✓ Link	64.9	SeViLA	2023-05-11
InternVideo: General Video Foundation Models via Generative and Discriminative Learning	✓ Link	58.7	InternVideo	2022-12-06
Glance and Focus: Memory Prompting for Multi-Event Video Question Answering	✓ Link	53.94	GF(sup)	2024-01-03
Glance and Focus: Memory Prompting for Multi-Event Video Question Answering	✓ Link	53.86	GF(uns)	2024-01-03
MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering	✓ Link	51.13	MIST	2022-12-19
Revisiting the "Video" in Video-Language Understanding	✓ Link	48.37	Temp[ATP]	2022-06-03
AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model	✓ Link	48.2	AnyMAL-70B (0-shot)	2023-09-27
All in One: Exploring Unified Video-Language Pre-training	✓ Link	47.5	All-in-one	2022-03-14
TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering	✓ Link	44.9	TraveLER (0-shot)	2024-04-01
Self-Chained Image-Language Model for Video Localization and Question Answering	✓ Link	44.6	SeViLA (0-shot)	2023-05-11
Flamingo: a Visual Language Model for Few-Shot Learning	✓ Link	42.8	Flamingo-9B (4-shot)	2022-04-29
Flamingo: a Visual Language Model for Few-Shot Learning	✓ Link	42.4	Flamingo-80B (4-shot)	2022-04-29
Flamingo: a Visual Language Model for Few-Shot Learning	✓ Link	41.8	Flamingo-9B (0-shot)	2022-04-29
Flamingo: a Visual Language Model for Few-Shot Learning	✓ Link	39.7	Flamingo-80B (0-shot)	2022-04-29
Learning Situation Hyper-Graphs for Video Question Answering	✓ Link	39.47	SHG-VQA (trained from scratch)	2023-04-18