OpenCodePapers
VCGBench-Diverse on VideoInstruct
Leaderboard
| Paper | Code | Mean | Correctness of Information | Detail Orientation | Contextual Understanding | Temporal Understanding | Consistency | Dense Captioning | Spatial Understanding | Reasoning | Model Name | Release Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding | ✓ | 2.47 | 2.46 | 2.73 | 2.81 | 1.78 | 2.59 | 1.38 | 2.80 | 3.63 | VideoGPT+ | 2024-06-13 |
| Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding | ✓ | 2.29 | 2.29 | 2.56 | 2.66 | 1.56 | 2.36 | 1.33 | 2.36 | 3.59 | Chat-UniVi | 2023-11-14 |
| MVBench: A Comprehensive Multi-modal Video Understanding Benchmark | ✓ | 2.20 | 2.13 | 2.42 | 2.51 | 1.66 | 2.27 | 1.26 | 2.43 | 3.13 | VideoChat2 | 2023-11-28 |
| BT-Adapter: Video Conversation is Feasible Without Video Instruction Tuning | ✓ | 2.19 | 2.20 | 2.62 | 2.59 | 1.29 | 2.27 | 1.03 | 2.35 | 3.62 | BT-Adapter | 2023-09-27 |
| VTimeLLM: Empower LLM to Grasp Video Moments | ✓ | 2.17 | 2.16 | 2.41 | 2.48 | 1.46 | 2.35 | 1.13 | 2.29 | 3.45 | VTimeLLM | 2023-11-30 |
| Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models | ✓ | 2.08 | 2.07 | 2.42 | 2.46 | 1.39 | 2.06 | 0.89 | 2.25 | 3.60 | Video-ChatGPT | 2023-06-08 |
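
The mean column in the leaderboard above appears to be the average of the first five metric columns (Correctness of Information through Consistency) rather than of all eight. This is an observation from the listed numbers, not something the page states. A minimal Python sketch that checks it, with scores copied from the rows above (the names `LEADERBOARD` and `matches_five_metric_mean` are illustrative):

```python
# Reported mean and the first five metric scores for each model, copied from
# the leaderboard: Correctness of Information, Detail Orientation,
# Contextual Understanding, Temporal Understanding, Consistency.
LEADERBOARD = {
    "VideoGPT+": (2.47, [2.46, 2.73, 2.81, 1.78, 2.59]),
    "Chat-UniVi": (2.29, [2.29, 2.56, 2.66, 1.56, 2.36]),
    "VideoChat2": (2.20, [2.13, 2.42, 2.51, 1.66, 2.27]),
    "BT-Adapter": (2.19, [2.20, 2.62, 2.59, 1.29, 2.27]),
    "VTimeLLM": (2.17, [2.16, 2.41, 2.48, 1.46, 2.35]),
    "Video-ChatGPT": (2.08, [2.07, 2.42, 2.46, 1.39, 2.06]),
}


def matches_five_metric_mean(reported, scores, tol=0.005):
    """Return True if `reported` equals the average of `scores` up to rounding."""
    average = sum(scores) / len(scores)
    # Scores are listed to two decimals, so allow half a unit in the last place.
    return abs(reported - average) <= tol + 1e-9


for model, (reported, scores) in LEADERBOARD.items():
    average = sum(scores) / len(scores)
    status = "ok" if matches_five_metric_mean(reported, scores) else "mismatch"
    print(f"{model}: reported {reported:.2f}, computed {average:.3f} ({status})")
```

Averaging all eight metric columns instead would give about 2.52 for VideoGPT+, which does not match the listed 2.47.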