| Paper | Code | Acc@GQA | Model Name | Release Date |
|---|---|---|---|---|
| Question-Answering Dense Video Events | ✓ Link | 28.9 | DeVi (Gemini 2.0) | 2024-09-06 |
| VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning | ✓ Link | 28.2 | VideoMind (7B) | 2025-03-17 |
| Question-Answering Dense Video Events | ✓ Link | 28.0 | DeVi (GPT-4) | 2024-09-06 |
| A Simple LLM Framework for Long-Range Video Question-Answering | ✓ Link | 26.8 | LLoVi (GPT-4) | 2023-12-28 |
| VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning | ✓ Link | 25.2 | VideoMind (2B) | 2025-03-17 |
| Streaming Long Video Understanding with Large Language Models | | 17.8 | VideoStreaming | 2024-05-25 |
| Language Repository for Long Video Understanding | ✓ Link | 17.1 | LangRepo (12B) | 2024-03-21 |
| A Simple LLM Framework for Long-Range Video Question-Answering | ✓ Link | 11.2 | LLoVi (7B) | 2023-12-28 |
| Mistral 7B | ✓ Link | 9.2 | Mistral (7B) | 2023-10-10 |