Paper | Code | Acc@GQA | Model Name | Release Date |
---|---|---|---|---|
Question-Answering Dense Video Events | ✓ Link | 28.9 | DeVi (Gemini 2.0) | 2024-09-06 |
VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning | ✓ Link | 28.2 | VideoMind (7B) | 2025-03-17 |
Question-Answering Dense Video Events | ✓ Link | 28.0 | DeVi (GPT-4) | 2024-09-06 |
A Simple LLM Framework for Long-Range Video Question-Answering | ✓ Link | 26.8 | LLoVi (GPT-4) | 2023-12-28 |
VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning | ✓ Link | 25.2 | VideoMind (2B) | 2025-03-17 |
Streaming Long Video Understanding with Large Language Models | | 17.8 | VideoStreaming | 2024-05-25 |
Language Repository for Long Video Understanding | ✓ Link | 17.1 | LangRepo (12B) | 2024-03-21 |
A Simple LLM Framework for Long-Range Video Question-Answering | ✓ Link | 11.2 | LLoVi (7B) | 2023-12-28 |
Mistral 7B | ✓ Link | 9.2 | Mistral (7B) | 2023-10-10 |