OpenCodePapers

MMR total

Leaderboard

Paper	Code	Total Column Score	Model Name	Release Date
Claude 3.5 Sonnet Model Card Addendum		463	Claude 3.5 Sonnet	2024-06-24
GPT-4o: Visual perception performance of multimodal large language models in piglet activity understanding		457	GPT-4o	2024-06-14
The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)	✓ Link	415	GPT-4V	2023-09-29
Visual Instruction Tuning	✓ Link	412	LLaVA-NEXT-34B	2023-04-17
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone		397	Phi-3-Vision	2024-04-22
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks	✓ Link	368	InternVL2-8B	2023-12-21
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond	✓ Link	366	Qwen-vl-max	2023-08-24
Visual Instruction Tuning	✓ Link	335	LLaVA-NEXT-13B	2023-04-17
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond	✓ Link	310	Qwen-vl-plus	2023-08-24
What matters when building vision-language models?		256	Idefics-2-8B	2024-05-03
Visual Instruction Tuning	✓ Link	243	LLaVA-1.5-13B	2023-04-17
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks	✓ Link	237	InternVL2-1B	2023-12-21
Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models	✓ Link	214	Monkey-Chat-7B	2023-11-11
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents	✓ Link	139	Idefics-80B	2023-06-21