OpenCodePapers

Visual Instruction Following on LLaVA-Bench



Leaderboard

Paper	Code	Avg. score	Model	Release date
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts	✓ Link	85.7	CuMo-7B	2024-05-09
ShareGPT4V: Improving Large Multi-Modal Models with Better Captions	✓ Link	79.9	ShareGPT4V-13B	2023-11-21
ShareGPT4V: Improving Large Multi-Modal Models with Better Captions	✓ Link	72.6	ShareGPT4V-7B	2023-11-21
Improved Baselines with Visual Instruction Tuning	✓ Link	70.7	LLaVA-v1.5-13B	2023-10-05
Improved Baselines with Visual Instruction Tuning	✓ Link	63.4	LLaVA-v1.5-7B	2023-10-05
InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning	✓ Link	60.9	InstructBLIP-7B	2023-05-11
InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning	✓ Link	58.2	InstructBLIP-13B	2023-05-11
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models	✓ Link	38.1	BLIP-2	2023-01-30