OpenCodePapers

Visual Instruction Following on LLaVA-Bench

Instruction Following · Visual Instruction Following
Dataset Link
Results over time
Leaderboard
| Paper | Code | Avg score | Model name | Release date |
|---|---|---|---|---|
| CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts | ✓ Link | 85.7 | CuMo-7B | 2024-05-09 |
| ShareGPT4V: Improving Large Multi-Modal Models with Better Captions | ✓ Link | 79.9 | ShareGPT4V-13B | 2023-11-21 |
| ShareGPT4V: Improving Large Multi-Modal Models with Better Captions | ✓ Link | 72.6 | ShareGPT4V-7B | 2023-11-21 |
| Improved Baselines with Visual Instruction Tuning | ✓ Link | 70.7 | LLaVA-v1.5-13B | 2023-10-05 |
| Improved Baselines with Visual Instruction Tuning | ✓ Link | 63.4 | LLaVA-v1.5-7B | 2023-10-05 |
| InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning | ✓ Link | 60.9 | InstructBLIP-7B | 2023-05-11 |
| InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning | ✓ Link | 58.2 | InstructBLIP-13B | 2023-05-11 |
| BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | ✓ Link | 38.1 | BLIP-2 | 2023-01-30 |
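
For working with these results programmatically, here is a minimal sketch that ranks the entries by average score. The `Entry` dataclass and the hard-coded list are illustrative constructs (not part of the OpenCodePapers site); the scores and dates are copied from the leaderboard above, where higher average score means a better ranking.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    model: str        # model name as listed on the leaderboard
    avg_score: float  # average score on LLaVA-Bench (higher is better)
    release: str      # release date of the paper (YYYY-MM-DD)

# Scores copied from the leaderboard above.
LEADERBOARD = [
    Entry("CuMo-7B", 85.7, "2024-05-09"),
    Entry("ShareGPT4V-13B", 79.9, "2023-11-21"),
    Entry("ShareGPT4V-7B", 72.6, "2023-11-21"),
    Entry("LLaVA-v1.5-13B", 70.7, "2023-10-05"),
    Entry("LLaVA-v1.5-7B", 63.4, "2023-10-05"),
    Entry("InstructBLIP-7B", 60.9, "2023-05-11"),
    Entry("InstructBLIP-13B", 58.2, "2023-05-11"),
    Entry("BLIP-2", 38.1, "2023-01-30"),
]

# Reproduce the table ordering: rank by average score, descending.
ranked = sorted(LEADERBOARD, key=lambda e: e.avg_score, reverse=True)
for rank, e in enumerate(ranked, start=1):
    print(f"{rank}. {e.model}: {e.avg_score} (released {e.release})")
```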