OpenCodePapers

Image-sentence alignment on VALSE

Multimodal Text and Image Classification / image-sentence alignment
Leaderboard
| Paper | Code | Average pairwise accuracy (%) | Average accuracy (%) | Model name | Release date |
|---|---|---|---|---|---|
| VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena | ✓ | 75.1 | 63.2 | ViLBERT 12-in-1 | 2021-12-14 |
| VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena | ✓ | 64.0 | - | CLIP | 2021-12-14 |
| VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena | ✓ | 63.7 | 51.3 | ViLBERT | 2021-12-14 |
| VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena | ✓ | 60.7 | - | GPT1 | 2021-12-14 |
| VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena | ✓ | 60.1 | - | GPT2 | 2021-12-14 |
| VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena | ✓ | 59.6 | 53.5 | LXMERT | 2021-12-14 |
| VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena | ✓ | 46.4 | 48.8 | VisualBERT | 2021-12-14 |