OpenCodePapers

Image-sentence alignment on VALSE (FOIL it)

Multimodal Text and Image Classification · image-sentence alignment
Leaderboard
| Paper | Code | Pairwise accuracy (%) | Accuracy (%) | Model name | Release date |
|---|---|---|---|---|---|
| VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena | ✓ | 88.8 | | CLIP | 2021-12-14 |
| VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena | ✓ | 87.1 | 70.8 | LXMERT | 2021-12-14 |
| VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena | ✓ | 86.9 | 71.5 | ViLBERT 12-in-1 | 2021-12-14 |
| VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena | ✓ | 86.9 | 55.9 | ViLBERT | 2021-12-14 |
| VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena | ✓ | 80.7 | | GPT2 | 2021-12-14 |
| VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena | ✓ | 77.5 | | GPT1 | 2021-12-14 |
| VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena | ✓ | 48.5 | 46.6 | VisualBERT | 2021-12-14 |
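
The page does not define the two metric columns in the leaderboard. As a rough sketch, assuming the usual reading (pairwise accuracy counts an example as correct when the model scores the true caption above its foil, while accuracy classifies each sentence on its own against a fixed threshold), the two could be computed as follows. The function names, the 0.5 threshold, the tie handling, and the scores are all illustrative assumptions, not taken from the benchmark code.

```python
from typing import Sequence


def pairwise_accuracy(caption_scores: Sequence[float],
                      foil_scores: Sequence[float]) -> float:
    """Percentage of pairs where the true caption outscores its foil.

    Ties count as errors here; that is an assumption of this sketch.
    """
    if len(caption_scores) != len(foil_scores):
        raise ValueError("each caption score needs exactly one foil score")
    if not caption_scores:
        raise ValueError("need at least one caption/foil pair")
    wins = sum(c > f for c, f in zip(caption_scores, foil_scores))
    return 100.0 * wins / len(caption_scores)


def classification_accuracy(caption_scores: Sequence[float],
                            foil_scores: Sequence[float],
                            threshold: float = 0.5) -> float:
    """Percentage of individual sentences classified correctly.

    A caption is correct if its score reaches the threshold, a foil is
    correct if its score stays below it. The 0.5 default is an assumption.
    """
    total = len(caption_scores) + len(foil_scores)
    if total == 0:
        raise ValueError("need at least one score")
    correct = sum(c >= threshold for c in caption_scores)
    correct += sum(f < threshold for f in foil_scores)
    return 100.0 * correct / total


# Made-up alignment scores for four image/caption/foil triples.
caption_scores = [0.9, 0.8, 0.4, 0.7]
foil_scores = [0.6, 0.9, 0.3, 0.2]

print(pairwise_accuracy(caption_scores, foil_scores))        # 75.0
print(classification_accuracy(caption_scores, foil_scores))  # 62.5
```

On this toy input the pairwise figure is higher than the thresholded one, because a pair can be ranked correctly even when both scores fall on the same side of the threshold. The rows above that report both metrics show the same direction of gap.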