OpenCodePapers

Image-sentence alignment on VALSE plurality

Multimodal Text and Image Classification: image-sentence alignment
Leaderboard
Paper (all entries): VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena

Code   Pairwise accuracy   Accuracy (%)   Model name        Release date
✓      72.4                62.0           ViLBERT 12-in-1   2021-12-14
✓      64.4                55.1           LXMERT            2021-12-14
✓      61.2                50.3           ViLBERT           2021-12-14
✓      56.2                -              CLIP              2021-12-14
✓      53.1                -              GPT1              2021-12-14
✓      51.9                -              GPT2              2021-12-14
✓      45.7                46.5           VisualBERT        2021-12-14