OpenCodePapers
Image-Sentence Alignment on VALSE
Multimodal Text and Image Classification
image-sentence alignment
Leaderboard
Paper (all entries): VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena
Code: available for all entries

| Model | Average pairwise accuracy (%) | Average accuracy (%) | Release date |
|---|---|---|---|
| ViLBERT 12-in-1 | 75.1 | 63.2 | 2021-12-14 |
| CLIP | 64.0 | - | 2021-12-14 |
| ViLBERT | 63.7 | 51.3 | 2021-12-14 |
| GPT1 | 60.7 | - | 2021-12-14 |
| GPT2 | 60.1 | - | 2021-12-14 |
| LXMERT | 59.6 | 53.5 | 2021-12-14 |
| VisualBERT | 46.4 | 48.8 | 2021-12-14 |

Rows are sorted by average pairwise accuracy, highest first. A dash marks a value the leaderboard does not report.