OpenCodePapers

image-text-matching-on-commercialadsdataset

Cross-Modal Retrieval, Image-text matching
Leaderboard
| Paper | Code | ADD(S) AUC | Model Name | Release Date |
|---|---|---|---|---|
| Align before Search: Aligning Ads Image to Text for Accurate Cross-Modal Sponsored Search | ✓ | 91.73 | AlignCMSS | 2023-09-28 |
| VinVL: Revisiting Visual Representations in Vision-Language Models | ✓ | 88.56 | VinVL | 2021-01-02 |
| AdsCVLR: Commercial Visual-Linguistic Representation Modeling in Sponsored Search | | 87.90 | AdsCVLR | 2022-10-10 |
| Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks | ✓ | 87.45 | OSCAR | 2020-04-13 |
| VL-BERT: Pre-training of Generic Visual-Linguistic Representations | ✓ | 86.27 | VL-BERT | 2019-08-22 |
| BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | ✓ | 83.51 | BLIP | 2022-01-28 |
| Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training | | 83.16 | Unicoder-VL | 2019-08-16 |
| Align before Fuse: Vision and Language Representation Learning with Momentum Distillation | ✓ | 82.74 | ALBEF | 2021-07-16 |