OpenCodePapers
Image Retrieval on Flickr30k-CN
Leaderboard
| Paper | Code | R@1 | R@5 | R@10 | Model name | Release date |
|---|---|---|---|---|---|---|
| InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | ✓ | 85.9 | 97.1 | 98.7 | InternVL-G-FT | 2023-12-21 |
| InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | ✓ | 85.2 | 97.0 | 98.5 | InternVL-C-FT | 2023-12-21 |
| Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese | ✓ | 84.4 | 97.1 | 98.7 | CN-CLIP (ViT-L/14@336px) | 2022-11-02 |
| CCMB: A Large-scale Chinese Cross-modal Benchmark | ✓ | 84.4 | 96.7 | 98.4 | R2D2 (ViT-L/14) | 2022-05-08 |
| Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese | ✓ | 83.8 | 96.9 | 98.6 | CN-CLIP (ViT-H/14) | 2022-11-02 |
| Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese | ✓ | 82.7 | 96.7 | 98.6 | CN-CLIP (ViT-L/14) | 2022-11-02 |
| Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese | ✓ | 79.1 | 94.8 | 97.4 | CN-CLIP (ViT-B/16) | 2022-11-02 |
| CCMB: A Large-scale Chinese Cross-modal Benchmark | ✓ | 78.3 | 94.6 | 97.0 | R2D2 (ViT-B) | 2022-05-08 |
| Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark | ✓ | 77.4 | 94.5 | 97.0 | Wukong (ViT-L/14) | 2022-02-14 |
| Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark | ✓ | 67.6 | 89.6 | 94.2 | Wukong (ViT-B/32) | 2022-02-14 |
| Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese | ✓ | 66.7 | 89.4 | 94.1 | CN-CLIP (RN50) | 2022-11-02 |
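The R@1, R@5 and R@10 columns are presumably the usual Recall@K for text-to-image retrieval: the percentage of text queries whose ground-truth image appears among the K highest-scoring images. The following is a minimal sketch of that computation, assuming exactly one ground-truth image per query and a precomputed similarity matrix. The function name `recall_at_k` and the toy scores are illustrative and are not taken from any of the listed papers' code.

```python
def recall_at_k(similarity, gt_index, ks=(1, 5, 10)):
    """Recall@K in percent for text-to-image retrieval.

    similarity[i][j] is the score of text query i against image j.
    gt_index[i] is the index of the single ground-truth image for query i
    (an assumption of this sketch).
    """
    hits = {k: 0 for k in ks}
    for scores, gt in zip(similarity, gt_index):
        # Zero-based rank of the ground truth: how many images score strictly
        # higher. Ties are therefore resolved in the query's favour.
        rank = sum(1 for s in scores if s > scores[gt])
        for k in ks:
            if rank < k:
                hits[k] += 1
    n = len(gt_index)
    return {k: 100.0 * hits[k] / n for k in ks}


# Toy example: 3 text queries scored against 4 images (made-up numbers).
similarity = [
    [0.9, 0.1, 0.2, 0.3],  # ground truth is image 0, ranked first
    [0.5, 0.4, 0.6, 0.1],  # ground truth is image 1, ranked third
    [0.2, 0.8, 0.1, 0.7],  # ground truth is image 2, ranked fourth
]
gt_index = [0, 1, 2]
result = recall_at_k(similarity, gt_index, ks=(1, 3, 4))
print(result)
```

Because the top-5 results are always a subset of the top-10 results, R@10 can never be lower than R@5 for the same model under this definition, which is a quick sanity check for any row of the leaderboard.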