OpenCodePapers
Zero-Shot Transfer Image Classification
Leaderboard
| Paper | Code | Param | Accuracy (Private) | Accuracy (Public) | ModelName | ReleaseDate |
|---|---|---|---|---|---|---|
| M2-Encoder: Advancing Bilingual Image-Text Understanding by Large-scale Efficient Pretraining | ✓ | 10B | 88.5 | | M2-Encoder | 2024-01-29 |
| | | | 88.3 | | BASIC (Lion) | |
| CoCa: Contrastive Captioners are Image-Text Foundation Models | ✓ | | 86.3 | | CoCa | 2022-05-04 |
| Scaling Vision Transformers to 22 Billion Parameters | ✓ | | 85.9 | | LiT-22B | 2023-02-10 |
| Combined Scaling for Zero-shot Transfer Learning | | | 85.7 | | BASIC | 2021-11-19 |
| PaLI: A Jointly-Scaled Multilingual Language-Image Model | ✓ | | 85.4 | | LiT ViT-e | 2022-09-14 |
| LiT: Zero-Shot Transfer with Locked-image text Tuning | ✓ | | 84.5 | 75.7 | LiT-tuning | 2021-11-15 |
| Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception | | | 83.9 | | IMP-MoE-L | 2023-05-10 |
| EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters | ✓ | | 83.8 | | EVA-CLIP-18B | 2024-02-06 |
| InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | ✓ | | 83.2 | | InternVL-C | 2023-12-21 |
| The effectiveness of MAE pre-pretraining for billion-scale pretraining | ✓ | | 82.1 | | MAWS (ViT-2B) | 2023-03-23 |
| EVA-CLIP: Improved Training Techniques for CLIP at Scale | ✓ | | 82 | | EVA-CLIP-E/14+ | 2023-03-27 |
| | | | 81.8 | | CLIPA (ViT-H/14-336px) | |
| The effectiveness of MAE pre-pretraining for billion-scale pretraining | ✓ | | 81.1 | | MAWS (ViT-H) | 2023-03-23 |
| Learning Customized Visual Models with Retrieval-Augmented Knowledge | ✓ | | 78.5 | | REACT | 2023-01-17 |
| Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision | ✓ | | 76.4 | - | ALIGN | 2021-02-11 |
| Learning Transferable Visual Models From Natural Language Supervision | ✓ | | 76.2 | | CLIP(ViT-L/14-336px) | 2021-02-26 |
| AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities | ✓ | | 74.5 | | AltCLIP | 2022-11-12 |
| PaLI: A Jointly-Scaled Multilingual Language-Image Model | ✓ | | 72.11 | | PaLI | 2022-09-14 |
| Your Diffusion Model is Secretly a Zero-Shot Classifier | ✓ | | 61.4 | | Diffusion Classifier (zero-shot) | 2023-03-28 |
| Learning Transferable Visual Models From Natural Language Supervision | ✓ | | 59.6 | | CLIP (ResNet50) | 2021-02-26 |
| | | | | 76.5 | CWCL | |
| Learning Transferable Visual Models From Natural Language Supervision | ✓ | | | 31.3 | CLIP | 2021-02-26 |