OpenCodePapers

zero-shot-transfer-image-classification-on-5

Zero-Shot Transfer Image Classification
Results over time
Leaderboard
| Paper | Code | Accuracy (Private) | Accuracy (Public) | Model Name | Release Date |
|---|---|---|---|---|---|
| CoCa: Contrastive Captioners are Image-Text Foundation Models | ✓ Link | 90.2 | | CoCa | 2022-05-04 |
| Scaling Vision Transformers to 22 Billion Parameters | ✓ Link | 90.1 | | LiT-22B | 2023-02-10 |
| PaLI: A Jointly-Scaled Multilingual Language-Image Model | ✓ Link | 88.0 | | LiT ViT-e | 2022-09-14 |
| EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters | ✓ Link | 87.3 | | EVA-CLIP-18B | 2024-02-06 |
| | | 86.4 | | BASIC (Lion) | |
| Combined Scaling for Zero-shot Transfer Learning | | 85.6 | | BASIC | 2021-11-19 |
| InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | ✓ Link | 83.8 | | InternVL-C | 2023-12-21 |
| EVA-CLIP: Improved Training Techniques for CLIP at Scale | ✓ Link | 82.1 | | EVA-CLIP-E/14+ | 2023-03-27 |
| LiT: Zero-Shot Transfer with Locked-image text Tuning | ✓ Link | 79.4 | 37.8 | LiT-tuning | 2021-11-15 |
| Learning Transferable Visual Models From Natural Language Supervision | ✓ Link | 77.2 | - | CLIP | 2021-02-26 |
| Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision | ✓ Link | 75.8 | - | ALIGN | 2021-02-11 |
| AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities | ✓ Link | 69.5 | | AltCLIP | 2022-11-12 |
| PaLI: A Jointly-Scaled Multilingual Language-Image Model | ✓ Link | 44.7 | | PaLI | 2022-09-14 |
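The contrastive models on this leaderboard (CLIP, ALIGN, LiT, BASIC, EVA-CLIP, CoCa, InternVL-C, AltCLIP) classify images zero-shot by embedding the image once, embedding each candidate class via a text prompt such as "a photo of a {class}", and predicting the class whose text embedding is most similar to the image embedding, with no training on the target dataset. Below is a minimal sketch of that recipe using the Hugging Face `transformers` CLIP implementation; the checkpoint name, image path, and class list are illustrative placeholders, not the evaluation setup used for the scores above.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Placeholder inputs: any RGB image and any set of class names will do.
image = Image.open("example.jpg")
class_names = ["cat", "dog", "car"]
prompts = [f"a photo of a {name}" for name in class_names]

# Publicly available CLIP checkpoint (illustrative choice).
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Encode the image and all class prompts in one batch.
inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds image-text similarities scaled by the learned temperature;
# a softmax over them gives per-class probabilities for the zero-shot prediction.
probs = outputs.logits_per_image.softmax(dim=-1)
prediction = class_names[probs.argmax(dim=-1).item()]
print(prediction, probs.tolist())
```

Leaderboard evaluations typically extend this sketch with prompt ensembling (averaging text embeddings over many templates per class) and report top-1 accuracy over the full test split.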