OpenCodePapers

Image Classification on ImageNet ReaL

Image Classification
Leaderboard
| Paper | Code | Accuracy | Top 1 Accuracy | Params | Model | Release date |
|---|---|---|---|---|---|---|
| Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time | ✓ | 91.78% | | | Baseline (ViT-G/14) | 2022-03-10 |
| ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond | ✓ | 91.2% | | 644M | ViTAE-H (MAE, 512) | 2022-02-21 |
| Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time | ✓ | 91.20% | | 1843M | Model soups (ViT-G/14) | 2022-03-10 |
| Meta Pseudo Labels | ✓ | 91.12% | | | Meta Pseudo Labels (EfficientNet-B6-Wide) | 2020-03-23 |
| The effectiveness of MAE pre-pretraining for billion-scale pretraining | ✓ | 91.1% | | | MAWS (ViT-6.5B) | 2023-03-23 |
| TokenLearner: What Can 8 Learned Tokens Do for Images and Videos? | ✓ | 91.05% | | 460M | TokenLearner L/8 (24+11) | 2021-06-21 |
| Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time | ✓ | 91.03% | | 2440M | Model soups (BASIC-L) | 2022-03-10 |
| Meta Pseudo Labels | ✓ | 91.02% | | | Meta Pseudo Labels (EfficientNet-L2) | 2020-03-23 |
| The effectiveness of MAE pre-pretraining for billion-scale pretraining | ✓ | 90.9% | | | MAWS (ViT-2B) | 2023-03-23 |
| Fixing the train-test resolution discrepancy: FixEfficientNet | ✓ | 90.9% | | 480M | FixEfficientNet-L2 | 2020-03-18 |
| Scaling Vision Transformers | ✓ | 90.81% | | | ViT-G/14 | 2021-06-08 |
| The effectiveness of MAE pre-pretraining for billion-scale pretraining | ✓ | 90.8% | | | MAWS (ViT-H) | 2023-03-23 |
| Revisiting Weakly Supervised Pre-Training of Visual Perception Models | ✓ | 90.7% | | | SWAG (RegNetY 128GF) | 2022-01-20 |
| CvT: Introducing Convolutions to Vision Transformers | ✓ | 90.6% | 87.7% | 277M | CvT-W24 (384 res, ImageNet-22k pretrain) | 2021-03-29 |
| VOLO: Vision Outlooker for Visual Recognition | ✓ | 90.6% | | | VOLO-D5 | 2021-06-24 |
| Self-training with Noisy Student improves ImageNet classification | ✓ | 90.55% | | 480M | EfficientNet-L2 | 2019-11-11 |
| Big Transfer (BiT): General Visual Representation Learning | ✓ | 90.54% | | 928M | BiT-L | 2019-12-24 |
| VOLO: Vision Outlooker for Visual Recognition | ✓ | 90.5% | | | VOLO-D4 | 2021-06-24 |
| Going deeper with Image Transformers | ✓ | 90.2% | | | CAIT-M36-448 | 2021-03-31 |
| MLP-Mixer: An all-MLP Architecture for Vision | ✓ | 90.18% | | 409M | Mixer-H/14-448 (JFT-300M pre-train) | 2021-05-04 |
| Fixing the train-test resolution discrepancy: FixEfficientNet | ✓ | 90.0% | | 87M | FixEfficientNet-B8 | 2020-03-18 |
| Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision | ✓ | 89.8% | | 10000M | SEER (RegNet10B) | 2022-02-16 |
| Fixing the train-test resolution discrepancy | ✓ | 89.73% | | 829M | FixResNeXt-101 32x48d | 2019-06-14 |
| Training data-efficient image transformers & distillation through attention | ✓ | 89.3% | | 86M | DeiT-B-384 | 2020-12-23 |
| Big Transfer (BiT): General Visual Representation Learning | ✓ | 89.02% | | | BiT-M | 2019-12-24 |
| Training data-efficient image transformers & distillation through attention | ✓ | 88.7% | | 86M | DeiT-B | 2020-12-23 |
| Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network | ✓ | 88.65% | | | Assemble-ResNet152 | 2020-01-17 |
| Incorporating Convolution Designs into Visual Transformers | ✓ | 88.1% | | | CeiT-S (384 finetune res) | 2021-03-22 |
| Sequencer: Deep LSTM for Image Classification | ✓ | 87.9% | | | Sequencer2D-L | 2022-05-04 |
| MLP-Mixer: An all-MLP Architecture for Vision | ✓ | 87.86% | | 409M | Mixer-H/14 (JFT-300M pre-train) | 2021-05-04 |
| Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network | ✓ | 87.82% | | | Assemble ResNet-50 | 2020-01-17 |
| Learning Transferable Architectures for Scalable Image Recognition | ✓ | 87.56% | | | NASNet-A Large | 2017-07-21 |
| LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference | ✓ | 87.5% | | | LeViT-384 | 2021-04-02 |
| Incorporating Convolution Designs into Visual Transformers | ✓ | 87.3% | | | CeiT-S | 2021-03-22 |
| LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference | ✓ | 86.9% | | | LeViT-256 | 2021-04-02 |
| Training data-efficient image transformers & distillation through attention | ✓ | 86.8% | | 22M | DeiT-S | 2020-12-23 |
| When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations | ✓ | 86.4% | | | ResNet-152x2-SAM | 2021-06-03 |
| LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference | ✓ | 85.8% | | | LeViT-192 | 2021-04-02 |
| ResNet strikes back: An improved training procedure in timm | ✓ | 85.7% | | 25M | ResNet50 (A1) | 2021-10-01 |
| LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference | ✓ | 85.6% | | | LeViT-128 | 2021-04-02 |
| ResMLP: Feedforward networks for image classification with data-efficient training | ✓ | 85.6% | | 45M | ResMLP-36 | 2021-05-07 |
| ResMLP: Feedforward networks for image classification with data-efficient training | ✓ | 85.3% | | 30M | ResMLP-24 | 2021-05-07 |
| When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations | ✓ | 85.2% | | | ViT-B/16-SAM | 2021-06-03 |
| ResMLP: Feedforward networks for image classification with data-efficient training | ✓ | 84.6% | | 15M | ResMLP-12 | 2021-05-07 |
| When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations | ✓ | 84.4% | | | Mixer-B/8-SAM | 2021-06-03 |
| Revisiting a kNN-based Image Classification System with High-capacity Storage | | 84% | | | kNN-CLIP | 2022-04-03 |
| Incorporating Convolution Designs into Visual Transformers | ✓ | 83.6% | | | CeiT-T | 2021-03-22 |
| LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference | ✓ | 82.6% | | | LeViT-128S | 2021-04-02 |
| Training data-efficient image transformers & distillation through attention | ✓ | 82.1% | | 5M | DeiT-Ti | 2020-12-23 |
| Learning Transferable Architectures for Scalable Image Recognition | ✓ | 81.15% | | | NASNet-A Mobile | 2017-07-21 |
| Very Deep Convolutional Networks for Large-Scale Image Recognition | ✓ | 80.60% | | | VGG-16 BN | 2014-09-04 |
| Very Deep Convolutional Networks for Large-Scale Image Recognition | ✓ | 79.01% | | | VGG-16 | 2014-09-04 |
| ImageNet Classification with Deep Convolutional Neural Networks | ✓ | 62.88% | | | AlexNet | 2012-12-01 |
| DeiT III: Revenge of the ViT | ✓ | | 87.7% | 304M | ViT-L @384 (DeiT III, 21k) | 2022-04-14 |
| DeiT III: Revenge of the ViT | ✓ | | 87.2% | 632M | ViT-H @224 (DeiT III, 21k) | 2022-04-14 |
| DeiT III: Revenge of the ViT | ✓ | | 87.0% | | ViT-L @224 (DeiT III, 21k) | 2022-04-14 |
| ResMLP: Feedforward networks for image classification with data-efficient training | ✓ | | 84.4% | | ResMLP-B24/8 (22k) | 2021-05-07 |