OpenCodePapers

image-classification-on-inaturalist-2019

Image Classification
Leaderboard
| Paper | Code | Top-1 Accuracy (%) | Number of params | Model Name | Release Date |
|---|---|---|---|---|---|
| Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles | ✓ | 88.5 | | Hiera-H (448px) | 2023-06-01 |
| Masked Autoencoders Are Scalable Vision Learners | ✓ | 88.3 | | MAE (ViT-H, 448) | 2021-11-11 |
| Grafit: Learning fine-grained image representations with coarse labels | | 84.1 | | Grafit (RegnetY 8GF) | 2020-11-25 |
| MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers | ✓ | 83.9 | | MixMIM-L | 2022-05-26 |
| DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | ✓ | 83.7 | 186M | RDNet-L (224 res, IN-1K pretrained) | 2024-03-28 |
| DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | ✓ | 83.5 | 87M | RDNet-B (224 res, IN-1K pretrained) | 2024-03-28 |
| DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | ✓ | 82.9 | 50M | RDNet-S (224 res, IN-1K pretrained) | 2024-03-28 |
| Conviformers: Convolutionally guided Vision Transformer | ✓ | 82.85 | | Conviformer-B | 2022-08-17 |
| Incorporating Convolution Designs into Visual Transformers | ✓ | 82.7 | | CeiT-S (384 finetune resolution) | 2021-03-22 |
| Going deeper with Image Transformers | ✓ | 81.8 | | CaiT-M-36 U 224 | 2021-03-31 |
| DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | ✓ | 81.2 | 24M | RDNet-T (224 res, IN-1K pretrained) | 2024-03-28 |
| Incorporating Convolution Designs into Visual Transformers | ✓ | 78.9 | | CeiT-S | 2021-03-22 |
| Incorporating Convolution Designs into Visual Transformers | ✓ | 77.9 | | CeiT-T (384 finetune resolution) | 2021-03-22 |
| ResNet strikes back: An improved training procedure in timm | ✓ | 75.0 | | ResNet50 (A2) | 2021-10-01 |
| LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference | ✓ | 74.3 | | LeViT-384 | 2021-04-02 |
| Incorporating Convolution Designs into Visual Transformers | ✓ | 72.8 | | CeiT-T | 2021-03-22 |
| ResMLP: Feedforward networks for image classification with data-efficient training | ✓ | 72.5 | | ResMLP-24 | 2021-05-07 |
| LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference | ✓ | 72.3 | | LeViT-256 | 2021-04-02 |
| ResMLP: Feedforward networks for image classification with data-efficient training | ✓ | 71.0 | | ResMLP-12 | 2021-05-07 |
| LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference | ✓ | 70.8 | | LeViT-192 | 2021-04-02 |
| LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference | ✓ | 68.4 | | LeViT-128 | 2021-04-02 |
| LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference | ✓ | 66.5 | | LeViT-128S | 2021-04-02 |