OpenCodePapers

image-classification-on-inaturalist

Image Classification
Leaderboard
| Paper | Code | Top 1 Accuracy (%) | Top 5 Accuracy (%) | Top 3 Error | Overall | Model Name | Release Date |
|---|---|---|---|---|---|---|---|
| Multimodal Autoregressive Pre-training of Large Vision Encoders | ✓ Link | 85.9 | | | | AIMv2-3B (448 res) | 2024-11-21 |
| Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles | ✓ Link | 83.8 | | | | Hiera-H (448px) | 2023-06-01 |
| Masked Autoencoders Are Scalable Vision Learners | ✓ Link | 83.4 | | | | MAE (ViT-H, 448) | 2021-11-11 |
| MetaFormer: A Unified Meta Framework for Fine-Grained Recognition | ✓ Link | 83.4 | | | | MetaFormer (MetaFormer-2,384,extra_info) | 2022-03-05 |
| Multimodal Autoregressive Pre-training of Large Vision Encoders | ✓ Link | 81.5 | | | | AIMv2-3B | 2024-11-21 |
| ViT-NeT: Interpretable Vision Transformers with Neural Tree Decoder | ✓ Link | 81.2 | | | | ViT-NeT (SwinV2-B) | 2022-07-17 |
| MetaFormer: A Unified Meta Framework for Fine-Grained Recognition | ✓ Link | 80.4 | | | | MetaFormer (MetaFormer-2,384) | 2022-03-05 |
| Multimodal Autoregressive Pre-training of Large Vision Encoders | ✓ Link | 79.7 | | | | AIMv2-1B | 2024-11-21 |
| Multimodal Autoregressive Pre-training of Large Vision Encoders | ✓ Link | 77.9 | | | | AIMv2-H | 2024-11-21 |
| Multimodal Autoregressive Pre-training of Large Vision Encoders | ✓ Link | 76 | | | | AIMv2-L | 2024-11-21 |
| Fixing the train-test resolution discrepancy | ✓ Link | 75.4 | | | | FixSENet-154 | 2019-06-14 |
| On the Eigenvalues of Global Covariance Pooling for Fine-grained Visual Recognition | ✓ Link | 72.3 | | | | SEB+EfficientNet-B5 | 2022-05-26 |
| TransFG: A Transformer Architecture for Fine-grained Recognition | ✓ Link | 71.7 | | | | TransFG | 2021-03-14 |
| The iNaturalist Species Classification and Detection Dataset | ✓ Link | 67.3 | 87.5 | | | IncResNetV2 SE | 2017-07-20 |
| SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization | ✓ Link | 63.6 | 84.8 | | | SpineNet-143 | 2019-12-10 |
| MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition | ✓ Link | 63.28 | | | | MetaSAug | 2021-03-23 |
| Graph-RISE: Graph-Regularized Image Semantic Embedding | ✓ Link | 31.12 | 52.76 | | | Graph-RISE (40M) | 2019-02-14 |
| Deep CNNs Meet Global Covariance Pooling: Better Representation and Generalization | ✓ Link | | | 14.625 | | iSQRT-COV-Net | 2019-04-15 |
| DeiT-LT Distillation Strikes Back for Vision Transformer Training on Long-Tailed Datasets | ✓ Link | | | | 75.1 | b_22DeiT-LT(ours) | 2024-04-03 |