Paper | Code | Top-1 | Top-5 | Params | Model | Date
Vision Transformers Need Registers | ✓ Link | 87.1% | | 1100M | DINOv2+reg (ViT-g/14) | 2023-09-28
DINOv2: Learning Robust Visual Features without Supervision | ✓ Link | 86.7% | | 1100M | DINOv2 (ViT-g/14 @448) | 2023-04-14 |
DINOv2: Learning Robust Visual Features without Supervision | ✓ Link | 86.5% | | 1100M | DINOv2 (ViT-g/14) | 2023-04-14 |
DINOv2: Learning Robust Visual Features without Supervision | ✓ Link | 86.3% | | 307M | DINOv2 distilled (ViT-L/14) | 2023-04-14 |
MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations | ✓ Link | 84.7% | | 632M | MIM-Refiner (D2V2-ViT-H/14) | 2024-02-15 |
MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations | ✓ Link | 84.5% | | 1890M | MIM-Refiner (MAE-ViT-2B/14) | 2024-02-15 |
DINOv2: Learning Robust Visual Features without Supervision | ✓ Link | 84.5% | | 85M | DINOv2 distilled (ViT-B/14) | 2023-04-14 |
MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations | ✓ Link | 83.7% | | 632M | MIM-Refiner (MAE-ViT-H/14) | 2024-02-15
MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations | ✓ Link | 83.5% | | 307M | MIM-Refiner (D2V2-ViT-L/16) | 2024-02-15 |
MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations | ✓ Link | 82.8% | | 307M | MIM-Refiner (MAE-ViT-L/16) | 2024-02-15 |
iBOT: Image BERT Pre-Training with Online Tokenizer | ✓ Link | 82.3% | | 307M | iBOT (ViT-L/16) (IN22k) | 2021-11-15 |
Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget | ✓ Link | 82.2% | | 632M | MAE-CT (ViT-H/16) | 2023-04-20 |
Mugs: A Multi-Granular Self-Supervised Learning Framework | ✓ Link | 82.1% | | 307M | Mugs (ViT-L/16) | 2022-03-27
Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget | ✓ Link | 81.5% | | 307M | MAE-CT (ViT-L/16) | 2023-04-20
Efficient Self-supervised Vision Transformers for Representation Learning | ✓ Link | 81.3% | 95.5% | 87M | EsViT (Swin-B) | 2021-06-17
iBOT: Image BERT Pre-Training with Online Tokenizer | ✓ Link | 81.3% | | 307M | iBOT (ViT-L/16) | 2021-11-15 |
DINOv2: Learning Robust Visual Features without Supervision | ✓ Link | 81.1% | | 21M | DINOv2 distilled (ViT-S/14) | 2023-04-14 |
An Empirical Study of Training Self-Supervised Vision Transformers | ✓ Link | 81.0% | | 304M | MoCo v3 (ViT-BN-L/7) | 2021-04-05 |
Efficient Self-supervised Vision Transformers for Representation Learning | ✓ Link | 80.8% | | 49M | EsViT (Swin-S) | 2021-06-17
Masked Siamese Networks for Label-Efficient Learning | ✓ Link | 80.7% | | 306M | MSN (ViT-L/7) | 2022-04-14 |
Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet? | ✓ Link | 80.6% | | 250M | ReLICv2 (ResNet-200 x2) | 2022-01-13 |
Masked Reconstruction Contrastive Learning with Information Bottleneck Principle | | 80.4% | | | MR BarTwins | 2022-11-15
Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective | ✓ Link | 80.3% | | 732M | DiGIT | 2024-10-16 |
DINO as a von Mises-Fisher mixture model | | 80.3% | | 85M | iBOT-vMF (ViT-B/16) | 2024-05-17 |
Emerging Properties in Self-Supervised Vision Transformers | ✓ Link | 80.3% | | 84M | DINO (xcit_medium_24_p8) | 2021-04-29 |
Perceptual Group Tokenizer: Building Perception with Iterative Grouping | | 80.3% | | 70M | PGT (PGT-B w/ Flow) | 2023-11-30 |
Emerging Properties in Self-Supervised Vision Transformers | ✓ Link | 80.1% | | 80M | DINO (ViT-B/8) | 2021-04-29 |
Big Self-Supervised Models are Strong Semi-Supervised Learners | ✓ Link | 79.8% | 94.9% | 795M | SimCLRv2 (ResNet-152 x3, SK) | 2020-06-17 |
Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision | ✓ Link | 79.8% | | 10000M | SEERv2 | 2022-02-16 |
Improving Visual Representation Learning through Perceptual Understanding | ✓ Link | 79.8% | | 80M | PercMAE (ViT-B, dVAE) | 2022-12-30 |
Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet? | ✓ Link | 79.8% | | 63M | ReLICv2 (ResNet200) | 2022-01-13 |
Emerging Properties in Self-Supervised Vision Transformers | ✓ Link | 79.7% | | 21M | DINO (ViT-S/8) | 2021-04-29 |
Bootstrap your own latent: A new approach to self-supervised Learning | ✓ Link | 79.6% | 94.8% | 250M | BYOL (ResNet-200 x2) | 2020-06-13 |
Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet? | ✓ Link | 79.4% | | 375M | ReLICv2 (ResNet-50 4x) | 2022-01-13 |
Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet? | ✓ Link | 79.3% | | 58M | ReLICv2 (ResNet152) | 2022-01-13 |
Unsupervised Representation Learning by Balanced Self Attention Matching | ✓ Link | 79.3% | | | BAM (CAFormer-M36) | 2024-08-04 |
An Empirical Study of Training Self-Supervised Vision Transformers | ✓ Link | 79.1% | | 700M | MoCo v3 (ViT-BN-H) | 2021-04-05 |
Unicom: Universal and Compact Representation Learning for Image Retrieval | ✓ Link | 79.1% | | 80M | Unicom (ViT-B/16) | 2023-04-12 |
Unsupervised Visual Representation Learning by Synchronous Momentum Grouping | ✓ Link | 79.0% | 94.4% | 375M | SMoG (ResNet-50 x4) | 2022-07-13
Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet? | ✓ Link | 79.0% | | 94M | ReLICv2 (ResNet-50 x2) | 2022-01-13
Compressive Visual Representations | ✓ Link | 78.8% | 94.5% | 94M | C-BYOL (ResNet-50 2x, 1000 epochs) | 2021-09-27 |
DINO as a von Mises-Fisher mixture model | | 78.8% | | 85M | DINO-vMF (ViT-B/16) | 2024-05-17 |
Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet? | ✓ Link | 78.7% | | 44M | ReLICv2 (ResNet101) | 2022-01-13 |
Bootstrap your own latent: A new approach to self-supervised Learning | ✓ Link | 78.6% | 94.2% | 375M | BYOL (ResNet-50 x4) | 2020-06-13 |
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments | ✓ Link | 78.5% | | 586M | SwAV (ResNet-50 x5) | 2020-06-17 |
Emerging Properties in Self-Supervised Vision Transformers | ✓ Link | 78.2% | | 85M | DINO (ViT-B/16) | 2021-04-29 |
An Empirical Study of Training Self-Supervised Vision Transformers | ✓ Link | 78.1% | | 632M | MoCo v3 (ViT-H) | 2021-04-05 |
Improving Visual Representation Learning through Perceptual Understanding | ✓ Link | 78.1% | | 80M | PercMAE (ViT-B) | 2022-12-30 |
Unsupervised Representation Learning by Balanced Self Attention Matching | ✓ Link | 78.1% | | 80M | BAM (ViT-B/16) | 2024-08-04 |
Unsupervised Visual Representation Learning by Synchronous Momentum Grouping | ✓ Link | 78.0% | 93.9% | 94M | SMoG (ResNet-50 x2) | 2022-07-13
An Empirical Study of Training Self-Supervised Vision Transformers | ✓ Link | 77.6% | | 307M | MoCo v3 (ViT-L) | 2021-04-05 |
Self-supervised Pretraining of Visual Features in the Wild | ✓ Link | 77.5% | | 1300M | SEER | 2021-03-02 |
Bootstrap your own latent: A new approach to self-supervised Learning | ✓ Link | 77.4% | 93.6% | 94M | BYOL (ResNet-50 x2) | 2020-06-13 |
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments | ✓ Link | 77.3% | | 94M | SwAV (ResNet-50 x2) | 2020-06-17 |
Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet? | ✓ Link | 77.1% | | 25M | ReLICv2 (ResNet-50) | 2022-01-13 |
Emerging Properties in Self-Supervised Vision Transformers | ✓ Link | 77.0% | | 21M | DINO (ViT-S/16) | 2021-04-29 |
DINO as a von Mises-Fisher mixture model | | 77.0% | | 21M | DINO-vMF (ViT-S/16) | 2024-05-17 |
An Empirical Study of Training Self-Supervised Vision Transformers | ✓ Link | 76.7% | | 86M | MoCo v3 (ViT-B/16) | 2021-04-05 |
Masked Autoencoders Are Scalable Vision Learners | ✓ Link | 76.6% | | 700M | MAE (ViT-H) | 2021-11-11 |
A Simple Framework for Contrastive Learning of Visual Representations | ✓ Link | 76.5% | 93.2% | 375M | SimCLR (ResNet-50 4x) | 2020-02-13 |
Unsupervised Visual Representation Learning by Online Constrained K-Means | ✓ Link | 76.4% | | 25M | CoKe (ResNet-50) | 2021-05-24 |
Unsupervised Visual Representation Learning by Synchronous Momentum Grouping | ✓ Link | 76.4% | | 25M | SMoG (ResNet-50) | 2022-07-13 |
Weak Augmentation Guided Relational Self-Supervised Learning | ✓ Link | 76.3% | | 24M | ReSSL (ResNet-50 w/ Predictor and Stronger Aug) | 2022-03-16 |
Weak Augmentation Guided Relational Self-Supervised Learning | ✓ Link | 76.0% | | 24M | ReSSL (ResNet-50 w/ Predictor) | 2022-03-16 |
Solving Inefficiency of Self-supervised Representation Learning | ✓ Link | 75.9% | | 23.56M | Triplet (ResNet-50) | 2021-04-18 |
Masked Autoencoders Are Scalable Vision Learners | ✓ Link | 75.8% | | 306M | MAE (ViT-L) | 2021-11-11 |
Divide and Contrast: Self-supervised Learning from Uncurated Data | | 75.8% | | 24M | DnC (ResNet-50) | 2021-05-17 |
CaCo: Both Positive and Negative Samples are Directly Learnable via Cooperative-adversarial Contrastive Learning | ✓ Link | 75.7% | | 24M | CaCo (ResNet-50) | 2022-03-27 |
Big Self-Supervised Models are Strong Semi-Supervised Learners | ✓ Link | 75.6% | 92.7% | 94M | SimCLRv2 (ResNet-50 x2) | 2020-06-17 |
Compressive Visual Representations | ✓ Link | 75.6% | 92.7% | 25M | C-BYOL (ResNet-50, 1000 epochs) | 2021-09-27 |
With a Little Help from My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations | ✓ Link | 75.6% | 92.4% | 25M | NNCLR (ResNet-50, multi-crop) | 2021-04-29
Self-supervised Pre-training with Hard Examples Improves Visual Representations | | 75.5% | | 24M | HEXA | 2020-12-25 |
Similarity Contrastive Estimation for Self-Supervised Soft Contrastive Learning | ✓ Link | 75.4% | | 24M | SCE (ResNet-50, multi-crop) | 2021-11-29 |
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments | ✓ Link | 75.3% | | 24M | SwAV (ResNet-50) | 2020-06-17 |
Emerging Properties in Self-Supervised Vision Transformers | ✓ Link | 75.3% | | 24M | DINO (ResNet-50) | 2021-04-29 |
What Makes for Good Views for Contrastive Learning? | ✓ Link | 75.2% | | 120M | InfoMin (ResNeXt-152) | 2020-05-20 |
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments | ✓ Link | 75.2% | | 24M | DeepCluster-v2 (ResNet-50) | 2020-06-17 |
Unicom: Universal and Compact Representation Learning for Image Retrieval | ✓ Link | 75.0% | | 80M | Unicom (ViT-B/32) | 2023-04-12 |
Self-Supervised Learning with Swin Transformers | ✓ Link | 75.0% | | 29M | MoBY (Swin-T) | 2021-05-10
Representation Learning via Invariant Causal Mechanisms | ✓ Link | 74.8% | | 24M | ReLIC (ResNet-50) | 2020-10-15 |
ReSSL: Relational Self-Supervised Learning with Weak Augmentation | ✓ Link | 74.7% | 92.3% | 24M | ReSSL (ResNet-50) 200ep | 2021-07-20
Weakly Supervised Contrastive Learning | ✓ Link | 74.7% | | 24M | WCL (ResNet-50) | 2021-10-10 |
MV-MR: multi-views and multi-representations for self-supervised learning and knowledge distillation | ✓ Link | 74.5% | 92.1% | | MV-MR | 2023-03-21
Boosting Contrastive Self-Supervised Learning with False Negative Cancellation | ✓ Link | 74.4% | 91.8% | 24M | FNC (ResNet-50) | 2020-11-23 |
Bootstrap your own latent: A new approach to self-supervised Learning | ✓ Link | 74.3% | 91.6% | 24M | BYOL (ResNet-50) | 2020-06-13 |
A Simple Framework for Contrastive Learning of Visual Representations | ✓ Link | 74.2% | 92.0% | 94M | SimCLR (ResNet-50 2x) | 2020-02-13 |
Self-Supervised Classification Network | ✓ Link | 74.2% | | 24M | Self-Classifier (ResNet-50) | 2021-03-19 |
Learning by Sorting: Self-supervised Learning with Group Ordering Constraints | ✓ Link | 73.9% | 91.6% | 25M | GroCo (ResNet-50) | 2023-01-05
OBoW: Online Bag-of-Visual-Words Generation for Self-Supervised Learning | ✓ Link | 73.8% | 92.2% | 24M | OBoW (ResNet-50) | 2020-12-21 |
VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning | ✓ Link | 73.2% | 91.1% | 24M | VICReg (ResNet50) | 2021-05-11
Barlow Twins: Self-Supervised Learning via Redundancy Reduction | ✓ Link | 73.2% | 91.0% | 24M | Barlow Twins (ResNet-50) | 2021-03-04
What Makes for Good Views for Contrastive Learning? | ✓ Link | 73.0% | 91.1% | 24M | InfoMin (ResNet-50) | 2020-05-20 |
ResMLP: Feedforward networks for image classification with data-efficient training | ✓ Link | 72.8% | | 30M | DINO (ResMLP-24) | 2021-05-07 |
Self-Supervised Learning with Swin Transformers | ✓ Link | 72.8% | | 22M | MoBY (DeiT-S) | 2021-05-10 |
VNE: An Effective Method for Improving Deep Representation by Manipulating Eigenvalue Distribution | ✓ Link | 72.1% | 91.0% | 25M | I-VNE+ (ResNet-50) | 2023-04-04
Generative Pretraining from Pixels | ✓ Link | 72.0% | | 6801M | iGPT-XL (64x64, 15360 features) | 2020-07-17 |
Big Self-Supervised Models are Strong Semi-Supervised Learners | ✓ Link | 71.7% | 90.4% | 24M | SimCLRv2 (ResNet-50) | 2020-06-17 |
Data-Efficient Image Recognition with Contrastive Predictive Coding | ✓ Link | 71.5% | 90.1% | 305M | CPC v2 (ResNet-161) (arxiv v2) | 2019-05-22 |
Exploring Simple Siamese Representation Learning | ✓ Link | 71.3% | | 24M | SimSiam (ResNet-50) | 2020-11-20 |
Improved Baselines with Momentum Contrastive Learning | ✓ Link | 71.1% | 90.1% | 24M | MoCo v2 (ResNet-50) | 2020-03-09 |
SynCo: Synthetic Hard Negatives in Contrastive Learning for Better Unsupervised Visual Representations | ✓ Link | 70.6% | 89.8% | 24M | SynCo (ResNet-50) 800ep | 2024-10-03 |
Contrastive Multiview Coding | ✓ Link | 70.6% | 89.7% | 188M | CMC (ResNet-50 x2) (arxiv v5) | 2019-06-13 |
A Simple Framework for Contrastive Learning of Visual Representations | ✓ Link | 69.3% | 89.0% | 24M | SimCLR (ResNet-50) | 2020-02-13 |
Generative Pretraining from Pixels | ✓ Link | 68.7% | | 6800M | iGPT-XL (64x64, 3072 features) | 2020-07-17 |
Momentum Contrast for Unsupervised Visual Representation Learning | ✓ Link | 68.6% | | 375M | MoCo (ResNet-50 4x) | 2019-11-13 |
Learning Representations by Maximizing Mutual Information Across Views | ✓ Link | 68.1% | | 626M | AMDIM (large) (arxiv v2) | 2019-06-03 |
Masked Autoencoders Are Scalable Vision Learners | ✓ Link | 68.0% | | 80M | MAE (ViT-B) | 2021-11-11 |
SynCo: Synthetic Hard Negatives in Contrastive Learning for Better Unsupervised Visual Representations | ✓ Link | 67.9% | 88.0% | 24M | SynCo (ResNet-50) 200ep | 2024-10-03
ResMLP: Feedforward networks for image classification with data-efficient training | ✓ Link | 67.5% | | 15M | DINO (ResMLP-12) | 2021-05-07 |
Contrastive Multiview Coding | ✓ Link | 66.2% | 87.0% | 47M | CMC (ResNet-50) (arxiv v5) | 2019-06-13 |
Prototypical Contrastive Learning of Unsupervised Representations | ✓ Link | 65.9% | | 25M | PCL (ResNet-50) | 2020-05-11 |
Momentum Contrast for Unsupervised Visual Representation Learning | ✓ Link | 65.4% | | 94M | MoCo (ResNet-50 2x) | 2019-11-13 |
Generative Pretraining from Pixels | ✓ Link | 65.2% | | 1400M | iGPT-L (48x48) | 2020-07-17 |
Contrastive Multiview Coding | ✓ Link | 65.0% | 86.0% | | CMC (ResNet-101) (arxiv v3) | 2019-06-13 |
Data-Efficient Image Recognition with Contrastive Predictive Coding | ✓ Link | 63.8% | 85.3% | 24M | CPC v2 (ResNet-50) (arxiv v2) | 2019-05-22 |
Max-Margin Contrastive Learning | ✓ Link | 63.8% | | | MMCL (100 epoch, 256 batch size) | 2021-12-21 |
Self-Supervised Learning of Pretext-Invariant Representations | ✓ Link | 63.6% | | 24M | PIRL | 2019-12-04 |
Learning Representations by Maximizing Mutual Information Across Views | ✓ Link | 63.5% | | 194M | AMDIM (small) (arxiv v2) | 2019-06-03 |
Self-labelling via simultaneous clustering and representation learning | ✓ Link | 61.5% | 84.0% | 24M | SeLa (ResNet50) (arxiv v3) | 2019-11-13
Large Scale Adversarial Representation Learning | ✓ Link | 61.3% | 81.9% | 86M | BigBiGAN (RevNet-50 ×4, BN+CReLU) | 2019-07-04 |
Data-Efficient Image Recognition with Contrastive Predictive Coding | ✓ Link | 61.0% | 83.0% | 305M | CPC v2 (ResNet-161) (arxiv v1) | 2019-05-22 |
Large Scale Adversarial Representation Learning | ✓ Link | 60.8% | 81.4% | 86M | BigBiGAN (RevNet-50 ×4) | 2019-07-04 |
Momentum Contrast for Unsupervised Visual Representation Learning | ✓ Link | 60.6% | | 24M | MoCo (ResNet-50) | 2019-11-13 |
Generative Pretraining from Pixels | ✓ Link | 60.3% | | 1400M | iGPT-L (32x32) | 2020-07-17 |
Learning Representations by Maximizing Mutual Information Across Views | ✓ Link | 60.2% | | 337M | AMDIM (arxiv v1) | 2019-06-03 |
Local Aggregation for Unsupervised Learning of Visual Embeddings | ✓ Link | 60.2% | | 24M | LocalAgg (ResNet-50) | 2019-03-29 |
Contrastive Multiview Coding | ✓ Link | 60.1% | 82.8% | 44M | CMC (ResNet-101) | 2019-06-13 |
Large Scale Adversarial Representation Learning | ✓ Link | 56.6% | 78.6% | 24M | BigBiGAN (ResNet-50, BN+CReLU) | 2019-07-04 |
Self-labelling via simultaneous clustering and representation learning | ✓ Link | 55.7% | 79.5% | 24M | SeLa (ResNet50) | 2019-11-13 |
Revisiting Self-Supervised Visual Representation Learning | ✓ Link | 55.4% | 77.9% | 86M | Revisited Rotation (RevNet-50 ×4) | 2019-01-25 |
Large Scale Adversarial Representation Learning | ✓ Link | 55.4% | 77.4% | 25M | BigBiGAN (ResNet-50) | 2019-07-04 |
Revisiting Self-Supervised Visual Representation Learning | ✓ Link | 51.4% | 74.0% | 94M | Revisited Rel.Patch.Loc (ResNet50 ×2) | 2019-01-25 |
Self-labelling via simultaneous clustering and representation learning | ✓ Link | 50.0% | | 61M | SeLa (AlexNet) (arxiv v3) | 2019-11-13 |
Representation Learning with Contrastive Predictive Coding | ✓ Link | 48.7% | 73.6% | 44M | CPC (ResNet-101 V2) | 2018-07-10 |
Revisiting Self-Supervised Visual Representation Learning | ✓ Link | 46.0% | 68.8% | 211M | Revisited Exemplar (ResNet-50 ×3) | 2019-01-25 |
Revisiting Self-Supervised Visual Representation Learning | ✓ Link | 44.6% | 68.0% | 94M | Revisited Jigsaw (ResNet50 ×2) | 2019-01-25 |
Contrastive Multiview Coding | ✓ Link | 42.6% | | 30M | CMC (Alexnet/2) | 2019-06-13 |
Deep Clustering for Unsupervised Learning of Visual Features | ✓ Link | 41.0% | | 61M | DeepCluster (AlexNet) | 2018-07-15
Multi-task Self-Supervised Visual Learning | | 39.6% | 62.5% | 44M | Colorisation (improved) (ResNet-101) | 2017-08-25
Unsupervised Representation Learning by Predicting Image Rotations | ✓ Link | 38.7% | | 86M | Rotation (AlexNet) | 2018-03-21
Split-Brain Autoencoders: Unsupervised Learning by Cross-Channel Prediction | ✓ Link | 35.4% | | 61M | Split-Brain (AlexNet) | 2016-11-29 |
Representation Learning by Learning to Count | ✓ Link | 34.3% | | 61M | Counting (AlexNet) | 2017-08-22
Colorful Image Colorization | ✓ Link | 32.6% | | 61M | Colorization (AlexNet) | 2016-03-28 |
Multi-task Self-Supervised Visual Learning | | | 70.2% | 44M | Multi-task SSL (ResNet-101) | 2017-08-25
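The rows above are plain pipe-delimited text, so they are easy to load programmatically. As a minimal sketch (the field names in the record dict are my own labels, not part of the source), two sample rows can be parsed and ranked by top-1 accuracy like this:

```python
# Sketch: parse pipe-delimited leaderboard rows into records and
# sort them by top-1 accuracy. Field names here are illustrative.

def parse_row(line: str) -> dict:
    """Split one 'Paper | Link | Top-1 | Top-5 | Params | Model | Date' row."""
    paper, link, top1, top5, params, model, date = [c.strip() for c in line.split("|")]
    return {
        "paper": paper,
        "code": link,
        # Empty accuracy cells become None; otherwise strip '%' and parse.
        "top1": float(top1.rstrip("%")) if top1 else None,
        "top5": float(top5.rstrip("%")) if top5 else None,
        "params": params,
        "model": model,
        "date": date,
    }

rows = [
    "DINOv2: Learning Robust Visual Features without Supervision | ✓ Link | 86.5% | | 1100M | DINOv2 (ViT-g/14) | 2023-04-14",
    "Vision Transformers Need Registers | ✓ Link | 87.1% | | 1100M | DINOv2+reg (ViT-g/14) | 2023-09-28",
]
records = sorted((parse_row(r) for r in rows), key=lambda r: r["top1"], reverse=True)
print(records[0]["model"])  # best-performing entry first
```

This assumes no cell ever contains a literal `|`, which holds for every row in the table.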