OpenCodePapers

Knowledge Distillation on ImageNet

Knowledge Distillation
Leaderboard
| Paper | Code | Top-1 accuracy % | Model size | CRD training setting | Model name | Release date |
|---|---|---|---|---|---|---|
| ScaleKD: Strong Vision Transformers Could Be Excellent Teachers | ✓ Link | 86.43 | 87M | | ScaleKD (T: BEiT-L, S: ViT-B/14) | 2024-11-11 |
| ScaleKD: Strong Vision Transformers Could Be Excellent Teachers | ✓ Link | 85.53 | 87M | | ScaleKD (T: Swin-L, S: ViT-B/16) | 2024-11-11 |
| ScaleKD: Strong Vision Transformers Could Be Excellent Teachers | ✓ Link | 83.93 | 22M | | ScaleKD (T: Swin-L, S: ViT-S/16) | 2024-11-11 |
| ScaleKD: Strong Vision Transformers Could Be Excellent Teachers | ✓ Link | 83.8 | 27M | | ScaleKD (T: Swin-L, S: Swin-T) | 2024-11-11 |
| Improving Knowledge Distillation via Regularizing Feature Norm and Direction | ✓ Link | 83.60 | 87M | | KD++ (T: RegNetY-16GF, S: ViT-B) | 2023-05-26 |
| $V_kD$: Improving Knowledge Distillation using Orthogonal Projections | ✓ Link | 82.9 | 22M | | VkD (T: RegNetY-160, S: DeiT-S) | 2024-03-10 |
| SpectralKD: A Unified Framework for Interpreting and Distilling Vision Transformers via Spectral Analysis | ✓ Link | 82.7 | 22M | | SpectralKD (T: Swin-S, S: Swin-T) | 2024-12-26 |
| ScaleKD: Strong Vision Transformers Could Be Excellent Teachers | ✓ Link | 82.55 | 22M | | ScaleKD (T: Swin-L, S: ResNet-50) | 2024-11-11 |
| Knowledge Diffusion for Distillation | ✓ Link | 82.5 | | | DiffKD (T: Swin-L, S: Swin-T) | 2023-05-25 |
| Knowledge Distillation from A Stronger Teacher | ✓ Link | 82.3 | 29M | | DIST (T: Swin-L, S: Swin-T) | 2022-05-21 |
| SpectralKD: A Unified Framework for Interpreting and Distilling Vision Transformers via Spectral Analysis | ✓ Link | 82.2 | 22M | | SpectralKD (T: CaiT-S24, S: DeiT-S) | 2024-12-26 |
| Understanding the Role of the Projector in Knowledge Distillation | ✓ Link | 82.1 | 22M | | SRD (T: RegNetY-160, S: DeiT-S) | 2023-03-20 |
| One-for-All: Bridge the Gap Between Heterogeneous Architectures in Knowledge Distillation | ✓ Link | 81.33 | | | OFA (T: ViT-B, S: ResNet-50) | 2023-10-30 |
| Knowledge Diffusion for Distillation | ✓ Link | 80.5 | | | DiffKD (T: Swin-L, S: ResNet-50) | 2023-05-25 |
| $V_kD$: Improving Knowledge Distillation using Orthogonal Projections | ✓ Link | 79.2 | 6M | | VkD (T: RegNetY-160, S: DeiT-Ti) | 2024-03-10 |
| Improving Knowledge Distillation via Regularizing Feature Norm and Direction | ✓ Link | 79.15 | 44.5M | | KD++ (T: ResNet-152, S: ResNet-101) | 2023-05-26 |
| Ensemble Knowledge Distillation for Learning Improved and Efficient Networks | ✓ Link | 78.79 | 56.9M | | ADLIK-MO-P25 (T: SENet-154 + ResNet-152b, S: ResNet-50 pruned 25%) | 2019-09-17 |
| Ensemble Knowledge Distillation for Learning Improved and Efficient Networks | ✓ Link | 78.07 | 40.5M | | ADLIK-MO-P375 (T: SENet-154 + ResNet-152b, S: ResNet-50 pruned 37.5%) | 2019-09-17 |
| Improving Knowledge Distillation via Regularizing Feature Norm and Direction | ✓ Link | 77.48 | | | KD++ (T: ResNet-152, S: ResNet-50) | 2023-05-26 |
| SpectralKD: A Unified Framework for Interpreting and Distilling Vision Transformers via Spectral Analysis | ✓ Link | 77.4 | 6M | | SpectralKD (T: CaiT-S24, S: DeiT-T) | 2024-12-26 |
| Understanding the Role of the Projector in Knowledge Distillation | ✓ Link | 77.2 | 6M | | SRD (T: RegNetY-160, S: DeiT-Ti) | 2023-03-20 |
| Distilling the Knowledge in a Neural Network | ✓ Link | 77.14 | 99M | | ADLIK-MO (T: ResNet-101, S: ResNet-50) | 2015-03-09 |
| Knowledge Distillation Based on Transformed Teacher Matching | ✓ Link | 77.03 | | | WTTM (T: DeiT III-Small, S: DeiT-Tiny) | 2024-02-17 |
| Ensemble Knowledge Distillation for Learning Improved and Efficient Networks | ✓ Link | 76.376 | 27M | | ADLIK-MO-P50 (T: SENet-154 + ResNet-152b, S: ResNet-50-half) | 2019-09-17 |
| Improving Knowledge Distillation via Regularizing Feature Norm and Direction | ✓ Link | 75.53 | | | KD++ (T: ResNet-152, S: ResNet-34) | 2023-05-26 |
| Knowledge Distillation Based on Transformed Teacher Matching | ✓ Link | 73.09 | | | WTTM (T: ResNet-50, S: MobileNet-V1) | 2024-02-17 |
| Improving Knowledge Distillation via Regularizing Feature Norm and Direction | ✓ Link | 72.96 | | | ReviewKD++ (T: ResNet-50, S: MobileNet-V1) | 2023-05-26 |
| Improving Knowledge Distillation via Regularizing Feature Norm and Direction | ✓ Link | 72.54 | | | KD++ (T: ResNet-152, S: ResNet-18) | 2023-05-26 |
| Improving Knowledge Distillation via Regularizing Feature Norm and Direction | ✓ Link | 72.54 | | | KD++ (T: ResNet-101, S: ResNet-18) | 2023-05-26 |
| Improving Knowledge Distillation via Regularizing Feature Norm and Direction | ✓ Link | 72.53 | | | KD++ (T: ResNet-50, S: ResNet-18) | 2023-05-26 |
| Hierarchical Self-supervised Augmented Knowledge Distillation | ✓ Link | 72.39 | | | HSAKD (T: ResNet-34, S: ResNet-18) | 2021-07-29 |
| Exploring Inter-Channel Correlation for Diversity-Preserved Knowledge Distillation | ✓ Link | 72.19 | | | ICKD (T: ResNet-34, S: ResNet-18) | 2021-01-01 |
| Knowledge Distillation Based on Transformed Teacher Matching | ✓ Link | 72.19 | | | WTTM (T: ResNet-34, S: ResNet-18) | 2024-02-17 |
| Knowledge Distillation from A Stronger Teacher | ✓ Link | 72.07 | | | DIST (T: ResNet-34, S: ResNet-18) | 2022-05-21 |
| Improving Knowledge Distillation via Regularizing Feature Norm and Direction | ✓ Link | 72.07 | | | KD++ (T: ResNet-34, S: ResNet-18) | 2023-05-26 |
| Rethinking Soft Labels for Knowledge Distillation: A Bias-Variance Tradeoff Perspective | ✓ Link | 72.04 | | | WSL (T: ResNet-34, S: ResNet-18) | 2021-02-01 |
| Complementary Relation Contrastive Distillation | ✓ Link | 71.96 | | | CRCD (T: ResNet-34, S: ResNet-18) | 2021-03-29 |
| Understanding the Role of the Projector in Knowledge Distillation | ✓ Link | 71.87 | | | SRD (T: ResNet-34, S: ResNet-18) | 2023-03-20 |
| Improving Knowledge Distillation via Regularizing Feature Norm and Direction | ✓ Link | 71.84 | | | KD++ (T: ViT-B, S: ResNet-18) | 2023-05-26 |
| Distilling Knowledge by Mimicking Features | ✓ Link | 71.72 | | | LSHFM (T: ResNet-34, S: ResNet-18) | 2020-11-03 |
| Information Theoretic Representation Distillation | ✓ Link | 71.68 | 11.69M | | ITRD (T: ResNet-34, S: ResNet-18) | 2021-12-01 |
| Distilling Global and Local Logits With Densely Connected Relations | ✓ Link | 71.63 | | | GLD (T: ResNet-34, S: ResNet-18) | 2021-01-01 |
| Knowledge Distillation Meets Self-Supervision | ✓ Link | 71.62 | | | SSKD (T: ResNet-34, S: ResNet-18) | 2020-06-12 |
| Distilling Knowledge via Knowledge Review | ✓ Link | 71.61 | | | Knowledge Review (T: ResNet-34, S: ResNet-18) | 2021-04-19 |
| Adaptive Distillation: Aggregating Knowledge from Multiple Paths for Efficient Distillation | ✓ Link | 71.61 | | | Adaptive (T: ResNet-50, S: ResNet-18) | 2021-10-19 |
| Improving Knowledge Distillation via Regularizing Feature Norm and Direction | ✓ Link | 71.46 | | | KD++ (T: ViT-S, S: ResNet-18) | 2023-05-26 |
| Show, Attend and Distill: Knowledge Distillation via Attention-based Feature Matching | ✓ Link | 71.38 | | | AFD (T: ResNet-34, S: ResNet-18) | 2021-02-05 |
| Contrastive Representation Distillation | ✓ Link | 71.38 | | | CRD (T: ResNet-34, S: ResNet-18) | 2019-10-23 |
| A Comprehensive Overhaul of Feature Distillation | ✓ Link | 70.81 | | | Overhaul (T: ResNet-34, S: ResNet-18) | 2019-04-03 |
| Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer | ✓ Link | 70.70 | | | AT (T: ResNet-34, S: ResNet-18) | 2016-12-12 |
| Distilling the Knowledge in a Neural Network | ✓ Link | 70.66 | | | KD (T: ResNet-34, S: ResNet-18) | 2015-03-09 |
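Many of the logit-based entries above (KD, DIST, WTTM, KD++) build on the soft-target objective from "Distilling the Knowledge in a Neural Network". The sketch below is a minimal, illustrative PyTorch version of that classic loss, not the official code of any listed method; the temperature and weighting values are assumptions for the example rather than settings reported on this leaderboard.

```python
# Minimal sketch of the classic soft-target distillation loss
# (Hinton et al., 2015). T and alpha below are illustrative assumptions,
# not values used by any leaderboard entry.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Weighted sum of softened teacher matching and standard cross-entropy."""
    # KL divergence between temperature-softened distributions; the T^2 factor
    # keeps gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Example with random tensors standing in for a teacher/student pair,
# e.g. T: ResNet-34, S: ResNet-18 over the 1000 ImageNet classes.
if __name__ == "__main__":
    student_logits = torch.randn(8, 1000)
    teacher_logits = torch.randn(8, 1000)
    labels = torch.randint(0, 1000, (8,))
    print(kd_loss(student_logits, teacher_logits, labels))
```

Feature-based methods on the leaderboard (e.g. CRD, Knowledge Review, SRD, VkD) replace or supplement this logit-matching term with losses on intermediate representations.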