Paper | Code | Top-1 Accuracy (%) | Model Name | Release Date |
---|---|---|---|---|
Scaling Vision with Sparse Mixture of Experts | ✓ Link | 82.78 | V-MoE-15B (Every-2) | 2021-06-10 |
The effectiveness of MAE pre-pretraining for billion-scale pretraining | ✓ Link | 82.6 | MAWS (ViT-6.5B) | 2023-03-23 |
The effectiveness of MAE pre-pretraining for billion-scale pretraining | ✓ Link | 81.5 | MAWS (ViT-2B) | 2023-03-23 |
The effectiveness of MAE pre-pretraining for billion-scale pretraining | ✓ Link | 79.8 | MAWS (ViT-H) | 2023-03-23 |
Scaling Vision with Sparse Mixture of Experts | ✓ Link | 78.21 | V-MoE-H/14 (Every-2) | 2021-06-10 |
Scaling Vision with Sparse Mixture of Experts | ✓ Link | 78.08 | V-MoE-H/14 (Last-5) | 2021-06-10 |
Scaling Vision with Sparse Mixture of Experts | ✓ Link | 77.1 | V-MoE-L/16 (Every-2) | 2021-06-10 |
Scaling Vision with Sparse Mixture of Experts | ✓ Link | 76.95 | ViT-H/14 | 2021-06-10 |