Paper | Code | Accuracy (%) | Model | Date
--- | --- | --- | --- | ---
UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark | ✓ Link | 83.2 | Unicorn 11B (fine-tuned) | 2021-03-24
MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | ✓ Link | 82.5 | LLaMA-2 13B + MixLoRA | 2024-04-22 |
Task Compass: Scaling Multi-task Pre-training with Task Prefix | ✓ Link | 82.2 | CompassMTL 567M with Tailor | 2022-10-12 |
Task Compass: Scaling Multi-task Pre-training with Task Prefix | ✓ Link | 81.7 | CompassMTL 567M | 2022-10-12 |
Mixture-of-Subspaces in Low-Rank Adaptation | ✓ Link | 81.0 | LLaMA-3 8B+MoSLoRA (fine-tuned) | 2024-06-16 |
Two is Better than Many? Binary Classification as an Effective Approach to Multi-Choice Question Answering | ✓ Link | 80.2 | DeBERTa-Large 304M | 2022-10-29 |
Two is Better than Many? Binary Classification as an Effective Approach to Multi-Choice Question Answering | ✓ Link | 79.9 | DeBERTa-Large 304M (classification-based) | 2022-10-29 |
UnifiedQA: Crossing Format Boundaries With a Single QA System | ✓ Link | 79.8 | UnifiedQA 3B | 2020-05-02 |
Task Compass: Scaling Multi-task Pre-training with Task Prefix | ✓ Link | 79.6 | ExDeBERTa 567M | 2022-10-12 |
MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | ✓ Link | 78.8 | LLaMA-3 8B + MixLoRA | 2024-04-22 |
MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | ✓ Link | 78.0 | LLaMA-2 7B + MixLoRA | 2024-04-22 |
RoBERTa: A Robustly Optimized BERT Pretraining Approach | ✓ Link | 76.7 | RoBERTa-Large 355M (fine-tuned) | 2019-07-26 |
SocialIQA: Commonsense Reasoning about Social Interactions | ✓ Link | 64.5 | BERT-large 340M (fine-tuned) | 2019-04-22 |
SocialIQA: Commonsense Reasoning about Social Interactions | ✓ Link | 63.1 | BERT-base 110M (fine-tuned) | 2019-04-22 |
SocialIQA: Commonsense Reasoning about Social Interactions | ✓ Link | 63.0 | GPT-1 117M (fine-tuned) | 2019-04-22 |
Textbooks Are All You Need II: phi-1.5 technical report | ✓ Link | 53.0 | phi-1.5-web 1.3B (zero-shot) | 2023-09-11 |
Textbooks Are All You Need II: phi-1.5 technical report | ✓ Link | 52.6 | phi-1.5 1.3B (zero-shot) | 2023-09-11 |
LLaMA: Open and Efficient Foundation Language Models | ✓ Link | 52.3 | LLaMA 65B (zero-shot) | 2023-02-27 |
Training Compute-Optimal Large Language Models | ✓ Link | 51.3 | Chinchilla (zero-shot) | 2022-03-29 |
Scaling Language Models: Methods, Analysis & Insights from Training Gopher | ✓ Link | 50.6 | Gopher (zero-shot) | 2021-12-08 |
LLaMA: Open and Efficient Foundation Language Models | ✓ Link | 50.4 | LLaMA 13B (zero-shot) | 2023-02-27 |
LLaMA: Open and Efficient Foundation Language Models | ✓ Link | 50.4 | LLaMA 33B (zero-shot) | 2023-02-27 |
LLaMA: Open and Efficient Foundation Language Models | ✓ Link | 48.9 | LLaMA 7B (zero-shot) | 2023-02-27 |
SocialIQA: Commonsense Reasoning about Social Interactions | ✓ Link | 33.3 | Random chance baseline | 2019-04-22 |
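
SocialIQA is three-way multiple choice, which is why the random-chance baseline sits at 1/3 ≈ 33.3%. The zero-shot entries above (LLaMA, Chinchilla, Gopher, phi-1.5) are typically produced by likelihood ranking: score each candidate answer under the language model and pick the highest. Below is a minimal sketch of that setup, not any listed paper's exact harness; the model name (`gpt2`), prompt template, and 100-example slice are illustrative assumptions.

```python
# Sketch: zero-shot SocialIQA accuracy via average answer log-likelihood.
# Assumptions: the Hugging Face "social_i_qa" dataset id, a small causal LM,
# and that the prompt's token ids are a prefix of the prompt+answer token ids
# (true for simple BPE prompts like this, but not guaranteed in general).
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: stand-in for the far larger models above
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

@torch.no_grad()
def answer_logprob(context: str, question: str, answer: str) -> float:
    """Average per-token log-likelihood of `answer` given context + question."""
    prompt = f"{context} {question}"
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    full_ids = tok(prompt + " " + answer, return_tensors="pt").input_ids
    logits = model(full_ids).logits
    n_prompt = prompt_ids.shape[1]
    # Logits at position t predict token t+1, so score only the answer span.
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = full_ids[0, n_prompt:]
    scores = logprobs[n_prompt - 1 :, :].gather(1, targets[:, None])
    return scores.mean().item()

data = load_dataset("social_i_qa", split="validation")
n, correct = 100, 0  # small slice to keep the sketch fast
for ex in data.select(range(n)):
    choices = [ex["answerA"], ex["answerB"], ex["answerC"]]
    pred = max(range(3), key=lambda i: answer_logprob(ex["context"], ex["question"], choices[i]))
    correct += pred == int(ex["label"]) - 1  # labels are the strings "1".."3"
print(f"accuracy: {correct / n:.3f} (random chance on 3 options is 1/3 ≈ 33.3%)")
```

The fine-tuned entries (Unicorn, MixLoRA, DeBERTa, RoBERTa, BERT) instead train on the SocialIQA training split, which is why they cluster 25-30 points above the zero-shot group.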