OpenCodePapers

Question Answering on Social IQa

Question Answering
Leaderboard
| Paper | Code | Accuracy (%) | Model Name | Release Date |
|---|---|---|---|---|
| UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark | ✓ | 83.2 | Unicorn 11B (fine-tuned) | 2021-03-24 |
| MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | ✓ | 82.5 | LLaMA-2 13B + MixLoRA | 2024-04-22 |
| Task Compass: Scaling Multi-task Pre-training with Task Prefix | ✓ | 82.2 | CompassMTL 567M with Tailor | 2022-10-12 |
| Task Compass: Scaling Multi-task Pre-training with Task Prefix | ✓ | 81.7 | CompassMTL 567M | 2022-10-12 |
| Mixture-of-Subspaces in Low-Rank Adaptation | ✓ | 81.0 | LLaMA-3 8B+MoSLoRA (fine-tuned) | 2024-06-16 |
| Two is Better than Many? Binary Classification as an Effective Approach to Multi-Choice Question Answering | ✓ | 80.2 | DeBERTa-Large 304M | 2022-10-29 |
| Two is Better than Many? Binary Classification as an Effective Approach to Multi-Choice Question Answering | ✓ | 79.9 | DeBERTa-Large 304M (classification-based) | 2022-10-29 |
| UnifiedQA: Crossing Format Boundaries With a Single QA System | ✓ | 79.8 | UnifiedQA 3B | 2020-05-02 |
| Task Compass: Scaling Multi-task Pre-training with Task Prefix | ✓ | 79.6 | ExDeBERTa 567M | 2022-10-12 |
| MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | ✓ | 78.8 | LLaMA-3 8B + MixLoRA | 2024-04-22 |
| MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | ✓ | 78 | LLaMA-2 7B + MixLoRA | 2024-04-22 |
| RoBERTa: A Robustly Optimized BERT Pretraining Approach | ✓ | 76.7 | RoBERTa-Large 355M (fine-tuned) | 2019-07-26 |
| SocialIQA: Commonsense Reasoning about Social Interactions | ✓ | 64.5 | BERT-large 340M (fine-tuned) | 2019-04-22 |
| SocialIQA: Commonsense Reasoning about Social Interactions | ✓ | 63.1 | BERT-base 110M (fine-tuned) | 2019-04-22 |
| SocialIQA: Commonsense Reasoning about Social Interactions | ✓ | 63 | GPT-1 117M (fine-tuned) | 2019-04-22 |
| Textbooks Are All You Need II: phi-1.5 technical report | ✓ | 53.0 | phi-1.5-web 1.3B (zero-shot) | 2023-09-11 |
| Textbooks Are All You Need II: phi-1.5 technical report | ✓ | 52.6 | phi-1.5 1.3B (zero-shot) | 2023-09-11 |
| LLaMA: Open and Efficient Foundation Language Models | ✓ | 52.3 | LLaMA 65B (zero-shot) | 2023-02-27 |
| Training Compute-Optimal Large Language Models | ✓ | 51.3 | Chinchilla (zero-shot) | 2022-03-29 |
| Scaling Language Models: Methods, Analysis & Insights from Training Gopher | ✓ | 50.6 | Gopher (zero-shot) | 2021-12-08 |
| LLaMA: Open and Efficient Foundation Language Models | ✓ | 50.4 | LLaMA 13B (zero-shot) | 2023-02-27 |
| LLaMA: Open and Efficient Foundation Language Models | ✓ | 50.4 | LLaMA 33B (zero-shot) | 2023-02-27 |
| LLaMA: Open and Efficient Foundation Language Models | ✓ | 48.9 | LLaMA 7B (zero-shot) | 2023-02-27 |
| SocialIQA: Commonsense Reasoning about Social Interactions | ✓ | 33.3 | Random chance baseline | 2019-04-22 |