OpenCodePapers

Common Sense Reasoning on ReCoRD

Common Sense Reasoning
Leaderboard
| Paper | Code | EM | F1 | Model Name | Release Date |
| --- | --- | --- | --- | --- | --- |
| Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | | 95.9 | 96.4 | Turing NLR v5 XXL 5.4B (fine-tuned) | 2022-12-04 |
| ST-MoE: Designing Stable and Transferable Sparse Expert Models | ✓ | 95.1 | | ST-MoE-32B 269B (fine-tuned) | 2022-02-17 |
| DeBERTa: Decoding-enhanced BERT with Disentangled Attention | ✓ | 94.1 | 94.5 | DeBERTa-1.5B | 2020-06-05 |
| PaLM: Scaling Language Modeling with Pathways | ✓ | 94.0 | 94.6 | PaLM 540B (finetuned) | 2022-04-05 |
| Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | | 93.9 | 94.4 | Vega v2 6B (fine-tuned) | 2022-12-04 |
| Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | ✓ | 93.4 | | T5-XXL 11B (fine-tuned) | 2019-10-23 |
| Integrating a Heterogeneous Graph with Entity-aware Self-attention using Relative Position Labels for Reading Comprehension Model | | 91.7 | 92.2 | GESA 500M | 2023-07-19 |
| LUKE-Graph: A Transformer-based Approach with Gated Relational Graph Attention for Cloze-style Reading Comprehension | | 91.2 | 91.5 | LUKE-Graph | 2023-03-12 |
| | | 90.640 | 91.209 | LUKE (single model) | |
| LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention | ✓ | 90.6 | 91.2 | LUKE 483M | 2020-10-02 |
| KELM: Knowledge Enhanced Pre-Trained Language Representations with Message Passing on Hierarchical Relational Graphs | ✓ | 89.1 | 89.6 | KELM (finetuning RoBERTa-large based single model) | 2021-09-09 |
| ST-MoE: Designing Stable and Transferable Sparse Expert Models | ✓ | 88.9 | | ST-MoE-L 4.1B (fine-tuned) | 2022-02-17 |
| Finetuned Language Models Are Zero-Shot Learners | ✓ | 85.1 | | FLAN 137B (prompt-tuned) | 2021-09-03 |
| | | 83.090 | 83.737 | XLNet + MTL + Verifier (ensemble) | |
| Language Models are Few-Shot Learners | ✓ | 82.1 | | GPT-3 Large 760M (0-shot) | 2020-05-28 |
| | | 81.780 | 82.584 | CSRLM (single model) | |
| Pingan Smart Health and SJTU at COIN - Shared Task: utilizing Pre-trained Language Models and Common-sense Knowledge in Machine Reading Tasks | | 81.5 | 82.7 | XLNet + Verifier | 2019-11-01 |
| | | 81.460 | 82.664 | XLNet + MTL + Verifier (single model) | |
| Efficient Language Modeling with Sparse all-MLP | | 79.9 | | Switch Transformer 9B | 2022-03-14 |
| | | 79.480 | 80.038 | SKG-NET (single model) | |
| KELM: Knowledge Enhanced Pre-Trained Language Representations with Message Passing on Hierarchical Relational Graphs | ✓ | 76.2 | 76.7 | KELM (finetuning BERT-large based single model) | 2021-09-09 |
| Efficient Language Modeling with Sparse all-MLP | | 73.4 | | sMLP – deterministic 9.4B (0-shot) | 2022-03-14 |
| Finetuned Language Models Are Zero-Shot Learners | ✓ | 72.5 | | FLAN 137B (zero-shot) | 2021-09-03 |
| Efficient Language Modeling with Sparse all-MLP | | 72.4 | | Gshard 9B | 2022-03-14 |
| | | 72.240 | 72.778 | SKG-BERT (single model) | |
| | | 71.600 | 73.620 | KT-NET (single model) | |
| | | 69.490 | 71.138 | DCReader+BERT (single model) | |
| Efficient Language Modeling with Sparse all-MLP | | 67.2 | | HASH Layers 10B (0-shot) | 2022-03-14 |
| | | 60.800 | 62.986 | GraphBert (single) | |
| Efficient Language Modeling with Sparse all-MLP | | 60.7 | | Base Layers 10B (0-shot) | 2022-03-14 |
| | | 59.860 | 61.885 | GraphBert-WordNet (single) | |
| | | 59.410 | 61.515 | GraphBert-NELL (single) | |
| BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | ✓ | 54.040 | 56.065 | BERT-Base (single model) | 2018-10-11 |
| ReCoRD: Bridging the Gap between Human and Machine Commonsense Reading Comprehension | | 45.4 | 46.7 | DocQA + ELMo | 2018-10-30 |
| N-Grammer: Augmenting Transformers with latent n-grams | ✓ | 28.9 | 29.9 | N-Grammer 343M | 2022-07-13 |
| Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | ✓ | | 94.1 | T5-11B | 2019-10-23 |
| PaLM 2 Technical Report | ✓ | | 93.8 | PaLM 2-L (one-shot) | 2023-05-17 |
| PaLM 2 Technical Report | ✓ | | 92.4 | PaLM 2-M (one-shot) | 2023-05-17 |
| PaLM 2 Technical Report | ✓ | | 92.1 | PaLM 2-S (one-shot) | 2023-05-17 |
| Large Language Models are Zero-Shot Reasoners | ✓ | | 90.2 | GPT-3 175B (one-shot) | 2022-05-24 |
| AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model | ✓ | | 88.4 | AlexaTM 20B | 2022-08-02 |
| BloombergGPT: A Large Language Model for Finance | ✓ | | 82.8 | Bloomberg GPT 50B (1-shot) | 2023-03-30 |
| BloombergGPT: A Large Language Model for Finance | ✓ | | 82.5 | OPT 66B (1-shot) | 2023-03-30 |
| BloombergGPT: A Large Language Model for Finance | ✓ | | 78 | BLOOM 176B (1-shot) | 2023-03-30 |
| BloombergGPT: A Large Language Model for Finance | ✓ | | 67.9 | GPT-NeoX 20B (1-shot) | 2023-03-30 |
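The leaderboard reports two metrics, EM (exact match) and F1, without defining them. The sketch below shows one common way such scores are computed for cloze-style reading comprehension. It assumes SQuAD-style answer normalization and token-overlap F1, with the best score taken over all gold answers for each query. The function names (`normalize`, `exact_match`, `token_f1`, `score`) are illustrative and are not taken from this page or from any specific evaluation script, so individual entries above may have been scored with slightly different conventions.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; not confirmed by the leaderboard.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction, gold):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def best_over_golds(metric, prediction, golds):
    """Score a prediction against every gold answer and keep the best."""
    return max(metric(prediction, gold) for gold in golds)


def score(predictions, references):
    """predictions: list of str; references: list of lists of gold strings.

    Returns EM and F1 as percentages averaged over all queries.
    """
    n = len(predictions)
    pairs = list(zip(predictions, references))
    em = sum(best_over_golds(exact_match, p, g) for p, g in pairs)
    f1 = sum(best_over_golds(token_f1, p, g) for p, g in pairs)
    return {"EM": 100.0 * em / n, "F1": 100.0 * f1 / n}


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    preds = ["The White House", "Barack Obama"]
    golds = [["White House"], ["Obama", "President Obama"]]
    print(score(preds, golds))
```

Under these definitions F1 is never lower than EM for the same predictions, because an exact match also has full token overlap. This is consistent with every row above that reports both metrics.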