Paper | Code | F1 | EM | Model | Date
--- | --- | --- | --- | --- | ---
PaLM: Scaling Language Modeling with Pathways | ✓ Link | 90.1 | 69.2 | PaLM 540B (fine-tuned) | 2022-04-05
ST-MoE: Designing Stable and Transferable Sparse Expert Models | ✓ Link | 89.6 | | ST-MoE-32B 269B (fine-tuned) | 2022-02-17 |
Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | | 88.4 | 63.0 | Turing NLR v5 XXL 5.4B (fine-tuned) | 2022-12-04
DeBERTa: Decoding-enhanced BERT with Disentangled Attention | ✓ Link | 88.2 | 63.7 | DeBERTa-1.5B | 2020-06-05 |
Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | | 88.2 | 62.4 | Vega v2 6B (fine-tuned) | 2022-12-04 |
PaLM 2 Technical Report | ✓ Link | 88.2 | | PaLM 2-L (one-shot) | 2023-05-17 |
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | ✓ Link | 88.1 | | T5-XXL 11B (fine-tuned) | 2019-10-23 |
ST-MoE: Designing Stable and Transferable Sparse Expert Models | ✓ Link | 86.0 | | ST-MoE-L 4.1B (fine-tuned) | 2022-02-17
PaLM 2 Technical Report | ✓ Link | 84.1 | | PaLM 2-M (one-shot) | 2023-05-17 |
PaLM 2 Technical Report | ✓ Link | 84.0 | | PaLM 2-S (one-shot) | 2023-05-17 |
Finetuned Language Models Are Zero-Shot Learners | ✓ Link | 83.4 | | FLAN 137B (prompt-tuned) | 2021-09-03 |
Finetuned Language Models Are Zero-Shot Learners | ✓ Link | 77.5 | | FLAN 137B (zero-shot) | 2021-09-03 |
Language Models are Few-Shot Learners | ✓ Link | 75.4 | | GPT-3 175B (few-shot) | 2020-05-28
Finetuned Language Models Are Zero-Shot Learners | ✓ Link | 72.1 | | FLAN 137B (1-shot) | 2021-09-03 |
KELM: Knowledge Enhanced Pre-Trained Language Representations with Message Passing on Hierarchical Relational Graphs | ✓ Link | 70.8 | 27.2 | KELM (fine-tuned BERT-large, single model) | 2021-09-09
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | ✓ Link | 70.0 | 24.1 | BERT-large (single model) | 2018-10-11
Ask Me Anything: A simple strategy for prompting language models | ✓ Link | 63.8 | | Neo-6B (QA + WS) | 2022-10-05 |
BloombergGPT: A Large Language Model for Finance | ✓ Link | 62.3 | | BloombergGPT 50B (1-shot) | 2023-03-30
N-Grammer: Augmenting Transformers with latent n-grams | ✓ Link | 62.0 | 11.3 | N-Grammer 343M | 2022-07-13
Ask Me Anything: A simple strategy for prompting language models | ✓ Link | 60.8 | | Neo-6B (few-shot) | 2022-10-05 |
AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model | ✓ Link | 59.6 | | AlexaTM 20B | 2022-08-02 |
Ask Me Anything: A simple strategy for prompting language models | ✓ Link | 58.8 | | Neo-6B (QA) | 2022-10-05 |
BloombergGPT: A Large Language Model for Finance | ✓ Link | 26.7 | | BLOOM 176B (1-shot) | 2023-03-30 |
BloombergGPT: A Large Language Model for Finance | ✓ Link | 22.9 | | GPT-NeoX 20B (1-shot) | 2023-03-30 |
BloombergGPT: A Large Language Model for Finance | ✓ Link | 18.8 | | OPT 66B (1-shot) | 2023-03-30 |
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | ✓ Link | | 63.3 | T5-11B | 2019-10-23 |
Hungry Hungry Hippos: Towards Language Modeling with State Space Models | ✓ Link | | 59.7 | Hybrid H3 355M (3-shot, logit scoring) | 2022-12-28 |
Hungry Hungry Hippos: Towards Language Modeling with State Space Models | ✓ Link | | 59.5 | Hybrid H3 355M (0-shot, logit scoring) | 2022-12-28 |
Hungry Hungry Hippos: Towards Language Modeling with State Space Models | ✓ Link | | 51.4 | Hybrid H3 125M (0-shot, logit scoring) | 2022-12-28 |
Hungry Hungry Hippos: Towards Language Modeling with State Space Models | ✓ Link | | 48.9 | Hybrid H3 125M (3-shot, logit scoring) | 2022-12-28 |