Paper | Code | EM | F1 | | | | Model | Date
--- | --- | --- | --- | --- | --- | --- | --- | ---
[]() | | 90.622 | 95.719 | | | | ANNA (single model) | |
LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention | ✓ Link | 90.202 | 95.379 | | | | LUKE (single model) | 2020-10-02 |
XLNet: Generalized Autoregressive Pretraining for Language Understanding | ✓ Link | 89.898 | 95.080 | 46449G | | | XLNet (single model) | 2019-06-19 |
[]() | | 89.856 | 94.903 | | | | XLNET-123++ (single model) | |
[]() | | 89.709 | 94.859 | | | | XLNET-123+ (single model) | |
[]() | | 89.646 | 94.930 | | | | XLNET-123 (single model) | |
[]() | | 88.912 | 94.584 | | | | BERTSP (single model) | |
SpanBERT: Improving Pre-training by Representing and Predicting Spans | ✓ Link | 88.8 | 94.6 | 586G | | | SpanBERT (single model) | 2019-07-24 |
[]() | | 88.650 | 94.393 | | | | BERT+WWM+MT (single model) | |
[]() | | 87.465 | 93.294 | | | | Tuned BERT-1seq Large Cased (single model) | |
LinkBERT: Pretraining Language Models with Document Links | ✓ Link | 87.45 | 92.7 | | | | LinkBERT (large) | 2022-03-29 |
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | ✓ Link | 87.433 | 93.160 | | | | BERT (ensemble) | 2018-10-11 |
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | ✓ Link | 87.4 | 93.2 | | | | BERT-LARGE (Ensemble+TriviaQA) | 2018-10-11 |
[]() | | 86.940 | 92.641 | | | | ATB (single model) | |
[]() | | 86.521 | 92.617 | | | | Tuned BERT Large Cased (single model) | |
[]() | | 86.458 | 92.645 | | | | BERT+MT (single model) | |
[]() | | 85.944 | 92.425 | | | | Knowledge-enhanced BERT (single model) | |
[]() | | 85.944 | 92.425 | | | | KT-NET (single model) | |
[]() | | 85.430 | 91.976 | | | | ST_bl | |
[]() | | 85.356 | 91.202 | | | | nlnet (ensemble) | |
[]() | | 85.335 | 91.807 | | | | EL-BERT (single model) | |
[]() | | 85.314 | 91.756 | | | | BISAN (single model) | |
[]() | | 85.125 | 91.623 | | | | BERT+Sparse-Transformer | |
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | ✓ Link | 85.083 | 91.835 | | | | BERT (single model) | 2018-10-11 |
[]() | | 84.978 | 92.019 | | | | DPN (single model) | |
[]() | | 84.926 | 91.932 | | | | BERT-uncased (single model) | |
[]() | | 84.402 | 90.561 | | | | WD (single model) | |
[]() | | 84.328 | 91.281 | | | | Original BERT Large Cased (single model) | |
[]() | | 83.982 | 89.796 | | | | MARS (ensemble) | |
[]() | | 83.930 | 90.613 | | | | Common-sense Governed BERT-123 (single model) | |
[]() | | 83.804 | 90.429 | | | | WD1 (single model) | |
[]() | | 83.468 | 90.133 | | | | nlnet (single model) | |
[]() | | 83.426 | 89.218 | | | | Pytalk + Stanza + BERT (single model) | |
[]() | | 82.849 | 88.764 | | | | Reinforced Mnemonic Reader + A2D (ensemble model) | |
[]() | | 82.681 | 89.379 | | | | BERT-Base mod (single model) | |
[]() | | 82.650 | 88.493 | | | | r-net+ (ensemble) | |
[]() | | 82.482 | 89.281 | | | | Hybrid AoA Reader (ensemble) | |
[]() | | 82.471 | 89.306 | | | | QANet (single) | |
[]() | | 82.440 | 88.607 | | | | SLQA+ (ensemble) | |
Reinforced Mnemonic Reader for Machine Reading Comprehension | ✓ Link | 82.283 | 88.533 | | | | Reinforced Mnemonic Reader (ensemble model) | 2017-05-08 |
[]() | | 82.136 | 88.126 | | | | r-net (ensemble) | |
[]() | | 82.062 | 88.947 | | | | BERT (single model) | |
[]() | | 81.790 | 88.163 | | | | AttentionReader+ (ensemble) | |
[]() | | 81.580 | 88.948 | | | | MMIPN | |
Information Theoretic Representation Distillation | ✓ Link | 81.5 | 88.5 | | | | BERT - 6 Layers | 2021-12-01 |
[]() | | 81.496 | 87.557 | | | | KACTEIL-MRC(GF-Net+) (ensemble) | |
[]() | | 81.401 | 88.122 | | | | Reinforced Mnemonic Reader + A2D + DA (single model) | |
[]() | | 81.307 | 88.909 | | | | ARSG-BERT (single model) | |
[]() | | 81.045 | 87.999 | | | | BERT-COMPOUND-DSS (single model) | |
Deep contextualized word representations | ✓ Link | 81.003 | 87.432 | | | | BiDAF + Self Attention + ELMo (ensemble) | 2018-02-15 |
[]() | | 80.720 | 87.758 | | | | BERT-COMPOUND (single model) | |
[]() | | 80.667 | 88.169 | | | | mBERT + Task Adapter (Single) | |
[]() | | 80.615 | 87.311 | | | | AVIQA+ (ensemble) | |
[]() | | 80.489 | 87.454 | | | | Reinforced Mnemonic Reader + A2D (single model) | |
[]() | | 80.436 | 87.021 | | | | SLQA+ | |
[]() | | 80.436 | 86.912 | | | | EAZI (ensemble) | |
[]() | | 80.426 | 86.912 | | | | EAZI+ (ensemble) | |
[]() | | 80.164 | 86.721 | | | | DNET (ensemble) | |
[]() | | 80.027 | 87.288 | | | | Hybrid AoA Reader (single model) | |
[]() | | 79.996 | 86.711 | | | | BiDAF + Self Attention + ELMo + A2D (single model) | |
[]() | | 79.901 | 86.536 | | | | r-net+ (single model) | |
[]() | | 79.859 | 88.263 | | | | batch (single model) | |
A Multi-Stage Memory Augmented Neural Network for Machine Reading Comprehension | | 79.692 | 86.727 | | | | MAMCN+ (single model) | 2018-07-01 |
Stochastic Answer Networks for Machine Reading Comprehension | ✓ Link | 79.608 | 86.496 | | | | SAN (ensemble model) | 2017-12-10 |
[]() | | 79.597 | 87.374 | | | | BERT-INDEPENDENT-DSS-FILTERED (single model) | |
Reinforced Mnemonic Reader for Machine Reading Comprehension | ✓ Link | 79.545 | 86.654 | | | | Reinforced Mnemonic Reader (single model) | 2017-05-08 |
[]() | | 79.199 | 86.590 | | | | SLQA+ (single model) | |
[]() | | 79.083 | 86.450 | | | | Interactive AoA Reader+ (ensemble) | |
[]() | | 79.083 | 86.288 | | | | MIR-MRC(F-Net) (single model) | |
[]() | | 79.083 | 86.288 | | | | KACTEIL-MRC(GF-Net+Distillation) (single model) | |
[]() | | 79.031 | 86.006 | | | | MDReader | |
FusionNet: Fusing via Fully-Aware Attention with Application to Machine Comprehension | ✓ Link | 78.978 | 86.016 | | | | FusionNet (ensemble) | 2017-11-16 |
DCN+: Mixed Objective and Deep Residual Coattention for Question Answering | ✓ Link | 78.852 | 85.996 | | | | DCN+ (ensemble) | 2017-10-31 |
[]() | | 78.664 | 85.780 | | | | KACTEIL-MRC(GF-Net+) (single model) | |
[]() | | 78.653 | 86.663 | | | | BERT-INDEPENDENT (single model) | |
Deep contextualized word representations | ✓ Link | 78.58 | 85.833 | | | | BiDAF + Self Attention + ELMo (single model) | 2018-02-15 |
[]() | | 78.496 | 85.469 | | | | aviqa (ensemble) | |
[]() | | 78.401 | 85.724 | | | | KakaoNet (single model) | |
[]() | | 78.328 | 85.682 | | | | SLQA (ensemble) | |
MEMEN: Multi-layer Embedding with Memory Networks for Machine Comprehension | | 78.234 | 85.344 | | | | MEMEN (single model) | 2017-07-28 |
[]() | | 78.223 | 85.535 | | | | BiDAF++ with pair2vec (single model) | |
[]() | | 78.171 | 85.543 | | | | MDReader0 | |
[]() | | 77.845 | 85.297 | | | | Interactive AoA Reader (ensemble) | |
Information Theoretic Representation Distillation | ✓ Link | 77.7 | 85.8 | | | | BERT - 3 Layers | 2021-12-01 |
[]() | | 77.646 | 84.905 | | | | DNET (single model) | |
Contextualized Word Representations for Reading Comprehension | ✓ Link | 77.583 | 84.163 | | | | RaSoR + TR + LM (single model) | 2017-12-10 |
[]() | | 77.573 | 84.858 | | | | BiDAF++ (single model) | |
[]() | | 77.342 | 84.925 | | | | AttentionReader+ (single) | |
[]() | | 77.237 | 84.466 | | | | Jenga (ensemble) | |
[]() | | 77.090 | 83.931 | | | | gqa (single model) | |
Phase Conductor on Multi-layered Attentions for Machine Comprehension | | 76.996 | 84.630 | | | | Conductor-net (ensemble) | 2017-10-28 |
[]() | | 76.859 | 84.739 | | | | MARS (single model) | |
Stochastic Answer Networks for Machine Reading Comprehension | ✓ Link | 76.828 | 84.396 | | | | SAN (single model) | 2017-12-10 |
[]() | | 76.775 | 84.491 | | | | VS^3-NET (single model) | |
Gated Self-Matching Networks for Reading Comprehension and Question Answering | | 76.461 | 84.265 | | | | r-net (single model) | 2017-07-01 |
[]() | | 76.240 | 84.599 | | | | FRC (single model) | |
QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension | ✓ Link | 76.2 | 84.6 | | | | QANet + data augmentation ×3 | 2018-04-23 |
[]() | | 76.146 | 83.991 | | | | Conductor-net (ensemble) | |
Explicit Utilization of General Knowledge in Machine Reading Comprehension | | 76.125 | 83.538 | | | | KAR (single model) | 2018-09-10 |
[]() | | 75.989 | 83.475 | | | | smarnet (ensemble) | |
FusionNet: Fusing via Fully-Aware Attention with Application to Machine Comprehension | ✓ Link | 75.968 | 83.900 | | | | FusionNet (single model) | 2017-11-16 |
[]() | | 75.926 | 83.305 | | | | AVIQA-v2 (single model) | |
[]() | | 75.821 | 83.843 | | | | Interactive AoA Reader+ (single model) | |
Contextualized Word Representations for Reading Comprehension | ✓ Link | 75.789 | 83.261 | | | | RaSoR + TR (single model) | 2017-12-10 |
MEMEN: Multi-layer Embedding with Memory Networks for Machine Comprehension | | 75.370 | 82.658 | | | | MEMEN (ensemble) | 2017-07-28 |
[]() | | 75.265 | 82.769 | | | | Mixed model (ensemble) | |
[]() | | 75.223 | 82.716 | | | | two-attention-self-attention (ensemble) | |
[]() | | 75.034 | 83.405 | | | | Kbs (single model) | |
ReasoNet: Learning to Stop Reading in Machine Comprehension | | 75.034 | 82.552 | | | | ReasoNet (ensemble) | 2016-09-17 |
EfficientQA : a RoBERTa Based Phrase-Indexed Question-Answering System | | 74.9 | 83.1 | | | | EfficientQA 125M | 2021-01-06 |
DCN+: Mixed Objective and Deep Residual Coattention for Question Answering | ✓ Link | 74.866 | 82.806 | | | | DCN+ (single model) | 2017-10-31 |
[]() | | 74.604 | 82.501 | | | | eeAttNet (single model) | |
[]() | | 74.489 | 82.815 | | | | SLQA (single model) | |
Phase Conductor on Multi-layered Attentions for Machine Comprehension | | 74.405 | 82.742 | | | | Conductor-net (single model) | 2017-10-28 |
Reinforced Mnemonic Reader for Machine Reading Comprehension | ✓ Link | 74.268 | 82.371 | | | | Mnemonic Reader (ensemble) | 2017-05-08 |
[]() | | 74.121 | 82.342 | | | | S^3-Net (ensemble) | |
Structural Embedding of Syntactic Trees for Machine Comprehension | | 74.090 | 81.761 | | | | SEDT (ensemble model) | 2017-03-02 |
[]() | | 74.080 | 81.665 | | | | SSAE (ensemble) | |
Multi-Perspective Context Matching for Machine Comprehension | ✓ Link | 73.765 | 81.257 | | | | Multi-Perspective Matching (ensemble) | 2016-12-13 |
Bidirectional Attention Flow for Machine Comprehension | ✓ Link | 73.744 | 81.525 | | | | BiDAF (ensemble) | 2016-11-05 |
Structural Embedding of Syntactic Trees for Machine Comprehension | | 73.723 | 81.530 | | | | SEDT+BiDAF (ensemble) | 2017-03-02 |
[]() | | 73.639 | 81.931 | | | | Interactive AoA Reader (single model) | |
[]() | | 73.303 | 81.754 | | | | Jenga (single model) | |
Phase Conductor on Multi-layered Attentions for Machine Comprehension | | 73.240 | 81.933 | | | | Conductor-net (single) | 2017-10-28 |
Exploring Question Understanding and Adaptation in Neural-Network-Based Question Answering | | 73.010 | 81.517 | | | | jNet (ensemble) | 2017-03-14 |
[]() | | 72.758 | 81.001 | | | | T-gating (ensemble) | |
[]() | | 72.600 | 81.011 | | | | two-attention-self-attention (single model) | |
[]() | | 72.590 | 81.415 | | | | Conductor-net (single) | |
[]() | | 72.485 | 80.550 | | | | AVIQA (single model) | |
Simple and Effective Multi-Paragraph Reading Comprehension | ✓ Link | 72.139 | 81.048 | | | | BiDAF + Self Attention (single model) | 2017-10-29 |
[]() | | 71.908 | 81.023 | | | | S^3-Net (single model) | |
[]() | | 71.898 | 79.989 | | | | QFASE | |
[]() | | 71.698 | 80.462 | | | | attention+self-attention (single model) | |
Dynamic Coattention Networks For Question Answering | ✓ Link | 71.625 | 80.383 | | | | Dynamic Coattention Networks (ensemble) | 2016-11-05 |
Smarnet: Teaching Machines to Read and Comprehend Like Human | | 71.415 | 80.160 | | | | smarnet (single model) | 2017-10-08 |
Simple Recurrent Units for Highly Parallelizable Recurrence | ✓ Link | 71.4 | 80.2 | 4G | | | SRU | 2017-09-08 |
[]() | | 71.373 | 79.725 | | | | AttReader (single) | |
Learned in Translation: Contextualized Word Vectors | ✓ Link | 71.3 | 79.9 | | | | DCN + Char + CoVe | 2017-08-01 |
[]() | | 71.016 | 79.835 | | | | M-NET (single) | |
Reinforced Mnemonic Reader for Machine Reading Comprehension | ✓ Link | 70.995 | 80.146 | | | | Mnemonic Reader (single model) | 2017-05-08 |
[]() | | 70.985 | 79.939 | | | | MAMCN (single model) | |
Making Neural QA as Simple as Possible but not Simpler | ✓ Link | 70.849 | 78.857 | | | | FastQAExt | 2017-03-14 |
Learning Recurrent Span Representations for Extractive Question Answering | ✓ Link | 70.849 | 78.741 | | | | RaSoR (single model) | 2016-11-04 |
Reading Wikipedia to Answer Open-Domain Questions | ✓ Link | 70.733 | 79.353 | | | | Document Reader (single model) | 2017-03-31 |
Ruminating Reader: Reasoning with Gated Multi-Hop Attention | | 70.639 | 79.456 | | | | Ruminating Reader (single model) | 2017-04-24 |
Exploring Question Understanding and Adaptation in Neural-Network-Based Question Answering | | 70.607 | 79.821 | | | | jNet (single model) | 2017-03-14 |
ReasoNet: Learning to Stop Reading in Machine Comprehension | | 70.555 | 79.364 | | | | ReasoNet (single model) | 2016-09-17 |
Multi-Perspective Context Matching for Machine Comprehension | ✓ Link | 70.387 | 78.784 | | | | Multi-Perspective Matching (single model) | 2016-12-13 |
[]() | | 69.600 | 78.236 | | | | SimpleBaseline (single model) | |
[]() | | 69.443 | 78.358 | | | | SSR-BiDAF | |
Structural Embedding of Syntactic Trees for Machine Comprehension | | 68.478 | 77.971 | | | | SEDT+BiDAF (single model) | 2017-03-02 |
Making Neural QA as Simple as Possible but not Simpler | ✓ Link | 68.436 | 77.070 | | | | FastQA | 2017-03-14 |
[]() | | 68.331 | 77.783 | | | | PQMN (single model) | |
Structural Embedding of Syntactic Trees for Machine Comprehension | | 68.163 | 77.527 | | | | SEDT (single model) | 2017-03-02 |
[]() | | 68.132 | 77.569 | | | | T-gating (single model) | |
Bidirectional Attention Flow for Machine Comprehension | ✓ Link | 67.974 | 77.323 | | | | BiDAF (single model) | 2016-11-05 |
Machine Comprehension Using Match-LSTM and Answer Pointer | ✓ Link | 67.901 | 77.022 | | | | Match-LSTM with Ans-Ptr (Boundary) (ensemble) | 2016-08-29 |
A Fully Attention-Based Information Retriever | ✓ Link | 67.744 | 77.605 | | | | FABIR | 2018-10-22 |
[]() | | 67.618 | 77.151 | | | | AllenNLP BiDAF (single model) | |
[]() | | 67.544 | 76.429 | | | | BIDAF-COMPOUND-DSS (single model) | |
[]() | | 67.502 | 76.786 | | | | Iterative Co-attention Network | |
[]() | | 66.516 | 76.349 | | | | BIDAF-INDEPENDENT-DSS (single model) | |
Dynamic Coattention Networks For Question Answering | ✓ Link | 66.233 | 75.896 | | | | Dynamic Coattention Networks (single model) | 2016-11-05 |
[]() | | 65.163 | 74.555 | | | | BIDAF-COMPOUND (single model) | |
[]() | | 64.932 | 74.594 | | | | BIDAF-INDEPENDENT (single model) | |
Machine Comprehension Using Match-LSTM and Answer Pointer | ✓ Link | 64.744 | 73.743 | | | | Match-LSTM with Bi-Ans-Ptr (Boundary) | 2016-08-29 |
[]() | | 64.439 | 73.921 | | | | Unnamed submission by ravioncodalab | |
Learning to Compute Word Embeddings On the Fly | | 64.083 | 73.056 | | | | OTF dict+spelling (single) | 2017-06-01 |
[]() | | 63.306 | 73.463 | | | | Attentive CNN context with LSTM | |
Learning to Compute Word Embeddings On the Fly | | 62.897 | 72.016 | | | | OTF spelling (single) | 2017-06-01 |
Learning to Compute Word Embeddings On the Fly | | 62.604 | 71.968 | | | | OTF spelling+lemma (single) | 2017-06-01 |
End-to-End Answer Chunk Extraction and Ranking for Reading Comprehension | | 62.499 | 70.956 | | | | Dynamic Chunk Reader | 2016-10-31 |
Words or Characters? Fine-grained Gating for Reading Comprehension | ✓ Link | 62.446 | 73.327 | | | | Fine-Grained Gating | 2016-11-06 |
Harvesting and Refining Question-Answer Pairs for Unsupervised QA | ✓ Link | 61.145 | 71.389 | | | | RQA+IDR (single model) | 2020-05-06 |
Machine Comprehension Using Match-LSTM and Answer Pointer | ✓ Link | 60.474 | 70.695 | | | | Match-LSTM with Ans-Ptr (Boundary) | 2016-08-29 |
[]() | | 59.058 | 69.436 | | | | Unnamed submission by Will_Wu | |
Harvesting and Refining Question-Answer Pairs for Unsupervised QA | ✓ Link | 55.827 | 65.467 | | | | RQA (single model) | 2020-05-06 |
Machine Comprehension Using Match-LSTM and Answer Pointer | ✓ Link | 54.505 | 67.748 | | | | Match-LSTM with Ans-Ptr (Sentence) | 2016-08-29 |
[]() | | 53.698 | 64.036 | | | | UQA (single model) | |
[]() | | 52.544 | 62.780 | | | | Unnamed submission by jinhyuklee | |
[]() | | 52.533 | 62.757 | | | | Unnamed submission by minjoon | |
[]() | | 47.341 | 56.436 | | | | UnsupervisedQA V1 (ensemble) | |
[]() | | 44.215 | 54.723 | | | | UnsupervisedQA V1 (single model) | |
[]() | | 12.273 | 13.211 | | | | QANet (single model) | |
LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention | ✓ Link | | 95.4 | | | | LUKE 483M | 2020-10-02 |
TextBox 2.0: A Text Generation Library with Pre-trained Language Models | ✓ Link | | 93.04 | | 86.44 | | BART (TextBox 2.0) | 2022-12-26 |
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | ✓ Link | | 91.8 | | | | BERT-LARGE (Single+TriviaQA) | 2018-10-11 |
A Large Batch Optimizer Reality Check: Traditional, Generic Optimizers Suffice Across Batch Sizes | | | 91.58 | | | | BERT-Large 32k batch size with AdamW | 2021-02-12 |
DyREx: Dynamic Query Representation for Extractive Question Answering | ✓ Link | | 91.01 | | | | DyREX | 2022-10-26 |
Adaptation of Deep Bidirectional Multilingual Transformers for Russian Language | ✓ Link | | 84.6 | | | | RuBERT | 2019-05-17 |
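The EM and F1 columns above are SQuAD-style metrics: Exact Match checks whether the normalized predicted span equals a normalized reference answer, while F1 measures token overlap between prediction and reference. A minimal sketch of this scoring, assuming the usual SQuAD normalization (lowercasing, stripping punctuation and articles, collapsing whitespace); the official evaluation script additionally takes the maximum over multiple reference answers:

```python
import re
import string
from collections import Counter

def normalize_answer(s: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace (SQuAD-style)."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(prediction: str, ground_truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(ground_truth))

def f1_score(prediction: str, ground_truth: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(ground_truth).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)  # multiset intersection
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, `exact_match("The Eiffel Tower", "eiffel tower")` is 1.0 because normalization removes the article and case, while `f1_score("eiffel tower in paris", "the eiffel tower")` is 2/3 (precision 0.5, recall 1.0). Leaderboard numbers are these per-question scores averaged over the dataset and reported as percentages.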