OpenCodePapers

Question Answering on NewsQA

Question Answering

Leaderboard

Paper	Code	EM	F1	Model Name	Release Date
o3-mini vs DeepSeek-R1: Which One is Safer?	✓ Link	92.52	93.13	OpenAI/o3-2025-01-31-high	2025-01-30
DeepSense: A Unified Deep Learning Framework for Time-Series Mobile Sensing Data Processing	✓ Link	92.14	94.01	Riple/Saanvi-v0.5-DeepAnalysis	2016-11-07
Thinking Like Transformers	✓ Link	88.24	91.31	OpenAI/o4-mini-2025-05-01-high	2021-06-13
0/1 Deep Neural Networks via Block Coordinate Descent		81.44	88.72	OpenAI/o1-2024-12-17-high	2022-06-19
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning	✓ Link	80.57	86.13	deepseek-r1	2025-01-22
Claude 3.5 Sonnet Model Card Addendum		74.23	82.3	Anthropic/claude-3-7-sonnet	2024-06-24
Time-series Transformer Generative Adversarial Networks	✓ Link	72.61	85.44	Riple/Saanvi-v0.1	2022-05-23
XAI for Transformers: Better Explanations through Conservative Propagation	✓ Link	70.57	88.24	xAI/grok-3-1212	2022-02-15
GPT-4o as the Gold Standard: A Scalable and General Purpose Approach to Filter Language Model Pretraining Data		70.21	81.74	OpenAI/GPT-4o	2024-10-03
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context	✓ Link	68.75	79.91	Google/Gemini 2.5 Pro	2024-03-08
Learning to Generate Questions by Learning to Recover Answer-containing Sentences		54.7	64.5	BERT+ASGen
Densely Connected Attention Propagation for Reading Comprehension	✓ Link	53.1	66.3	DecaProp	2018-11-10
Efficient and Robust Question Answering from Minimal Context over Documents	✓ Link	50.1	63.2	MINIMAL(Dyn)	2018-05-21
A Question-Focused Multi-Factor Attention Network for Question Answering	✓ Link	48.4	63.7	AMANDA	2018-01-25
Making Neural QA as Simple as Possible but not Simpler	✓ Link	43.7	56.1	FastQAExt	2017-03-14
SpanBERT: Improving Pre-training by Representing and Predicting Spans	✓ Link		73.6	SpanBERT	2019-07-24
LinkBERT: Pretraining Language Models with Document Links	✓ Link		72.6	LinkBERT (large)	2022-03-29
DyREx: Dynamic Query Representation for Extractive Question Answering	✓ Link		68.53	DyREX	2022-10-26