OpenCodePapers

Question Answering on DROP (test)

Question Answering
Leaderboard
| Paper | Code | F1 | Model Name | Release Date |
| --- | --- | --- | --- | --- |
| Question Directed Graph Attention Network for Numerical Reasoning over Text | | 88.38 | QDGAT (ensemble) | 2020-09-16 |
| Reasoning Like Program Executors | ✓ | 87.6 | POET | 2022-01-27 |
| PaLM 2 Technical Report | ✓ | 85.0 | PaLM 2 (few-shot) | 2023-05-17 |
| Giving BERT a Calculator: Finding Operations and Arguments with Reading Comprehension | | 81.78 | BERT+Calculator (ensemble) | 2019-08-31 |
| Neural Symbolic Reader: Scalable Integration of Distributed and Symbolic Representations for Reading Comprehension | | 81.71 | NeRd | 2020-05-01 |
| GPT-4 Technical Report | ✓ | 80.9 | GPT-4 (few-shot, k=3) | 2023-03-15 |
| A Simple and Effective Model for Answering Multi-span Questions | ✓ | 80.7 | TASE-BERT | 2019-09-29 |
| A Multi-Type Multi-Span Network for Reading Comprehension that Requires Discrete Reasoning | ✓ | 79.88 | MTMSN Large | 2019-08-15 |
| Injecting Numerical Reasoning Skills into Language Models | ✓ | 72.4 | GenBERT (+ND+TD) | 2020-04-09 |
| NumNet: Machine Reading Comprehension with Numerical Reasoning | ✓ | 67.97 | NumNet | 2019-10-15 |
| GPT-4 Technical Report | ✓ | 64.1 | GPT 3.5 (few-shot, k=3) | 2023-03-15 |
| Orca 2: Teaching Small Language Models How to Reason | | 60.26 | Orca 2-7B | 2023-11-18 |
| Orca 2: Teaching Small Language Models How to Reason | | 57.97 | Orca 2-13B | 2023-11-18 |
| DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs | ✓ | 47.01 | NAQA Net | 2019-03-01 |
| Language Models are Few-Shot Learners | ✓ | 36.5 | GPT-3 175B (few-shot, k=32) | 2020-05-28 |
| DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs | ✓ | 32.7 | BERT | 2019-03-01 |