| Paper | Code | EM | Model Name | Release Date |
|---|---|---|---|---|
| Cluster-Former: Clustering-based Sparse Transformer for Long-Range Dependency Encoding | | 54 | Cluster-Former (#C=512) | 2020-09-13 |
| Reformer: The Efficient Transformer | ✓ Link | 53.2 | Locality-Sensitive Hashing | 2020-01-13 |
| Generating Long Sequences with Sparse Transformers | ✓ Link | 52.1 | Sparse Attention | 2019-04-23 |
| Multi-passage BERT: A Globally Normalized BERT Model for Open-domain Question Answering | | 51.1 | Multi-passage BERT | 2019-08-22 |
| Denoising Distantly Supervised Open-Domain Question Answering | ✓ Link | 42.2 | Denoising QA | 2018-07-01 |
| Densely Connected Attention Propagation for Reading Comprehension | ✓ Link | 38.6 | DECAPROP | 2018-11-10 |
| Reading Wikipedia to Answer Open-Domain Questions | ✓ Link | 37.7 | DrQA | 2017-03-31 |