| Paper | Code | BLEU | Model Name | Release Date |
|---|---|---|---|---|
| Language Models are Few-Shot Learners | ✓ Link | 29.7 | GPT-3 175B (Few-Shot) | 2020-05-28 |
| MASS: Masked Sequence to Sequence Pre-training for Language Generation | ✓ Link | 28.3 | MASS (6-layer Transformer) | 2019-05-07 |
| An Effective Approach to Unsupervised Machine Translation | ✓ Link | 26.9 | SMT + NMT (tuning and joint refinement) | 2019-02-04 |
| Cross-lingual Language Model Pretraining | ✓ Link | 26.4 | MLM pretraining for encoder and decoder | 2019-01-22 |
| Unsupervised Neural Machine Translation with SMT as Posterior Regularization | ✓ Link | 21.7 | SMT as posterior regularization | 2019-01-14 |
| Phrase-Based & Neural Unsupervised Machine Translation | ✓ Link | 20.2 | PBSMT + NMT | 2018-04-20 |
| Unsupervised Neural Machine Translation Initialized by Unsupervised Statistical Machine Translation | | 20.0 | Synthetic bilingual data init | 2018-10-30 |