| Paper | Code | BLEU | Model Name | Release Date |
|---|---|---|---|---|
| Incorporating BERT into Neural Machine Translation | ✓ Link | 38.27 | BERT-fused NMT | 2020-02-17 |
| MASS: Masked Sequence to Sequence Pre-training for Language Generation | ✓ Link | 37.5 | MASS (6-layer Transformer) | 2019-05-07 |
| An Effective Approach to Unsupervised Machine Translation | ✓ Link | 36.2 | SMT + NMT (tuning and joint refinement) | 2019-02-04 |
| Cross-lingual Language Model Pretraining | ✓ Link | 33.4 | MLM pretraining for encoder and decoder | 2019-01-22 |
| Language Models are Few-Shot Learners | ✓ Link | 32.6 | GPT-3 175B (Few-Shot) | 2020-05-28 |
| Unsupervised Neural Machine Translation with SMT as Posterior Regularization | ✓ Link | 29.5 | SMT as posterior regularization | 2019-01-14 |
| Phrase-Based & Neural Unsupervised Machine Translation | ✓ Link | 27.6 | PBSMT + NMT | 2018-04-20 |
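All entries above are ranked by BLEU, which scores a candidate translation by clipped n-gram precision against a reference, combined as a geometric mean and scaled by a brevity penalty. As a minimal sketch of the metric (a simplified sentence-level variant with a single reference, not the exact corpus-level tokenization any of these papers used):

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU: clipped n-gram precisions (n = 1..max_n),
    geometric mean, multiplied by a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        c_counts = ngram_counts(cand, n)
        r_counts = ngram_counts(ref, n)
        # Clip each candidate n-gram count by its count in the reference
        overlap = sum(min(cnt, r_counts[g]) for g, cnt in c_counts.items())
        total = max(sum(c_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # any zero precision collapses the geometric mean
    # Brevity penalty: penalize candidates shorter than the reference
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

Reported scores (e.g. the 38.27 of BERT-fused NMT) are corpus-level BLEU computed over a whole test set, typically with smoothing and standardized tokenization, so this sketch illustrates the formula rather than reproduces the published numbers.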