| Paper | Code | BLEU | SacreBLEU | Model | Date |
| --- | --- | --- | --- | --- | --- |
| Multi-Agent Dual Learning | | 40.68 | | MADL | 2019-05-01 |
| Edinburgh Neural Machine Translation Systems for WMT 16 | ✓ Link | 34.2 | | Attentional encoder-decoder + BPE | 2016-06-09 |
| Linguistic Input Features Improve Neural Machine Translation | ✓ Link | 28.4 | | Linguistic Input Features | 2016-06-09 |
| DeLighT: Deep and Light-weight Transformer | ✓ Link | 28.0 | | DeLighT | 2020-08-03 |
| Finetuned Language Models Are Zero-Shot Learners | ✓ Link | 27.0 | | FLAN 137B (zero-shot) | 2021-09-03 |
| On the adequacy of untuned warmup for adaptive optimization | ✓ Link | 26.7 | | Transformer | 2019-10-09 |
| Finetuned Language Models Are Zero-Shot Learners | ✓ Link | 26.1 | | FLAN 137B (few-shot, k=11) | 2021-09-03 |
| Exploiting Semantics in Neural Machine Translation with Graph Convolutional Networks | | 24.9 | | BiRNN + GCN (Syn + Sem) | 2018-04-23 |
| Unsupervised Statistical Machine Translation | ✓ Link | 18.23 | | SMT + iterative backtranslation (unsupervised) | 2018-09-04 |
| Unsupervised Neural Machine Translation with Weight Sharing | ✓ Link | 10.86 | | Unsupervised NMT + weight-sharing | 2018-04-24 |
| Unsupervised Machine Translation Using Monolingual Corpora Only | ✓ Link | 9.64 | | Unsupervised S2S with attention | 2017-10-31 |
| Exploiting Monolingual Data at Scale for Neural Machine Translation | | | 40.9 | Exploiting Mono at Scale (single) | 2019-11-01 |
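Every entry above is ranked by BLEU, the standard corpus-level machine-translation metric: a geometric mean of clipped n-gram precisions (usually up to 4-grams) multiplied by a brevity penalty. As a rough illustration of how such scores are produced, here is a minimal single-reference sketch in pure Python; it assumes whitespace tokenization and omits the smoothing and tokenization details that tools like `sacrebleu` standardize, so its numbers are not directly comparable to the table.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100): geometric mean of clipped n-gram
    precisions for n = 1..max_n, times a brevity penalty.
    Single reference per hypothesis; no smoothing (illustrative only)."""
    match = [0] * max_n   # clipped n-gram matches per order
    total = [0] * max_n   # candidate n-gram counts per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            total[n - 1] += max(len(h) - n + 1, 0)
            # clipping: an n-gram counts at most as often as it occurs in the reference
            match[n - 1] += sum((ngrams(h, n) & ngrams(r, n)).values())
    if min(match) == 0:
        return 0.0  # any zero precision collapses the geometric mean
    log_prec = sum(math.log(m / t) for m, t in zip(match, total)) / max_n
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * bp * math.exp(log_prec)
```

For example, an exact match scores 100, and any divergence from the reference lowers the clipped precisions (or triggers the brevity penalty) and pulls the score down toward the ranges seen in the table.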