Language Models not just for Pre-training: Fast Online Neural Noisy Channel Modeling | ✓ Link | 40.3 | | fast-noisy-channel-modeling | 2020-11-13 |
Finetuned Language Models Are Zero-Shot Learners | ✓ Link | 38.1 | | FLAN 137B (few-shot, k=9) | 2021-09-03 |
Finetuned Language Models Are Zero-Shot Learners | ✓ Link | 37.3 | | FLAN 137B (zero-shot) | 2021-09-03 |
Cross-lingual Language Model Pretraining | ✓ Link | 35.3 | | MLM pretraining | 2019-01-22 |
GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators | ✓ Link | 33.5 | | GenTranslate | 2024-02-10 |
Edinburgh Neural Machine Translation Systems for WMT 16 | ✓ Link | 33.3 | | Attentional encoder-decoder + BPE | 2016-06-09 |
Levenshtein Transformer | ✓ Link | 33.26 | | Levenshtein Transformer (distillation) | 2019-05-27 |
Incorporating a Local Translation Mechanism into Non-autoregressive Translation | ✓ Link | 33.26 | | CMLM+LAT+4 iterations | 2020-11-12 |
Adaptively Sparse Transformers | ✓ Link | 33.1 | | Adaptively Sparse Transformer (1.5-entmax) | 2019-08-30 |
Alleviating the Inequality of Attention Heads for Neural Machine Translation | | 32.95 | | HeadMask (Impt-18) | 2020-09-21 |
FlowSeq: Non-Autoregressive Conditional Sequence Generation with Generative Flow | ✓ Link | 32.91 | | FlowSeq-large (NPD n = 30) | 2019-09-05 |
Adaptively Sparse Transformers | ✓ Link | 32.89 | | Adaptively Sparse Transformer (alpha-entmax) | 2019-08-30 |
Alleviating the Inequality of Attention Heads for Neural Machine Translation | | 32.85 | | HeadMask (Random-18) | 2020-09-21 |
FlowSeq: Non-Autoregressive Conditional Sequence Generation with Generative Flow | ✓ Link | 32.46 | | FlowSeq-large (NPD n = 15) | 2019-09-05 |
FlowSeq: Non-Autoregressive Conditional Sequence Generation with Generative Flow | ✓ Link | 32.03 | | FlowSeq-large (IWD n = 15) | 2019-09-05 |
Non-Autoregressive Neural Machine Translation | ✓ Link | 31.44 | | NAT +FT + NPD | 2017-11-07 |
Incorporating a Local Translation Mechanism into Non-autoregressive Translation | ✓ Link | 31.24 | | CMLM+LAT+1 iterations | 2020-11-12 |
FlowSeq: Non-Autoregressive Conditional Sequence Generation with Generative Flow | ✓ Link | 30.69 | | FlowSeq-large | 2019-09-05 |
Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement | ✓ Link | 30.30 | | Denoising autoencoders (non-autoregressive) | 2018-02-19 |
FlowSeq: Non-Autoregressive Conditional Sequence Generation with Generative Flow | ✓ Link | 30.16 | | FlowSeq-base | 2019-09-05 |
TextBox 2.0: A Text Generation Library with Pre-trained Language Models | ✓ Link | | 37.48 | BART (TextBox 2.0) | 2022-12-26 |