OpenCodePapers

Machine Translation on WMT 2014 English-French

Machine Translation
[Chart: Results over time, plotting each model's metric scores against its release date]
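For context on what these systems are scored against, the sketch below shows one common way to load the WMT 2014 English-French data (the benchmark's test split is newstest2014). It assumes the Hugging Face `datasets` package, which is not part of this page; treat it as an illustration rather than an official loader.

```python
# Minimal sketch (assumption: Hugging Face `datasets` is installed):
# loading the WMT 2014 English-French pairs used by this benchmark.
from datasets import load_dataset

# WMT14 configs are named by the sorted language pair, hence "fr-en".
wmt14 = load_dataset("wmt14", "fr-en")
test = wmt14["test"]  # newstest2014, roughly 3,000 sentence pairs

pair = test[0]["translation"]
print(pair["en"])  # English source sentence
print(pair["fr"])  # French reference translation
```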
Leaderboard
| Paper | Code | BLEU score | SacreBLEU | Hardware Burden | Operations per network pass | Model Name | Release Date |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Very Deep Transformers for Neural Machine Translation | ✓ | 46.4 | 44.4 | | | Transformer+BT (ADMIN init) | 2020-08-18 |
| Understanding Back-Translation at Scale | ✓ | 45.6 | 43.8 | 180G | | Noisy back-translation | 2018-08-28 |
| Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information | ✓ | 44.3 | 41.7 | | | mRASP+Fine-Tune | 2020-10-07 |
| R-Drop: Regularized Dropout for Neural Networks | ✓ | 43.95 | | | | Transformer + R-Drop | 2021-06-28 |
| Very Deep Transformers for Neural Machine Translation | ✓ | 43.8 | 41.8 | | | Transformer (ADMIN init) | 2020-08-18 |
| Understanding the Difficulty of Training Transformers | ✓ | 43.8 | | | | Admin | 2020-04-17 |
| Incorporating BERT into Neural Machine Translation | ✓ | 43.78 | | | | BERT-fused NMT | 2020-02-17 |
| MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning | ✓ | 43.5 | | | | MUSE (Parallel Multi-scale Attention) | 2019-11-17 |
| Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | ✓ | 43.4 | | | | T5 | 2019-10-23 |
| Joint Source-Target Self Attention with Locality Constraints | ✓ | 43.3 | | | | Local Joint Self-attention | 2019-05-16 |
| Depth Growing for Neural Machine Translation | ✓ | 43.27 | | 24G | | Depth Growing | 2019-07-03 |
| Scaling Neural Machine Translation | ✓ | 43.2 | | 55G | | Transformer Big | 2018-06-01 |
| Pay Less Attention with Lightweight and Dynamic Convolutions | ✓ | 43.2 | | | | DynamicConv | 2019-01-29 |
| Time-aware Large Kernel Convolutions | ✓ | 43.2 | | | | TaLK Convolutions | 2020-02-08 |
| Pay Less Attention with Lightweight and Dynamic Convolutions | ✓ | 43.1 | | | | LightConv | 2019-01-29 |
| Learning to Encode Position for Transformer with Continuous Dynamical Model | ✓ | 42.7 | | | | FLOATER-large | 2020-03-13 |
| OmniNet: Omnidirectional Representations from Transformers | ✓ | 42.6 | | | | OmniNetP | 2021-03-01 |
| Fast and Simple Mixture of Softmaxes with BPE and Hybrid-LightRNN for Language Generation | ✓ | 42.1 | | | | Transformer Big + MoS | 2018-09-25 |
| Finetuning Pretrained Transformers into RNNs | ✓ | 42.1 | | | | T2R + Pretrain | 2021-03-24 |
| Synthesizer: Rethinking Self-Attention in Transformer Models | ✓ | 41.85 | | | | Synthesizer (Random + Vanilla) | 2020-05-02 |
| HAT: Hardware-Aware Transformers for Efficient Natural Language Processing | ✓ | 41.8 | | | | Hardware Aware Transformer | 2020-05-28 |
| Self-Attention with Relative Position Representations | ✓ | 41.5 | | | | Transformer (big) + Relative Position Representations | 2018-03-06 |
| Deliberation Networks: Sequence Generation Beyond One-Pass Decoding | | 41.5 | | | | Stack 4-layer RNNSearch + Dual Learning + Deliberation Network | 2017-12-01 |
| Weighted Transformer Network for Machine Translation | ✓ | 41.4 | | | | Weighted Transformer (large) | 2017-11-06 |
| Convolutional Sequence to Sequence Learning | ✓ | 41.3 | | | | ConvS2S (ensemble) | 2017-05-08 |
| The Evolved Transformer | ✓ | 41.3 | | | | Evolved Transformer Big | 2019-01-30 |
| The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation | ✓ | 41.0 | | 132G | 2.81G | RNMT+ | 2018-04-26 |
| Attention Is All You Need | ✓ | 41.0 | | 23G | 2300000000.0G | Transformer Big | 2017-06-12 |
| The Evolved Transformer | ✓ | 40.6 | | | | Evolved Transformer Base | 2019-01-30 |
| ResMLP: Feedforward networks for image classification with data-efficient training | ✓ | 40.6 | | | | ResMLP-12 | 2021-05-07 |
| Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer | ✓ | 40.56 | | 142G | | MoE | 2017-01-23 |
| Memory-Efficient Adaptive Optimization | ✓ | 40.5 | | | | Transformer | 2019-01-30 |
| Convolutional Sequence to Sequence Learning | ✓ | 40.46 | | 143G | | ConvS2S | 2017-05-08 |
| ResMLP: Feedforward networks for image classification with data-efficient training | ✓ | 40.3 | | | | ResMLP-6 | 2021-05-07 |
| AutoDropout: Learning Dropout Patterns to Regularize Deep Networks | ✓ | 40 | | | | Transformer Base + AutoDropout | 2021-01-05 |
| Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation | ✓ | 39.92 | | 79G | | GNMT+RL | 2016-09-26 |
| Lite Transformer with Long-Short Range Attention | ✓ | 39.6 | | | | Lite Transformer | 2020-04-24 |
| Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation | ✓ | 39.2 | | 119G | | Deep-Att + PosUnk | 2016-06-14 |
| Random Feature Attention | | 39.2 | | | | Rfa-Gate-arccos | 2021-03-03 |
| Attention Is All You Need | ✓ | 38.1 | | 23G | 330000000.0G | Transformer Base | 2017-06-12 |
| Addressing the Rare Word Problem in Neural Machine Translation | ✓ | 37.5 | | | | LSTM6 + PosUnk | 2014-10-30 |
| | | 37 | | | | PBMT | |
| Sequence to Sequence Learning with Neural Networks | ✓ | 36.5 | | | | SMT+LSTM5 | 2014-09-10 |
| Neural Machine Translation by Jointly Learning to Align and Translate | ✓ | 36.2 | | | | RNN-search50* | 2014-09-01 |
| Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation | ✓ | 35.9 | | | | Deep-Att | 2016-06-14 |
| A Convolutional Encoder Model for Neural Machine Translation | ✓ | 35.7 | | | | Deep Convolutional Encoder; single-layer decoder | 2016-11-07 |
| Sequence to Sequence Learning with Neural Networks | ✓ | 34.8 | | | | LSTM | 2014-09-10 |
| Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation | ✓ | 34.54 | | | | CSLM + RNN + WP | 2014-06-03 |
| Finetuned Language Models Are Zero-Shot Learners | ✓ | 33.9 | | | | FLAN 137B (zero-shot) | 2021-09-03 |
| Finetuned Language Models Are Zero-Shot Learners | ✓ | 33.8 | | | | FLAN 137B (few-shot, k=9) | 2021-09-03 |
| Recurrent Neural Network Regularization | ✓ | 29.03 | | | | Regularized LSTM | 2014-09-08 |
| Phrase-Based & Neural Unsupervised Machine Translation | ✓ | 28.11 | | | | Unsupervised PBSMT | 2018-04-20 |
| Phrase-Based & Neural Unsupervised Machine Translation | ✓ | 27.6 | | | | PBSMT + NMT | 2018-04-20 |
| Can Active Memory Replace Attention? | ✓ | 26.4 | | | | GRU+Attention | 2016-10-27 |
| Unsupervised Statistical Machine Translation | ✓ | 26.22 | | | | SMT + iterative backtranslation (unsupervised) | 2018-09-04 |
| Phrase-Based & Neural Unsupervised Machine Translation | ✓ | 25.14 | | | | Unsupervised NMT + Transformer | 2018-04-20 |
| Unsupervised Neural Machine Translation | ✓ | 14.36 | | | | Unsupervised attentional encoder-decoder + BPE | 2017-10-30 |
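A note on the two metric columns: the BLEU score column generally reflects each paper's own (typically tokenized) evaluation, while the SacreBLEU column uses the standardized detokenized scorer, which is why the two values differ where both are reported. Below is a minimal sketch of computing a SacreBLEU score for a system output on newstest2014; it assumes the `sacrebleu` package, and the file names `hyps.fr` / `refs.fr` are placeholders rather than files provided by this page.

```python
# Minimal sketch (assumptions: `sacrebleu` is installed; "hyps.fr" and
# "refs.fr" hold detokenized system outputs and references, one sentence
# per line, in the same order).
import sacrebleu

with open("hyps.fr", encoding="utf-8") as f:
    hypotheses = [line.rstrip("\n") for line in f]
with open("refs.fr", encoding="utf-8") as f:
    references = [line.rstrip("\n") for line in f]

# corpus_bleu takes the hypotheses plus a list of reference streams
# (one stream per reference set; newstest2014 has a single reference).
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"SacreBLEU: {bleu.score:.2f}")
```

The test set bundled with sacreBLEU can also be used directly from the command line, e.g. `cat hyps.fr | sacrebleu -t wmt14/full -l en-fr`.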