# OpenCodePapers

## Language Modelling on Hutter Prize
## Leaderboard
| Paper | Code | Bits per Character (BPC) | Params | Model | Release Date |
|---|:-:|---|---|---|---|
| Dynamic Evaluation of Transformer Language Models | ✓ | 0.94 | 277M | Transformer-XL + RMS dynamic eval | 2019-04-17 |
| Compressive Transformers for Long-Range Sequence Modelling | ✓ | 0.97 | | Compressive Transformer | 2019-11-13 |
| Mogrifier LSTM | ✓ | 0.988 | 96M | Mogrifier LSTM + dynamic eval | 2019-09-04 |
| Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context | ✓ | 0.99 | 277M | 24-layer Transformer-XL | 2019-01-09 |
| Longformer: The Long-Document Transformer | ✓ | 0.99 | 102M | Longformer Large | 2020-04-10 |
| Longformer: The Long-Document Transformer | ✓ | 1.00 | 41M | Longformer Small | 2020-04-10 |
| Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context | ✓ | 1.03 | 88M | 18-layer Transformer-XL | 2019-01-09 |
| Character-Level Language Modeling with Deeper Self-Attention | ✓ | 1.06 | 235M | 64-layer Character Transformer Model | 2018-08-09 |
| Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context | ✓ | 1.06 | 41M | 12-layer Transformer-XL | 2019-01-09 |
| Dynamic Evaluation of Neural Sequence Models | ✓ | 1.08 | 46M | mLSTM + dynamic eval | 2017-09-21 |
| Character-Level Language Modeling with Deeper Self-Attention | ✓ | 1.11 | 44M | 12-layer Character Transformer Model | 2018-08-09 |
| Mogrifier LSTM | ✓ | 1.122 | 96M | Mogrifier LSTM | 2019-09-04 |
| An Analysis of Neural Language Modeling at Multiple Scales | ✓ | 1.232 | 47M | 3-layer AWD-LSTM | 2018-03-22 |
| Multiplicative LSTM for sequence modelling | ✓ | 1.24 | 46M | Large mLSTM +emb +WN +VD | 2016-09-26 |
| Fast-Slow Recurrent Neural Networks | ✓ | 1.245 | 47M | Large FS-LSTM-4 | 2017-05-24 |
| Recurrent Highway Networks | ✓ | 1.27 | 46M | Large RHN | 2016-07-12 |
| Fast-Slow Recurrent Neural Networks | ✓ | 1.277 | 27M | FS-LSTM-4 | 2017-05-24 |
| Recurrent Highway Networks | ✓ | 1.31 | | RHN - depth 5 | 2016-07-12 |
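The BPC metric above is the model's average negative log-likelihood per character of the test text, expressed in bits (base-2) rather than nats. A minimal sketch of the conversion (the function name is illustrative, not from any of the listed papers):

```python
import math

def bits_per_character(char_log_probs):
    """Average negative log-likelihood per character, in bits.

    char_log_probs: one natural-log probability per character of the
    evaluation text (e.g. model outputs on the Hutter Prize enwik8 data).
    """
    nats = -sum(char_log_probs) / len(char_log_probs)
    return nats / math.log(2)  # nats -> bits

# Sanity check: a uniform model over 256 byte values scores
# log2(256) = 8 bits per character.
uniform = [math.log(1 / 256)] * 1000
print(bits_per_character(uniform))
```

Lower is better: a model at 1.0 BPC compresses enwik8 to one eighth of its raw 8-bits-per-byte size, which is why this leaderboard doubles as a compression benchmark.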