OpenCodePapers
Language Modelling on C4
Leaderboard
| Paper | Code | Perplexity | TPUv3 Hours | Steps | Model Name | Release Date |
|---|---|---|---|---|---|---|
| Primer: Searching for Efficient Transformers for Language Modeling | ✓ | 12.35 | 17.3K | 1M | Primer | 2021-09-17 |
| LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale | ✓ | 12.45 | | | Zeropoint LLM.int8 13B (vector-wise + decomp) | 2022-08-15 |
| Primer: Searching for Efficient Transformers for Language Modeling | ✓ | 12.69 | 16.5K | 1M | T5++ | 2021-09-17 |
| Primer: Searching for Efficient Transformers for Language Modeling | ✓ | 13.25 | 15.7K | 1M | Original T5 | 2021-09-17 |
| LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale | ✓ | 13.3 | | | LLM.float32 6.7B | 2022-08-15 |
| LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale | ✓ | 14.43 | | | LLM.float32 2.7B | 2022-08-15 |
| N-Grammer: Augmenting Transformers with latent n-grams | ✓ | 14.79 | | | N-Grammer 343M | 2022-07-13 |
| N-Grammer: Augmenting Transformers with latent n-grams | ✓ | 15.01 | | | N-Grammer 288M | 2022-07-13 |
| LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale | ✓ | 15.91 | | | LLM.float32 1.3B | 2022-08-15 |