OpenCodePapers

Language Modelling on C4

Language Modelling
Leaderboard
| Paper | Code | Perplexity | TPUv3 Hours | Steps | Model Name | Release Date |
|---|---|---|---|---|---|---|
| Primer: Searching for Efficient Transformers for Language Modeling | ✓ | 12.35 | 17.3K | 1M | Primer | 2021-09-17 |
| LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale | ✓ | 12.45 | - | - | Zeropoint LLM.int8 13B (vector-wise + decomp) | 2022-08-15 |
| Primer: Searching for Efficient Transformers for Language Modeling | ✓ | 12.69 | 16.5K | 1M | T5++ | 2021-09-17 |
| Primer: Searching for Efficient Transformers for Language Modeling | ✓ | 13.25 | 15.7K | 1M | Original T5 | 2021-09-17 |
| LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale | ✓ | 13.3 | - | - | LLM.float32 6.7B | 2022-08-15 |
| LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale | ✓ | 14.43 | - | - | LLM.float32 2.7B | 2022-08-15 |
| N-Grammer: Augmenting Transformers with latent n-grams | ✓ | 14.79 | - | - | N-Grammer 343M | 2022-07-13 |
| N-Grammer: Augmenting Transformers with latent n-grams | ✓ | 15.01 | - | - | N-Grammer 288M | 2022-07-13 |
| LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale | ✓ | 15.91 | - | - | LLM.float32 1.3B | 2022-08-15 |