OpenCodePapers

mathematical-reasoning-on-lila-iid

Mathematical Reasoning
Leaderboard
Paper                                                 Code  Accuracy  Model Name                     Release Date
Lila: A Unified Benchmark for Mathematical Reasoning  ✓     0.604     Codex (Few-Shot, 175B)         2022-10-31
Lila: A Unified Benchmark for Mathematical Reasoning  ✓     0.48      Bhāskara-P (Fine-tuned, 2.7B)  2022-10-31
Lila: A Unified Benchmark for Mathematical Reasoning  ✓     0.394     Neo-P (Fine-tuned, 2.7B)       2022-10-31
Lila: A Unified Benchmark for Mathematical Reasoning  ✓     0.384     GPT-3 (Few-Shot, 175B)         2022-10-31
Lila: A Unified Benchmark for Mathematical Reasoning  ✓     0.252     Bhāskara-A (Fine-tuned, 2.7B)  2022-10-31
Lila: A Unified Benchmark for Mathematical Reasoning  ✓     0.204     Neo-A (Fine-tuned, 2.7B)       2022-10-31