OpenCodePapers

Mathematical Reasoning on Lila (IID)

Mathematical Reasoning

Leaderboard

Paper	Code	Accuracy	Model Name	Release Date
Lila: A Unified Benchmark for Mathematical Reasoning	✓ Link	0.604	Codex (Few-Shot, 175B)	2022-10-31
Lila: A Unified Benchmark for Mathematical Reasoning	✓ Link	0.48	Bhāskara-P (Fine-tuned, 2.7B)	2022-10-31
Lila: A Unified Benchmark for Mathematical Reasoning	✓ Link	0.394	Neo-P (Fine-tuned, 2.7B)	2022-10-31
Lila: A Unified Benchmark for Mathematical Reasoning	✓ Link	0.384	GPT-3 (Few-Shot, 175B)	2022-10-31
Lila: A Unified Benchmark for Mathematical Reasoning	✓ Link	0.252	Bhāskara-A (Fine-tuned, 2.7B)	2022-10-31
Lila: A Unified Benchmark for Mathematical Reasoning	✓ Link	0.204	Neo-A (Fine-tuned, 2.7B)	2022-10-31
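
The rows above can be ranked and compared programmatically. The following is a minimal sketch, assuming the model names and accuracy values are copied by hand from the table into a tab-separated string; the names `LEADERBOARD_TSV`, `load_rows`, and `gaps_to_best` are hypothetical helpers introduced for illustration and are not part of any published tooling for this leaderboard.

```python
import csv
import io

# Model name and accuracy, copied from the leaderboard table above.
# The "\t" escapes become real tab characters in the string.
LEADERBOARD_TSV = """\
Codex (Few-Shot, 175B)\t0.604
Bhāskara-P (Fine-tuned, 2.7B)\t0.48
Neo-P (Fine-tuned, 2.7B)\t0.394
GPT-3 (Few-Shot, 175B)\t0.384
Bhāskara-A (Fine-tuned, 2.7B)\t0.252
Neo-A (Fine-tuned, 2.7B)\t0.204
"""


def load_rows(text):
    """Parse tab-separated (model, accuracy) rows into a list of tuples."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [(row[0], float(row[1])) for row in reader if row]


def gaps_to_best(rows):
    """Sort rows by accuracy (descending) and attach the gap to the top score."""
    ranked = sorted(rows, key=lambda r: r[1], reverse=True)
    best = ranked[0][1]
    return [(name, acc, best - acc) for name, acc in ranked]


if __name__ == "__main__":
    for name, acc, gap in gaps_to_best(load_rows(LEADERBOARD_TSV)):
        print(f"{name:32s} accuracy={acc:.3f} gap_to_best={gap:.3f}")
```

Keeping the data in a plain tab-separated string mirrors the table layout, so new rows can be pasted in without changing the parsing code.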