OpenCodePapers

Mathematical Reasoning on Lila (OOD)

Mathematical Reasoning

Leaderboard

Paper	Code	Accuracy	Model	Release Date
Lila: A Unified Benchmark for Mathematical Reasoning	✓ Link	0.586	Codex (Few-Shot, 175B)	2022-10-31
Lila: A Unified Benchmark for Mathematical Reasoning	✓ Link	0.448	Bhāskara-P (Fine-tuned, 2.7B)	2022-10-31
Lila: A Unified Benchmark for Mathematical Reasoning	✓ Link	0.384	GPT-3 (Few-Shot, 175B)	2022-10-31
Lila: A Unified Benchmark for Mathematical Reasoning	✓ Link	0.268	Bhāskara-A (Fine-tuned, 2.7B)	2022-10-31
Lila: A Unified Benchmark for Mathematical Reasoning	✓ Link	0.238	Neo-P (Fine-tuned, 2.7B)	2022-10-31
Lila: A Unified Benchmark for Mathematical Reasoning	✓ Link	0.177	Neo-A (Fine-tuned, 2.7B)	2022-10-31