| Paper | Code | Accuracy | Model Name | Release Date |
|---|---|---|---|---|
| | | 0.252 | o3 | |
| FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI | | 0.02 | Gemini 1.5 Pro (002) | 2024-11-07 |
| | | 0.01 | Claude 3.5 Sonnet | |
| | | 0.01 | o1-preview | |
| | | 0.01 | o1-mini | |
| | | 0.01 | GPT-4o | |