OpenCodePapers

Mathematical Reasoning on FrontierMath

Mathematical Reasoning
Leaderboard
| Paper | Code | Accuracy | Model Name | Release Date |
| --- | --- | --- | --- | --- |
| | | 0.252 | o3 | |
| FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI | | 0.02 | Gemini 1.5 Pro (002) | 2024-11-07 |
| | | 0.01 | Claude 3.5 Sonnet | |
| | | 0.01 | o1-preview | |
| | | 0.01 | o1-mini | |
| | | 0.01 | GPT-4o | |