OpenCodePapers

automated-theorem-proving-on-minif2f-test

Mathematical ProofsAutomated Theorem Proving
Dataset Link
Results over time
Click legend items to toggle metrics. Hover points for model names.
Leaderboard
PaperCodecumulativePass@1Pass@32Pass@64Pass@100ITPpass@1024pass@8192ModelNameReleaseDate
Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning✓ Link80.7452.9468.85Lean77.8780.74Kimina-Prover-Preview2025-04-15
Efficient Neural Theorem Proving via Fine-grained Proof Structure Analysis✓ Link66.036.552.5IsabelleProofAug2025-01-30
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search✓ Link63.550.050.7LeanDeepSeek-Prover-V1.52024-08-15
SubgoalXL: Subgoal-based Expert Learning for Theorem Proving✓ Link56.139.3IsabelleSubgoal-XL2024-08-20
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data52.030.046.3LeanDeepSeek-Prover2024-05-23
Lyra: Orchestrating Dual Correction in Automated Theorem Proving✓ Link47.147.1IsabelleLyra + GPT-42023-09-27
LEGO-Prover: Neural Theorem Proving with Growing Libraries✓ Link47.147.1IsabelleLEGO-Prover ChatGPT2023-10-01
Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving✓ Link45.545.5IsabelleDecomposing the Enigma2023-05-25
HyperTree Proof Search for Neural Theorem Proving4141LeanEvariste2022-05-23
HyperTree Proof Search for Neural Theorem Proving40.640.6LeanEvariste-7d2022-05-23
HyperTree Proof Search for Neural Theorem Proving38.938.9LeanEvariste-1d2022-05-23
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs✓ Link38.938.9IsabelleDSP (540B Minerva informal)2022-10-21
Formal Mathematics Statement Curriculum Learning✓ Link36.629.634.536.6LeanLean Expert Iteration2022-02-03
HyperTree Proof Search for Neural Theorem Proving36.636.6MetamathGPT-f2022-05-23
[]()35.235.2IsabelleThor + expert iteration on autoformalised theorems
An In-Context Learning Agent for Formal Theorem-Proving✓ Link30.730.7LeanCOPRA + GPT-4-turbo2023-10-06
Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers29.929.9IsabelleThor2022-05-22
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics✓ Link29.224.629.2LeanLean GPT-f2021-08-31
An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning✓ Link28.328.3LeanMMOS-DeepSeekMath-7B2024-02-23
[]()26.526.5LeanReProver
Llemma: An Open Language Model For Mathematics✓ Link26.226.2LeanLLEMMA-7b2023-10-16
Llemma: An Open Language Model For Mathematics✓ Link25.825.8LeanLLEMMA-34b2023-10-16
Proof Artifact Co-training for Theorem Proving with Language Models✓ Link24.624.6IsabellePACT (reproduced by Thor)2021-02-11
An In-Context Learning Agent for Formal Theorem-Proving✓ Link23.323.3LeanCOPRA + GPT-42023-10-06
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs✓ Link20.920.9IsabelleSledgehammer + heuristics2022-10-21
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics✓ Link1818LeanLean tidy2021-08-31
An In-Context Learning Agent for Formal Theorem-Proving✓ Link11.911.9LeanCOPRA + GPT-3.52023-10-06
Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers10.410.4IsabelleSledgehammer2022-05-22
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics✓ Link1.61.3MetamathMetamath GPT-f2021-08-31