automated-theorem-proving-on-minif2f-test

Mathematical ProofsAutomated Theorem Proving

Results over time

Click legend items to toggle metrics. Hover points for model names.

Leaderboard

Paper	Code	cumulative	Pass@1	Pass@32	Pass@64	Pass@100	ITP	pass@1024	pass@8192	ModelName	ReleaseDate
Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning	✓ Link	80.74	52.94	68.85			Lean	77.87	80.74	Kimina-Prover-Preview	2025-04-15
Efficient Neural Theorem Proving via Fine-grained Proof Structure Analysis	✓ Link	66.0	36.5			52.5	Isabelle			ProofAug	2025-01-30
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search	✓ Link	63.5		50.0	50.7		Lean			DeepSeek-Prover-V1.5	2024-08-15
SubgoalXL: Subgoal-based Expert Learning for Theorem Proving	✓ Link	56.1		39.3			Isabelle			Subgoal-XL	2024-08-20
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data		52.0	30.0		46.3		Lean			DeepSeek-Prover	2024-05-23
Lyra: Orchestrating Dual Correction in Automated Theorem Proving	✓ Link	47.1				47.1	Isabelle			Lyra + GPT-4	2023-09-27
LEGO-Prover: Neural Theorem Proving with Growing Libraries	✓ Link	47.1				47.1	Isabelle			LEGO-Prover ChatGPT	2023-10-01
Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving	✓ Link	45.5				45.5	Isabelle			Decomposing the Enigma	2023-05-25
HyperTree Proof Search for Neural Theorem Proving		41			41		Lean			Evariste	2022-05-23
HyperTree Proof Search for Neural Theorem Proving		40.6			40.6		Lean			Evariste-7d	2022-05-23
HyperTree Proof Search for Neural Theorem Proving		38.9			38.9		Lean			Evariste-1d	2022-05-23
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs	✓ Link	38.9				38.9	Isabelle			DSP (540B Minerva informal)	2022-10-21
Formal Mathematics Statement Curriculum Learning	✓ Link	36.6	29.6	34.5	36.6		Lean			Lean Expert Iteration	2022-02-03
HyperTree Proof Search for Neural Theorem Proving		36.6			36.6		Metamath			GPT-f	2022-05-23
[]()		35.2	35.2				Isabelle			Thor + expert iteration on autoformalised theorems
An In-Context Learning Agent for Formal Theorem-Proving	✓ Link	30.7	30.7				Lean			COPRA + GPT-4-turbo	2023-10-06
Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers		29.9	29.9				Isabelle			Thor	2022-05-22
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics	✓ Link	29.2	24.6	29.2			Lean			Lean GPT-f	2021-08-31
An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning	✓ Link	28.3	28.3				Lean			MMOS-DeepSeekMath-7B	2024-02-23
[]()		26.5		26.5			Lean			ReProver
Llemma: An Open Language Model For Mathematics	✓ Link	26.2		26.2			Lean			LLEMMA-7b	2023-10-16
Llemma: An Open Language Model For Mathematics	✓ Link	25.8		25.8			Lean			LLEMMA-34b	2023-10-16
Proof Artifact Co-training for Theorem Proving with Language Models	✓ Link	24.6	24.6				Isabelle			PACT (reproduced by Thor)	2021-02-11
An In-Context Learning Agent for Formal Theorem-Proving	✓ Link	23.3	23.3				Lean			COPRA + GPT-4	2023-10-06
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs	✓ Link	20.9	20.9				Isabelle			Sledgehammer + heuristics	2022-10-21
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics	✓ Link	18	18				Lean			Lean tidy	2021-08-31
An In-Context Learning Agent for Formal Theorem-Proving	✓ Link	11.9	11.9				Lean			COPRA + GPT-3.5	2023-10-06
Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers		10.4	10.4				Isabelle			Sledgehammer	2022-05-22
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics	✓ Link	1.6	1.3				Metamath			Metamath GPT-f	2021-08-31

OpenCodePapers

automated-theorem-proving-on-minif2f-test