Paper | Code | Pass@1 (%) | Model | Release date |
---|---|---|---|---|
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging | ✓ Link | 100 | MGDebugger (DeepSeek-R1) | 2024-10-02 |
Debug like a Human: A Large Language Model Debugger via Verifying Runtime Execution Step-by-step | ✓ Link | 99.4 | LLMDebugger (OpenAI o1) | 2024-02-25 |
QualityFlow: An Agentic Workflow for Program Synthesis Controlled by LLM Quality Checks | | 98.8 | QualityFlow (Sonnet-3.5) | 2025-01-20 |
Planning-Driven Programming: A Large Language Model Programming Workflow | ✓ Link | 98.2 | LPW (GPT-4o) | 2024-11-21 |
Execution Guided Line-by-Line Code Generation | ✓ Link | 96.95 | EG-CFG (DeepSeek-V3-0324) | 2025-06-12 |
MapCoder: Multi-Agent Code Generation for Competitive Problem Solving | ✓ Link | 93.9 | MapCoder (GPT-4) | 2024-05-18 |
N/A | | 90.85 | Claude Sonnet 3.5 | |
L2MAC: Large Language Model Automatic Computer for Extensive Code Generation | ✓ Link | 90.2 | L2MAC (GPT-4) | 2023-10-02 |