Code generation results on the APPS benchmark. All metric values are pass rates in percent, reported per difficulty split (Introductory, Interview, Competition); blank cells mean the metric was not reported for that entry. The pass@k estimator used for these metrics is sketched after the table.

| Paper | Code | Introductory Pass@1 | Interview Pass@1 | Competition Pass@1 | Competition Pass@1000 | Interview Pass@1000 | Introductory Pass@1000 | Competition Pass@5 | Interview Pass@5 | Introductory Pass@5 | Competition Pass@any | Interview Pass@any | Introductory Pass@any | Overall Pass@1 | Model | Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Planning-Driven Programming: A Large Language Model Programming Workflow | ✓ | 87.2 | 65.2 | 34.8 | | | | | | | | | | 62.6 | LPW (GPT-4o) | 2024-11-21 |
| MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tasks | ✓ | 68.44 | 44.49 | 27.84 | | | | | | | | | | | MoTCoder-32B-V1.5 | 2023-12-26 |
| MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tasks | ✓ | 54.26 | 32.63 | 21.18 | | | | | | | | | | | MoTCoder-7B-V1.5 | 2023-12-26 |
| CodeT: Code Generation with Generated Tests | ✓ | 47.3 | 14.3 | 6.2 | | | | | | | | | | | code-davinci-002 175B (CodeT) | 2022-07-21 |
| DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence | ✓ | 33.80 | 19.70 | 11.09 | | | | | | | | | | | deepseek-ai/deepseek-coder-6.7b-instruct | 2024-01-25 |
| CodeT: Code Generation with Generated Tests | ✓ | 31.92 | | | | | | | | | | | | | code-davinci-002 175B | 2022-07-21 |
| CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules | ✓ | 29.3 | 6.4 | 2.5 | 14.5 | 25.4 | 60.9 | | | | | | | | CodeChain+WizardCoder-15b | 2023-10-13 |
| CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules | ✓ | 26.29 | 7.49 | 3.75 | | | | | | | | | | | WizardCoder-15b | 2023-10-13 |
| CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging | ✓ | 26.04 | 4.21 | 0.81 | | | | | | | | | | | CodeSim (GPT4) | 2025-02-08 |
| CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning | ✓ | 20 | 13.5 | 33.3 | | | | | | | | | | | CodeRL+CodeT5 | 2022-07-05 |
| CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning | ✓ | 6.77 | 1.80 | 0.69 | 15.70 | 14.33 | 38.10 | 2.36 | 4.48 | 15.27 | 15.70 | 14.33 | 38.10 | | GPT-J 6B (Finetuned) | 2022-07-05 |
| Evaluating Large Language Models Trained on Code | ✓ | 5.60 | 1.00 | 0.50 | 13.51 | 13.15 | 35.20 | 1.00 | 1.73 | 9.20 | 13.51 | 13.15 | 35.20 | | Codex 12B (Raw) | 2021-07-07 |
| CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning | ✓ | 4.14 | 0.14 | 0.02 | 3.32 | 3.70 | 25.02 | 0.09 | 0.51 | 9.65 | 3.23 | 3.70 | 25.02 | | GPT-Neo 2.7B (Finetuned) | 2022-07-05 |
| Measuring Coding Challenge Competence With APPS | ✓ | 3.90 | 0.57 | 0.00 | 11.40 | 9.83 | 27.90 | 0.00 | 0.80 | 5.50 | 11.40 | 9.83 | 27.90 | | GPT-Neo 2.7B | 2021-05-20 |
| CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning | ✓ | 3.90 | 0.57 | 0.00 | 0.0 | 0.80 | 5.50 | 0.00 | 0.80 | 5.50 | | | | | GPT2 1.5B (Finetuned) | 2022-07-05 |
| MapCoder: Multi-Agent Code Generation for Competitive Problem Solving | ✓ | 1.30 | 0.70 | 0.00 | 8.80 | 9.27 | 25.00 | 0.00 | 1.03 | 3.60 | 8.80 | 9.27 | 25.00 | | MapCoder APPS-150-cherrypicked (GPT-4) | 2024-05-18 |
| Competition-Level Code Generation with AlphaCode | ✓ | | | | 22.0 | | | | | | | | | | AlphaCode 1B Filtered from 50000 | 2022-02-08 |
| Competition-Level Code Generation with AlphaCode | ✓ | | | | 7.75 | 9.66 | 20.36 | 7.75 | 9.66 | 20.36 | | | | | AlphaCode 1B | 2022-02-08 |
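The pass@k metrics above are typically computed with the unbiased estimator introduced in "Evaluating Large Language Models Trained on Code" (the Codex entry in the table): generate n ≥ k samples per problem, count the c samples that pass every unit test, and average 1 - C(n-c, k)/C(n, k) over problems. Below is a minimal sketch of that estimator; the per-problem correct counts in the usage example are hypothetical and not taken from any row of the table.

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).

    n: samples generated for one problem
    c: samples among them that pass all unit tests
    k: evaluation budget
    Returns the probability that at least one of k samples drawn
    without replacement from the n generations is correct.
    """
    if n - c < k:
        # Fewer than k incorrect samples: every size-k draw contains a correct one.
        return 1.0
    # 1 - C(n-c, k) / C(n, k), evaluated as a numerically stable product.
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Hypothetical usage: 200 generations per problem, pass@5 averaged over four problems.
correct_counts = [3, 0, 17, 1]
n_samples, k = 200, 5
scores = [pass_at_k(n_samples, c, k) for c in correct_counts]
print(f"pass@{k} = {100 * np.mean(scores):.2f}%")
```

The benchmark-level number reported in each column is the mean of this per-problem estimate over the problems in the corresponding APPS difficulty split.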