OpenCodePapers

Math Word Problem Solving on SVAMP

Mathematical Reasoning > Math Word Problem Solving
Leaderboard
| Paper | Code | Execution Accuracy | Accuracy | Model Name | Release Date |
| --- | --- | --- | --- | --- | --- |
| Teaching-Inspired Integrated Prompting Framework: A Novel Approach for Enhancing Reasoning in Large Language Models | ✓ | 93.9 | | GPT-4 (Teaching-Inspired) | 2024-10-10 |
| Automatic Model Selection with Large Language Models for Reasoning | ✓ | 93.7 | | GPT-4 (Model Selection) | 2023-05-23 |
| | | 92.3 | | Qwen2 (CoT + Code Interpreter) | |
| Progressive-Hint Prompting Improves Reasoning in Large Language Models | ✓ | 91.9 | | GPT-4 (PHP) | 2023-04-19 |
| OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset | ✓ | 87.8 | | OpenMath-CodeLlama-70B (w/ code) | 2024-02-15 |
| MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning | ✓ | 84.9 | | MathCoder-L-70B | 2023-10-05 |
| Does ChatGPT Comprehend the Place Value in Numbers When Solving Math Word Problems? | ✓ | 83.70 | | PoT_Eng (self-consistency @ 5) | 2023-06-03 |
| Does ChatGPT Comprehend the Place Value in Numbers When Solving Math Word Problems? | ✓ | 82.50 | | CoT_Eng (self-consistency @ 5) | 2023-06-03 |
| An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning | ✓ | 80.6 | | MMOS-CODE-34B (0-shot) | 2024-02-23 |
| An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning | ✓ | 79.3 | | MMOS-DeepSeekMath-7B (0-shot) | 2024-02-23 |
| An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning | ✓ | 76.4 | | MMOS-CODE-7B (0-shot) | 2024-02-23 |
| Llama 2: Open Foundation and Fine-Tuned Chat Models | ✓ | 69.2 | | LLaMA 2-Chat | 2023-07-18 |
| Math Word Problem Solving by Generating Linguistic Variants of Problem Statements | ✓ | 63.5 | 63.5 | DeBERTa | 2023-06-24 |
| Large Language Models are Zero-Shot Reasoners | ✓ | 62.1 | | PaLM (zero-shot, CoT) | 2022-05-24 |
| Large Language Models are Zero-Shot Reasoners | ✓ | 58.8 | | PaLM (zero-shot) | 2022-05-24 |
| Frugal LMs Trained to Invoke Symbolic Solvers Achieve Parameter-Efficient Arithmetic Reasoning | ✓ | 56.65 | | SYRELM (Vicuna 13B) | 2023-12-09 |
| ATHENA: Mathematical Reasoning with Thought Expansion | ✓ | 54.8 | | ATHENA (roberta-large) | 2023-11-02 |
| Learning Multi-Step Reasoning by Solving Arithmetic Tasks | ✓ | 48.9 | | MsAT-DeductReasoner | 2023-06-02 |
| Learning to Reason Deductively: Math Word Problem Solving as Complex Relation Extraction | ✓ | 47.3 | | Roberta-DeductReasoner | 2022-03-19 |
| ATHENA: Mathematical Reasoning with Thought Expansion | ✓ | 45.6 | | ATHENA (roberta-base) | 2023-11-02 |
| Are NLP Models really able to Solve Simple Math Word Problems? | ✓ | 43.8 | 43.8 | Graph2Tree with RoBERTa | 2021-03-12 |
| Are NLP Models really able to Solve Simple Math Word Problems? | ✓ | 41.0 | 41.0 | GTS with RoBERTa | 2021-03-12 |
| Are NLP Models really able to Solve Simple Math Word Problems? | ✓ | 40.3 | 40.3 | LSTM Seq2Seq with RoBERTa | 2021-03-12 |
| Frugal LMs Trained to Invoke Symbolic Solvers Achieve Parameter-Efficient Arithmetic Reasoning | ✓ | 40.1 | | SYRELM (GPT-J) | 2023-12-09 |
| Are NLP Models really able to Solve Simple Math Word Problems? | ✓ | 38.9 | 38.9 | Transformer with RoBERTa | 2021-03-12 |
| Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Better Solvers for Math Word Problems | ✓ | | 94.2 | GPT-4 DUP | 2024-04-23 |