Paper | Code | Execution Accuracy | Accuracy | Model | Date |
--- | --- | --- | --- | --- | --- |
Teaching-Inspired Integrated Prompting Framework: A Novel Approach for Enhancing Reasoning in Large Language Models | ✓ Link | 93.9 | | GPT-4 (Teaching-Inspired) | 2024-10-10 |
Automatic Model Selection with Large Language Models for Reasoning | ✓ Link | 93.7 | | GPT-4 (Model Selection) | 2023-05-23 |
- | | 92.3 | | Qwen2 (CoT + Code Interpreter) | |
Progressive-Hint Prompting Improves Reasoning in Large Language Models | ✓ Link | 91.9 | | GPT-4 (PHP) | 2023-04-19 |
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset | ✓ Link | 87.8 | | OpenMath-CodeLlama-70B (w/ code) | 2024-02-15 |
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning | ✓ Link | 84.9 | | MathCoder-L-70B | 2023-10-05 |
Does ChatGPT Comprehend the Place Value in Numbers When Solving Math Word Problems? | ✓ Link | 83.70 | | PoT_Eng (self-consistency @ 5) | 2023-06-03 |
Does ChatGPT Comprehend the Place Value in Numbers When Solving Math Word Problems? | ✓ Link | 82.50 | | CoT_Eng (self-consistency @ 5) | 2023-06-03 |
An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning | ✓ Link | 80.6 | | MMOS-CODE-34B (0-shot) | 2024-02-23 |
An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning | ✓ Link | 79.3 | | MMOS-DeepSeekMath-7B (0-shot) | 2024-02-23 |
An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning | ✓ Link | 76.4 | | MMOS-CODE-7B (0-shot) | 2024-02-23 |
Llama 2: Open Foundation and Fine-Tuned Chat Models | ✓ Link | 69.2 | | LLaMA 2-Chat | 2023-07-18 |
Math Word Problem Solving by Generating Linguistic Variants of Problem Statements | ✓ Link | 63.5 | 63.5 | DeBERTa | 2023-06-24 |
Large Language Models are Zero-Shot Reasoners | ✓ Link | 62.1 | | PaLM (zero-shot, CoT) | 2022-05-24 |
Large Language Models are Zero-Shot Reasoners | ✓ Link | 58.8 | | PaLM (zero-shot) | 2022-05-24 |
Frugal LMs Trained to Invoke Symbolic Solvers Achieve Parameter-Efficient Arithmetic Reasoning | ✓ Link | 56.65 | | SYRELM (Vicuna 13B) | 2023-12-09 |
ATHENA: Mathematical Reasoning with Thought Expansion | ✓ Link | 54.8 | | ATHENA (roberta-large) | 2023-11-02 |
Learning Multi-Step Reasoning by Solving Arithmetic Tasks | ✓ Link | 48.9 | | MsAT-DeductReasoner | 2023-06-02 |
Learning to Reason Deductively: Math Word Problem Solving as Complex Relation Extraction | ✓ Link | 47.3 | | Roberta-DeductReasoner | 2022-03-19 |
ATHENA: Mathematical Reasoning with Thought Expansion | ✓ Link | 45.6 | | ATHENA (roberta-base) | 2023-11-02 |
Are NLP Models really able to Solve Simple Math Word Problems? | ✓ Link | 43.8 | 43.8 | Graph2Tree with RoBERTa | 2021-03-12 |
Are NLP Models really able to Solve Simple Math Word Problems? | ✓ Link | 41.0 | 41.0 | GTS with RoBERTa | 2021-03-12 |
Are NLP Models really able to Solve Simple Math Word Problems? | ✓ Link | 40.3 | 40.3 | LSTM Seq2Seq with RoBERTa | 2021-03-12 |
Frugal LMs Trained to Invoke Symbolic Solvers Achieve Parameter-Efficient Arithmetic Reasoning | ✓ Link | 40.1 | | SYRELM (GPT-J) | 2023-12-09 |
Are NLP Models really able to Solve Simple Math Word Problems? | ✓ Link | 38.9 | 38.9 | Transformer with RoBERTa | 2021-03-12 |
Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Better Solvers for Math Word Problems | ✓ Link | | 94.2 | GPT-4 DUP | 2024-04-23 |
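Several entries above report CoT or PoT prompting with "self-consistency @ 5", i.e. sampling five reasoning chains and taking a majority vote over the final numeric answers, and the listed scores are answer accuracy over the benchmark expressed as a percentage. A minimal sketch of that scoring procedure, assuming a hypothetical `sample_answer` function that returns one sampled numeric answer per call (it is not an API from any of the papers above):

```python
from collections import Counter

def self_consistency_answer(sample_answer, problem, k=5):
    """Majority vote over k sampled answers (self-consistency @ k).

    `sample_answer(problem)` is a hypothetical callable returning one numeric
    answer per call, e.g. by sampling a CoT/PoT completion and extracting
    (or executing) its final result.
    """
    votes = Counter(sample_answer(problem) for _ in range(k))
    answer, _ = votes.most_common(1)[0]
    return answer

def accuracy(predictions, references, tol=1e-4):
    """Percentage of problems whose predicted value matches the reference."""
    correct = sum(abs(p - r) <= tol for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)
```

Vote ties simply fall back to `Counter.most_common` ordering; the papers themselves may break ties differently.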