OpenCodePapers

on-gpqa

Leaderboard
| Paper | Code | Accuracy (%) | Model Name | Release Date |
| --- | --- | --- | --- | --- |
| - | - | 76.01 | NVIDIA Llama Nemotron Ultra v1 | - |
| - | - | 72.3 | Openai-o1-preview | - |
| - | - | 65 | Claude3.5-Sonnet | - |
| Search-o1: Agentic Search-Enhanced Large Reasoning Models | ✓ | 63.6 | Search-o1 | 2025-01-09 |
| TextGrad: Automatic "Differentiation" via Text | ✓ | 55 | GPT4o+TextGrad | 2024-06-11 |
| TextGrad: Automatic "Differentiation" via Text | ✓ | 53.6 | GPT4o | 2024-06-11 |
| Qwen2.5 Technical Report | ✓ | 49 | Qwen2.5-72B-Instruct | 2024-12-19 |