OpenCodePapers

Logical Reasoning on LINGOLY

Leaderboard
All entries are reported in the paper "LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages" (code available) and share the release date 2024-06-10.

| Model | Delta_NoContext | Exact Match Accuracy | Release Date |
|---|---|---|---|
| Claude Opus | 28.8% | 46.3% | 2024-06-10 |
| GPT-4o | 25.1% | 37.6% | 2024-06-10 |
| Gemini 1.5 Pro | 23.4% | 32.1% | 2024-06-10 |
| GPT-4 | 21.5% | 33.4% | 2024-06-10 |
| Command R+ | 11.6% | 21.5% | 2024-06-10 |
| GPT-3.5 | 11.2% | 21.2% | 2024-06-10 |
| Mixtral 8x7B | 6.4% | 14.2% | 2024-06-10 |
| Llama 3 8B | 4.9% | 11.4% | 2024-06-10 |
| Llama 3 70B | 2.9% | 10.3% | 2024-06-10 |
| Gemma 7B | 2.2% | 4.9% | 2024-06-10 |
| Llama 2 70B | 1.1% | 6.4% | 2024-06-10 |
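
The page does not define the two metric columns, so the following is only a minimal sketch of how they might be computed. It assumes that Exact Match Accuracy is the fraction of answers that equal the reference string exactly (after trimming whitespace) and that Delta_NoContext is the exact match score with the puzzle context minus the score without it. The function names and the toy answers are hypothetical and are not taken from the benchmark's code or data.

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference after trimming whitespace."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


def delta_no_context(with_context, without_context, references):
    """Assumed definition: exact match with context minus exact match without it."""
    return (exact_match_accuracy(with_context, references)
            - exact_match_accuracy(without_context, references))


# Hypothetical toy answers, not benchmark data.
references = ["a", "b", "c", "d"]
with_context = ["a", "b", "x", "d"]      # 3 of 4 correct
without_context = ["a", "y", "x", "z"]   # 1 of 4 correct

print(exact_match_accuracy(with_context, references))              # 0.75
print(delta_no_context(with_context, without_context, references))  # 0.5
```

Under this reading, a large Delta_NoContext would suggest that a model's correct answers depend on the puzzle context rather than on prior knowledge of the language.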