OpenCodePapers

Sentence Ordering on EconLogicQA

Sentence Ordering
Leaderboard
| Paper | Code | Accuracy | Model Name | Release Date |
| --- | --- | --- | --- | --- |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | ✓ Link | 0.5692 | GPT-4-Turbo | 2024-05-13 |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | ✓ Link | 0.5538 | GPT-4 | 2024-05-13 |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | ✓ Link | 0.3769 | GPT-3.5-Turbo | 2024-05-13 |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | ✓ Link | 0.3462 | Llama-3-8B-Instruct | 2024-05-13 |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | ✓ Link | 0.3154 | Mistral-7B-Instruct-v0.2 | 2024-05-13 |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | ✓ Link | 0.2615 | Mistral-7B-v0.1 | 2024-05-13 |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | ✓ Link | 0.2615 | Mistral-7B-v0.2 | 2024-05-13 |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | ✓ Link | 0.2385 | Llama-3-8B | 2024-05-13 |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | ✓ Link | 0.2308 | Zephyr-7B-Alpha | 2024-05-13 |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | ✓ Link | 0.2077 | Yi-6B-Chat | 2024-05-13 |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | ✓ Link | 0.1769 | Zephyr-7B-Beta | 2024-05-13 |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | ✓ Link | 0.1538 | Mistral-7B-Instruct-v0.1 | 2024-05-13 |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | ✓ Link | 0.1462 | Llama-2-13B-Chat | 2024-05-13 |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | ✓ Link | 0.0923 | Llama-2-7B-Chat | 2024-05-13 |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | ✓ Link | 0.0846 | Gemma-2B-IT | 2024-05-13 |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | ✓ Link | 0.0385 | Yi-6B | 2024-05-13 |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | ✓ Link | 0.0231 | Gemma-7B-IT | 2024-05-13 |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | ✓ Link | 0.0077 | Llama-2-7B | 2024-05-13 |