| | 86.3 | Qwen2.5-72B | |
| | 86.1 | Jiutian large model | |
| | 85.9 | Llama-3-405B | |
| | 84.07 | Jiutian-57B | |
| | 82.4 | Qwen2-72B | |
| | 81.0 | Llama-3-70B | |
Scaling Instruction-Finetuned Language Models | ✓ Link | 78.4 | Flan-PaLM 540B (3-shot, fine-tuned, CoT + SC) | 2022-10-20 |
Scaling Instruction-Finetuned Language Models | ✓ Link | 78.2 | PaLM 540B (CoT + self-consistency) | 2022-10-20 |
Evaluating Large Language Models Trained on Code | ✓ Link | 73.5 | code-davinci-002 175B (CoT) | 2021-07-07 |
Scaling Instruction-Finetuned Language Models | ✓ Link | 72.4 | Flan-PaLM 540B (3-shot, fine-tuned, CoT) | 2022-10-20 |
Scaling Instruction-Finetuned Language Models | ✓ Link | 71.2 | PaLM 540B (CoT) | 2022-10-20 |
Scaling Instruction-Finetuned Language Models | ✓ Link | 70.0 | Flan-PaLM 540B (5-shot, fine-tuned) | 2022-10-20 |
Scaling Instruction-Finetuned Language Models | ✓ Link | 62.7 | PaLM 540B | 2022-10-20 |
Orca 2: Teaching Small Language Models How to Reason | | 50.18 | Orca 2-13B | 2023-11-18 |
Orca 2: Teaching Small Language Models How to Reason | | 45.93 | Orca 2-7B | 2023-11-18 |