Paper | Code | Accuracy | Model Name | Release Date |
---|---|---|---|---|
Lila: A Unified Benchmark for Mathematical Reasoning | ✓ Link | 0.586 | Codex (Few-Shot, 175B) | 2022-10-31 |
Lila: A Unified Benchmark for Mathematical Reasoning | ✓ Link | 0.448 | Bhāskara-P (Fine-tuned, 2.7B) | 2022-10-31 |
Lila: A Unified Benchmark for Mathematical Reasoning | ✓ Link | 0.384 | GPT-3 (Few-Shot, 175B) | 2022-10-31 |
Lila: A Unified Benchmark for Mathematical Reasoning | ✓ Link | 0.268 | Bhāskara-A (Fine-tuned, 2.7B) | 2022-10-31 |
Lila: A Unified Benchmark for Mathematical Reasoning | ✓ Link | 0.238 | Neo-P (Fine-tuned, 2.7B) | 2022-10-31 |
Lila: A Unified Benchmark for Mathematical Reasoning | ✓ Link | 0.177 | Neo-A (Fine-tuned, 2.7B) | 2022-10-31 |