OpenCodePapers
Question Answering on Bamboogle
Question Answering
[Figure: Results over time]
Leaderboard
| Paper | Code | Accuracy | Model Name | Release Date |
|---|---|---|---|---|
| ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent | | 76.1 | ReST meets ReAct (PaLM 2-L + Google Search) | 2023-12-15 |
| Answering Questions by Meta-Reasoning over Multiple Chains of Thought | ✓ | 66.5 | MCR (code-davinci-002) + Google Search | 2023-04-25 |
| Making Retrieval-Augmented Language Models Robust to Irrelevant Context | ✓ | 62.7 | RALM (LLaMA2-13B + Google Search) | 2023-10-02 |
| Measuring and Narrowing the Compositionality Gap in Language Models | ✓ | 60.0 | Self-ask (GPT-3; davinci-002) + Google Search | 2022-10-07 |
| Measuring and Narrowing the Compositionality Gap in Language Models | ✓ | 57.6 | Self-ask (GPT-3; davinci-002) | 2022-10-07 |
| Measuring and Narrowing the Compositionality Gap in Language Models | ✓ | 46.4 | Chain-of-Thought (GPT-3; davinci-002) | 2022-10-07 |
| FireAct: Toward Language Agent Fine-tuning | | 44.0 | FireAct | 2023-10-09 |
| Measuring and Narrowing the Compositionality Gap in Language Models | ✓ | 17.6 | Direct Prompting (GPT-3; davinci-002) | 2022-10-07 |
| Measuring and Narrowing the Compositionality Gap in Language Models | ✓ | 0 | Google Search | 2022-10-07 |