OpenCodePapers

Question Answering on Bamboogle

Question Answering
Leaderboard
| Paper | Code | Accuracy | Model Name | Release Date |
|---|---|---|---|---|
| ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent | | 76.1 | ReST meets ReAct (PaLM 2-L + Google Search) | 2023-12-15 |
| Answering Questions by Meta-Reasoning over Multiple Chains of Thought | ✓ Link | 66.5 | MCR (code-davinci-002) + Google Search | 2023-04-25 |
| Making Retrieval-Augmented Language Models Robust to Irrelevant Context | ✓ Link | 62.7 | RALM (LLaMA2-13B + Google Search) | 2023-10-02 |
| Measuring and Narrowing the Compositionality Gap in Language Models | ✓ Link | 60.0 | Self-ask (GPT-3; davinci-002) + Google Search | 2022-10-07 |
| Measuring and Narrowing the Compositionality Gap in Language Models | ✓ Link | 57.6 | Self-ask (GPT-3; davinci-002) | 2022-10-07 |
| Measuring and Narrowing the Compositionality Gap in Language Models | ✓ Link | 46.4 | Chain-of-Thought (GPT-3; davinci-002) | 2022-10-07 |
| FireAct: Toward Language Agent Fine-tuning | | 44.0 | FireAct | 2023-10-09 |
| Measuring and Narrowing the Compositionality Gap in Language Models | ✓ Link | 17.6 | Direct Prompting (GPT-3; davinci-002) | 2022-10-07 |
| Measuring and Narrowing the Compositionality Gap in Language Models | ✓ Link | 0 | Google Search | 2022-10-07 |