OpenCodePapers

Question Answering on HotpotQA

Question Answering
Leaderboard
| Paper | Code | JOINT-F1 | ANS-EM | ANS-F1 | SUP-EM | SUP-F1 | JOINT-EM | Model Name | Release Date |
|---|---|---|---|---|---|---|---|---|---|
| End-to-End Beam Retrieval for Multi-Hop Question Answering | ✓ | 0.775 | 0.727 | 0.850 | 0.663 | 0.901 | 0.505 | Beam Retrieval | 2023-08-17 |
| Big Bird: Transformers for Longer Sequences | ✓ | 0.736 | | | | | | BigBird-etc | 2020-07-28 |
| Adaptive Information Seeking for Open-Domain Question Answering | ✓ | 0.720 | 0.675 | 0.805 | 0.612 | 0.860 | 0.449 | AISO | 2021-09-14 |
| Chain-of-Skills: A Configurable Model for Open-domain Question Answering | ✓ | 0.717 | 0.674 | 0.801 | 0.613 | 0.853 | 0.457 | Chain-of-Skills | 2023-05-04 |
| | | 0.708 | 0.670 | 0.795 | 0.594 | 0.843 | 0.444 | TPRR | |
| HopRetriever: Retrieve Hops over Wikipedia to Answer Complex Questions | | 0.706 | 0.671 | 0.799 | 0.574 | 0.835 | 0.432 | HopRetriever + Sp-search | 2020-12-31 |
| | | 0.700 | 0.662 | 0.793 | 0.573 | 0.840 | 0.420 | EBS-Large | |
| | | 0.698 | 0.671 | 0.799 | 0.572 | 0.826 | 0.431 | HopRetriever | |
| Answering Open-Domain Questions of Varying Reasoning Steps from Text | ✓ | 0.696 | 0.663 | 0.791 | 0.569 | 0.832 | 0.428 | IRRR+ | 2020-10-23 |
| | | 0.689 | 0.655 | 0.786 | 0.559 | 0.831 | 0.409 | EBS-SH | |
| Answering Open-Domain Questions of Varying Reasoning Steps from Text | ✓ | 0.686 | 0.657 | 0.782 | 0.559 | 0.821 | 0.421 | IRRR | 2020-10-23 |
| | | 0.678 | 0.648 | 0.778 | 0.561 | 0.818 | 0.410 | HopRetriever-V2 | |
| | | 0.670 | 0.646 | 0.778 | 0.557 | 0.812 | 0.411 | AFSGraph-retriever | |
| Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval | ✓ | 0.666 | 0.623 | 0.753 | 0.575 | 0.809 | 0.418 | Recursive Dense Retriever | 2020-09-27 |
| | | 0.662 | 0.630 | 0.754 | 0.546 | 0.800 | 0.404 | Step-by-Step Retriever | |
| Answering Any-hop Open-domain Questions with Iterative Document Reranking | | 0.639 | 0.625 | 0.759 | 0.510 | 0.789 | 0.360 | DDRQA | 2020-09-16 |
| | | 0.639 | 0.608 | 0.739 | 0.531 | 0.793 | 0.380 | HopRetriever-V1 | |
| | | 0.630 | 0.620 | 0.753 | 0.499 | 0.778 | 0.354 | DR model large | |
| | | 0.629 | 0.617 | 0.746 | 0.500 | 0.772 | 0.368 | Model name | |
| | | 0.629 | 0.617 | 0.746 | 0.500 | 0.772 | 0.368 | HopAns | |
| | | 0.629 | 0.604 | 0.732 | 0.520 | 0.771 | 0.380 | Anonymous | |
| | | 0.624 | 0.615 | 0.746 | 0.503 | 0.772 | 0.362 | Multi-dimensional-AFSGraph | |
| | | 0.623 | 0.597 | 0.714 | 0.510 | 0.774 | 0.379 | HGN-albert + SemanticRetrievalMRS IR | |
| | | 0.617 | 0.603 | 0.731 | 0.499 | 0.768 | 0.359 | Tree-shaped-cluster | |
| | | 0.617 | 0.601 | 0.730 | 0.500 | 0.769 | 0.359 | AFSgraph | |
| Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering | ✓ | 0.612 | 0.600 | 0.730 | 0.491 | 0.764 | 0.354 | Robustly Fine-tuned Graph-based Recurrent Retriever | 2019-11-24 |
| | | 0.609 | 0.601 | 0.730 | 0.485 | 0.759 | 0.350 | AFSgraph model | |
| | | 0.607 | 0.579 | 0.699 | 0.510 | 0.768 | 0.372 | HGN-large + SemanticRetrievalMRS IR | |
| | | 0.602 | 0.598 | 0.727 | 0.480 | 0.749 | 0.345 | RoBERTa-DenseRetriever-Fast | |
| | | 0.602 | 0.598 | 0.727 | 0.480 | 0.749 | 0.345 | DPR-recurrent | |
| | | 0.601 | 0.596 | 0.724 | 0.479 | 0.748 | 0.345 | RoBERTa-DenseRetriever | |
| Hierarchical Graph Network for Multi-hop Question Answering | ✓ | 0.599 | 0.567 | 0.692 | 0.500 | 0.764 | 0.356 | HGN + SemanticRetrievalMRS IR | 2019-11-09 |
| Dynamically Fused Graph Network for Multi-hop Reasoning | ✓ | 0.5982 | | | | | | DFGN | 2019-05-16 |
| HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering | ✓ | 0.598 | 0.589 | 0.716 | 0.480 | 0.757 | 0.345 | SAFSR model | 2018-09-25 |
| | | 0.569 | 0.582 | 0.709 | 0.429 | 0.713 | 0.310 | GraphRR-Fast | |
| | | 0.568 | 0.588 | 0.717 | 0.416 | 0.725 | 0.293 | DR model | |
| A Simple Yet Strong Pipeline for HotpotQA | | 0.562 | 0.555 | 0.675 | 0.456 | 0.730 | 0.329 | Quark + SemanticRetrievalMRS IR | 2020-04-14 |
| | | 0.561 | 0.523 | 0.648 | 0.490 | 0.747 | 0.330 | GAR-BERT | |
| | | 0.553 | 0.560 | 0.689 | 0.441 | 0.730 | 0.292 | Graph-based Recurrent Retriever | |
| | | 0.548 | 0.529 | 0.648 | 0.428 | 0.720 | 0.312 | MIR+EPS+BERT | |
| | | 0.530 | 0.482 | 0.613 | 0.483 | 0.739 | 0.306 | GAR | |
| Transformer-XH: Multi-Evidence Reasoning with eXtra Hop Attention | ✓ | 0.513 | 0.516 | 0.641 | 0.409 | 0.714 | 0.261 | Transformer-XH-final | 2020-05-01 |
| | | 0.496 | 0.490 | 0.608 | 0.417 | 0.700 | 0.271 | Transformer-XH | |
| Revealing the Importance of Semantic Retrieval for Machine Reading at Scale | ✓ | 0.476 | 0.453 | 0.573 | 0.387 | 0.708 | 0.251 | SemanticRetrievalMRS | 2019-09-17 |
| | | 0.429 | 0.421 | 0.517 | 0.371 | 0.598 | 0.247 | DrKIT | |
| | | 0.392 | 0.418 | 0.531 | 0.263 | 0.573 | 0.170 | Entity-centric BERT Pipeline | |
| | | 0.391 | 0.433 | 0.538 | 0.219 | 0.596 | 0.145 | PR-Bert | |
| Answering Complex Open-domain Questions Through Iterative Query Generation | ✓ | 0.391 | 0.379 | 0.486 | 0.307 | 0.642 | 0.180 | GoldEn Retriever | 2019-10-15 |
| | | 0.370 | 0.394 | 0.514 | 0.242 | 0.585 | 0.133 | SAFSr-Bert | |
| Cognitive Graph for Multi-Hop Reading Comprehension at Scale | ✓ | 0.349 | 0.371 | 0.489 | 0.228 | 0.577 | 0.124 | Cognitive Graph QA | 2019-05-14 |
| | | 0.334 | 0.475 | 0.606 | 0.076 | 0.448 | 0.049 | GAR-NOSF | |
| | | 0.304 | 0.358 | 0.453 | 0.160 | 0.512 | 0.115 | IKFGraph | |
| | | 0.291 | 0.369 | 0.460 | 0.153 | 0.468 | 0.115 | AnonymousQ | |
| | | 0.284 | 0.335 | 0.427 | 0.156 | 0.493 | 0.110 | HGN Model-reproduce | |
| Multi-Hop Paragraph Retrieval for Open-Domain Question Answering | ✓ | 0.270 | 0.306 | 0.403 | 0.167 | 0.473 | 0.109 | MUPPET | 2019-06-15 |
| | | 0.258 | 0.299 | 0.391 | 0.132 | 0.497 | 0.083 | GRN + BERT | |
| | | 0.255 | 0.354 | 0.463 | 0.001 | 0.432 | 0.000 | Entity-centric IR | |
| Multi-Paragraph Reasoning with Knowledge-enhanced Graph Neural Network | | 0.247 | 0.277 | 0.372 | 0.127 | 0.472 | 0.070 | KGNN | 2019-11-06 |
| | | 0.245 | 0.284 | 0.386 | 0.147 | 0.472 | 0.086 | SAQA | |
| | | 0.236 | 0.273 | 0.365 | 0.122 | 0.488 | 0.074 | GRN | |
| Answering while Summarizing: Multi-task Learning for Multi-hop QA with Evidence Extraction | | 0.231 | 0.287 | 0.381 | 0.142 | 0.444 | 0.087 | QFE | 2019-05-21 |
| | | 0.209 | 0.289 | 0.391 | 0.080 | 0.406 | 0.041 | SAFSr_model | |
| | | 0.175 | 0.236 | 0.320 | 0.056 | 0.400 | 0.033 | SuppBERT | |
| HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering | ✓ | 0.162 | 0.240 | 0.329 | 0.039 | 0.377 | 0.019 | Baseline Model | 2018-09-25 |
| | | 0.011 | 0.074 | 0.121 | 0.000 | 0.078 | 0.000 | tes | |
| | | 0.000 | 0.581 | 0.711 | 0.000 | 0.000 | 0.000 | PromptRank-fewshot-2-demo | |
| | | 0.000 | 0.581 | 0.710 | 0.000 | 0.000 | 0.000 | graph-recurrent-retriever+roberta-base w. S/R-pretraining | |
| | | 0.000 | 0.360 | 0.474 | 0.000 | 0.000 | 0.000 | TPReasoner w/o BERT | |
| | | 0.000 | 0.307 | 0.402 | 0.000 | 0.000 | 0.000 | MultiQA | |
| Multi-hop Reading Comprehension through Question Decomposition and Rescoring | ✓ | 0.000 | 0.300 | 0.407 | 0.000 | 0.000 | 0.000 | DecompRC | 2019-06-07 |
| | | 0.000 | 0.300 | 0.407 | 0.000 | 0.000 | 0.000 | | |
| | | 0.000 | 0.080 | 0.221 | 0.000 | 0.000 | 0.000 | Mistral multi hop with very large sources | |

Note: the BigBird-etc entry also lists the values 0.755 and 0.891, but the source does not show which metric columns they belong to.
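The JOINT-EM and JOINT-F1 columns combine the answer (ANS) and supporting-fact (SUP) scores for each example. The sketch below shows one way to compute them, assuming the convention used by the official HotpotQA evaluation script: joint precision and recall are the products of the answer and supporting-fact precision and recall, and joint EM is the product of the two EM values. The function names `f1` and `joint_scores` and the sample inputs are illustrative and do not come from this page.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def joint_scores(ans_em, ans_prec, ans_recall, sup_em, sup_prec, sup_recall):
    """Per-example joint metrics under the assumed product convention."""
    joint_prec = ans_prec * sup_prec
    joint_recall = ans_recall * sup_recall
    return {
        "joint_em": ans_em * sup_em,
        "joint_f1": f1(joint_prec, joint_recall),
    }


# Hypothetical example: a perfect answer with no correct supporting facts.
no_support = joint_scores(1.0, 1.0, 1.0, 0.0, 0.0, 0.0)

# Hypothetical example: a perfect answer with half-precise supporting facts.
partial_support = joint_scores(1.0, 1.0, 1.0, 0.0, 0.5, 1.0)
```

Under this convention, a submission with no correct supporting facts scores zero on both joint metrics regardless of answer quality, which is consistent with the rows at the bottom of the leaderboard that show nonzero ANS-EM and ANS-F1 alongside 0.000 joint scores. The leaderboard values are averages over examples, so a JOINT column is generally not the product of the aggregate ANS and SUP columns.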