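Paper | Code | Joint F1 | Ans EM | Ans F1 | Sup EM | Sup F1 | Joint EM | Model | Date |
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
End-to-End Beam Retrieval for Multi-Hop Question Answering | ✓ Link | 0.775 | 0.727 | 0.850 | 0.663 | 0.901 | 0.505 | Beam Retrieval | 2023-08-17 |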
Big Bird: Transformers for Longer Sequences | ✓ Link | 0.736 | | 0.755 | | 0.891 | | BigBird-etc | 2020-07-28 |
Adaptive Information Seeking for Open-Domain Question Answering | ✓ Link | 0.720 | 0.675 | 0.805 | 0.612 | 0.860 | 0.449 | AISO | 2021-09-14 |
Chain-of-Skills: A Configurable Model for Open-domain Question Answering | ✓ Link | 0.717 | 0.674 | 0.801 | 0.613 | 0.853 | 0.457 | Chain-of-Skills | 2023-05-04 |
[]() | | 0.708 | 0.670 | 0.795 | 0.594 | 0.843 | 0.444 | TPRR | |
HopRetriever: Retrieve Hops over Wikipedia to Answer Complex Questions | | 0.706 | 0.671 | 0.799 | 0.574 | 0.835 | 0.432 | HopRetriever + Sp-search | 2020-12-31 |
[]() | | 0.700 | 0.662 | 0.793 | 0.573 | 0.840 | 0.420 | EBS-Large | |
[]() | | 0.698 | 0.671 | 0.799 | 0.572 | 0.826 | 0.431 | HopRetriever | |
Answering Open-Domain Questions of Varying Reasoning Steps from Text | ✓ Link | 0.696 | 0.663 | 0.791 | 0.569 | 0.832 | 0.428 | IRRR+ | 2020-10-23 |
[]() | | 0.689 | 0.655 | 0.786 | 0.559 | 0.831 | 0.409 | EBS-SH | |
Answering Open-Domain Questions of Varying Reasoning Steps from Text | ✓ Link | 0.686 | 0.657 | 0.782 | 0.559 | 0.821 | 0.421 | IRRR | 2020-10-23 |
[]() | | 0.678 | 0.648 | 0.778 | 0.561 | 0.818 | 0.410 | HopRetriever-V2 | |
[]() | | 0.670 | 0.646 | 0.778 | 0.557 | 0.812 | 0.411 | AFSGraph-retriever | |
Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval | ✓ Link | 0.666 | 0.623 | 0.753 | 0.575 | 0.809 | 0.418 | Recursive Dense Retriever | 2020-09-27 |
[]() | | 0.662 | 0.630 | 0.754 | 0.546 | 0.800 | 0.404 | Step-by-Step Retriever | |
Answering Any-hop Open-domain Questions with Iterative Document Reranking | | 0.639 | 0.625 | 0.759 | 0.510 | 0.789 | 0.360 | DDRQA | 2020-09-16 |
[]() | | 0.639 | 0.608 | 0.739 | 0.531 | 0.793 | 0.380 | HopRetriever-V1 | |
[]() | | 0.630 | 0.620 | 0.753 | 0.499 | 0.778 | 0.354 | DR model large | |
[]() | | 0.629 | 0.617 | 0.746 | 0.500 | 0.772 | 0.368 | Model name | |
[]() | | 0.629 | 0.617 | 0.746 | 0.500 | 0.772 | 0.368 | HopAns | |
[]() | | 0.629 | 0.604 | 0.732 | 0.520 | 0.771 | 0.380 | Anonymous | |
[]() | | 0.624 | 0.615 | 0.746 | 0.503 | 0.772 | 0.362 | Multi-dimensional-AFSGraph | |
[]() | | 0.623 | 0.597 | 0.714 | 0.510 | 0.774 | 0.379 | HGN-albert + SemanticRetrievalMRS IR | |
[]() | | 0.617 | 0.603 | 0.731 | 0.499 | 0.768 | 0.359 | Tree-shaped-cluster | |
[]() | | 0.617 | 0.601 | 0.730 | 0.500 | 0.769 | 0.359 | AFSgraph | |
Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering | ✓ Link | 0.612 | 0.600 | 0.730 | 0.491 | 0.764 | 0.354 | Robustly Fine-tuned Graph-based Recurrent Retriever | 2019-11-24 |
[]() | | 0.609 | 0.601 | 0.730 | 0.485 | 0.759 | 0.350 | AFSgraph model | |
[]() | | 0.607 | 0.579 | 0.699 | 0.510 | 0.768 | 0.372 | HGN-large + SemanticRetrievalMRS IR | |
[]() | | 0.602 | 0.598 | 0.727 | 0.480 | 0.749 | 0.345 | RoBERTa-DenseRetriever-Fast | |
[]() | | 0.602 | 0.598 | 0.727 | 0.480 | 0.749 | 0.345 | DPR-recurrent | |
[]() | | 0.601 | 0.596 | 0.724 | 0.479 | 0.748 | 0.345 | RoBERTa-DenseRetriever | |
Hierarchical Graph Network for Multi-hop Question Answering | ✓ Link | 0.599 | 0.567 | 0.692 | 0.500 | 0.764 | 0.356 | HGN + SemanticRetrievalMRS IR | 2019-11-09 |
Dynamically Fused Graph Network for Multi-hop Reasoning | ✓ Link | 0.5982 | | | | | | DFGN | 2019-05-16 |
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering | ✓ Link | 0.598 | 0.589 | 0.716 | 0.480 | 0.757 | 0.345 | SAFSR model | 2018-09-25 |
[]() | | 0.569 | 0.582 | 0.709 | 0.429 | 0.713 | 0.310 | GraphRR-Fast | |
[]() | | 0.568 | 0.588 | 0.717 | 0.416 | 0.725 | 0.293 | DR model | |
A Simple Yet Strong Pipeline for HotpotQA | | 0.562 | 0.555 | 0.675 | 0.456 | 0.730 | 0.329 | Quark + SemanticRetrievalMRS IR | 2020-04-14 |
[]() | | 0.561 | 0.523 | 0.648 | 0.490 | 0.747 | 0.330 | GAR-BERT | |
[]() | | 0.553 | 0.560 | 0.689 | 0.441 | 0.730 | 0.292 | Graph-based Recurrent Retriever | |
[]() | | 0.548 | 0.529 | 0.648 | 0.428 | 0.720 | 0.312 | MIR+EPS+BERT | |
[]() | | 0.530 | 0.482 | 0.613 | 0.483 | 0.739 | 0.306 | GAR | |
Transformer-XH: Multi-Evidence Reasoning with eXtra Hop Attention | ✓ Link | 0.513 | 0.516 | 0.641 | 0.409 | 0.714 | 0.261 | Transformer-XH-final | 2020-05-01 |
[]() | | 0.496 | 0.490 | 0.608 | 0.417 | 0.700 | 0.271 | Transformer-XH | |
Revealing the Importance of Semantic Retrieval for Machine Reading at Scale | ✓ Link | 0.476 | 0.453 | 0.573 | 0.387 | 0.708 | 0.251 | SemanticRetrievalMRS | 2019-09-17 |
[]() | | 0.429 | 0.421 | 0.517 | 0.371 | 0.598 | 0.247 | DrKIT | |
[]() | | 0.392 | 0.418 | 0.531 | 0.263 | 0.573 | 0.170 | Entity-centric BERT Pipeline | |
[]() | | 0.391 | 0.433 | 0.538 | 0.219 | 0.596 | 0.145 | PR-Bert | |
Answering Complex Open-domain Questions Through Iterative Query Generation | ✓ Link | 0.391 | 0.379 | 0.486 | 0.307 | 0.642 | 0.180 | GoldEn Retriever | 2019-10-15 |
[]() | | 0.370 | 0.394 | 0.514 | 0.242 | 0.585 | 0.133 | SAFSr-Bert | |
Cognitive Graph for Multi-Hop Reading Comprehension at Scale | ✓ Link | 0.349 | 0.371 | 0.489 | 0.228 | 0.577 | 0.124 | Cognitive Graph QA | 2019-05-14 |
[]() | | 0.334 | 0.475 | 0.606 | 0.076 | 0.448 | 0.049 | GAR-NOSF | |
[]() | | 0.304 | 0.358 | 0.453 | 0.160 | 0.512 | 0.115 | IKFGraph | |
[]() | | 0.291 | 0.369 | 0.460 | 0.153 | 0.468 | 0.115 | AnonymousQ | |
[]() | | 0.284 | 0.335 | 0.427 | 0.156 | 0.493 | 0.110 | HGN Model-reproduce | |
Multi-Hop Paragraph Retrieval for Open-Domain Question Answering | ✓ Link | 0.270 | 0.306 | 0.403 | 0.167 | 0.473 | 0.109 | MUPPET | 2019-06-15 |
[]() | | 0.258 | 0.299 | 0.391 | 0.132 | 0.497 | 0.083 | GRN + BERT | |
[]() | | 0.255 | 0.354 | 0.463 | 0.001 | 0.432 | 0.000 | Entity-centric IR | |
Multi-Paragraph Reasoning with Knowledge-enhanced Graph Neural Network | | 0.247 | 0.277 | 0.372 | 0.127 | 0.472 | 0.070 | KGNN | 2019-11-06 |
[]() | | 0.245 | 0.284 | 0.386 | 0.147 | 0.472 | 0.086 | SAQA | |
[]() | | 0.236 | 0.273 | 0.365 | 0.122 | 0.488 | 0.074 | GRN | |
Answering while Summarizing: Multi-task Learning for Multi-hop QA with Evidence Extraction | | 0.231 | 0.287 | 0.381 | 0.142 | 0.444 | 0.087 | QFE | 2019-05-21 |
[]() | | 0.209 | 0.289 | 0.391 | 0.080 | 0.406 | 0.041 | SAFSr_model | |
[]() | | 0.175 | 0.236 | 0.320 | 0.056 | 0.400 | 0.033 | SuppBERT | |
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering | ✓ Link | 0.162 | 0.240 | 0.329 | 0.039 | 0.377 | 0.019 | Baseline Model | 2018-09-25 |
[]() | | 0.011 | 0.074 | 0.121 | 0.000 | 0.078 | 0.000 | tes | |
[]() | | 0.000 | 0.581 | 0.711 | 0.000 | 0.000 | 0.000 | PromptRank-fewshot-2-demo | |
[]() | | 0.000 | 0.581 | 0.710 | 0.000 | 0.000 | 0.000 | graph-recurrent-retriever+roberta-base w. S/R-pretraining | |
[]() | | 0.000 | 0.360 | 0.474 | 0.000 | 0.000 | 0.000 | TPReasoner w/o BERT | |
[]() | | 0.000 | 0.307 | 0.402 | 0.000 | 0.000 | 0.000 | MultiQA | |
Multi-hop Reading Comprehension through Question Decomposition and Rescoring | ✓ Link | 0.000 | 0.300 | 0.407 | 0.000 | 0.000 | 0.000 | DecompRC | 2019-06-07 |
[]() | | 0.000 | 0.300 | 0.407 | 0.000 | 0.000 | 0.000 | | |
[]() | | 0.000 | 0.080 | 0.221 | 0.000 | 0.000 | 0.000 | Mistral multi hop with very large sources | |