OpenCodePapers

multiple-choice-question-answering-mcqa-on-21

Question Answering / Multiple Choice Question Answering (MCQA)
Results over time
Leaderboard
| Paper | Code | Test Set (Acc-%) | Dev Set (Acc-%) | Model Name | Release Date |
| --- | --- | --- | --- | --- | --- |
| Towards Expert-Level Medical Question Answering with Large Language Models | ✓ Link | 72.3 | | Med-PaLM 2 (ER) | 2023-05-16 |
| Towards Expert-Level Medical Question Answering with Large Language Models | ✓ Link | 71.5 | | Med-PaLM 2 (CoT+SC) | 2023-05-16 |
| Towards Expert-Level Medical Question Answering with Large Language Models | ✓ Link | 71.3 | | Med-PaLM 2 (5-shot) | 2023-05-16 |
| Variational Open-Domain Question Answering | ✓ Link | 62.9 | 58.3 | VOD (BioLinkBERT) | 2022-09-23 |
| Can large language models reason about medical questions? | ✓ Link | 62.7 | 59.7 | Codex 5-shot CoT | 2022-07-17 |
| BioMedGPT: Open Multimodal Generative Pre-trained Transformer for BioMedicine | ✓ Link | 51.4 | | BioMedGPT-10B | 2023-08-18 |
| MedMCQA : A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering | ✓ Link | 41 | 40 | PubmedBERT (Gu et al., 2022) | 2022-03-27 |
| MedMCQA : A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering | ✓ Link | 39 | 39 | SciBERT (Beltagy et al., 2019) | 2022-03-27 |
| MedMCQA : A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering | ✓ Link | 37 | 38 | BioBERT (Lee et al., 2020) | 2022-03-27 |
| MedMCQA : A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering | ✓ Link | 33 | 35 | BERT (Devlin et al., 2019)-Base | 2022-03-27 |
| MEDITRON-70B: Scaling Medical Pretraining for Large Language Models | ✓ Link | | 66.0 | Meditron-70B (CoT + SC) | 2023-11-27 |
| Large Language Models Encode Clinical Knowledge | ✓ Link | | 57.6 | Flan-PaLM (540B, SC) | 2022-12-26 |
| Large Language Models Encode Clinical Knowledge | ✓ Link | | 56.5 | Flan-PaLM (540B, Few-shot) | 2022-12-26 |
| Large Language Models Encode Clinical Knowledge | ✓ Link | | 54.5 | PaLM (540B, Few-shot) | 2022-12-26 |
| Large Language Models Encode Clinical Knowledge | ✓ Link | | 53.6 | Flan-PaLM (540B, CoT) | 2022-12-26 |
| Galactica: A Large Language Model for Science | ✓ Link | | 52.9 | GAL 120B (zero-shot) | 2022-11-16 |
| Large Language Models Encode Clinical Knowledge | ✓ Link | | 46.2 | Flan-PaLM (62B, Few-shot) | 2022-12-26 |
| Large Language Models Encode Clinical Knowledge | ✓ Link | | 43.4 | PaLM (62B, Few-shot) | 2022-12-26 |
| Large Language Models Encode Clinical Knowledge | ✓ Link | | 34.5 | Flan-PaLM (8B, Few-shot) | 2022-12-26 |
| Galactica: A Large Language Model for Science | ✓ Link | | 32.5 | BLOOM (few-shot, k=5) | 2022-11-16 |
| Galactica: A Large Language Model for Science | ✓ Link | | 29.6 | OPT (few-shot, k=5) | 2022-11-16 |
| Large Language Models Encode Clinical Knowledge | ✓ Link | | 26.7 | PaLM (8B, Few-shot) | 2022-12-26 |