OpenCodePapers

Question Answering on MedQA-USMLE

Question Answering
Leaderboard
| Paper | Code | Accuracy | Model Name | Release Date |
|---|---|---|---|---|
| Capabilities of Gemini Models in Medicine | | 91.1 | Med-Gemini | 2024-04-29 |
| Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine | ✓ | 90.2 | GPT-4 | 2023-11-28 |
| Towards Expert-Level Medical Question Answering with Large Language Models | ✓ | 85.4 | Med-PaLM 2 | 2023-05-16 |
| Towards Expert-Level Medical Question Answering with Large Language Models | ✓ | 83.7 | Med-PaLM 2 (CoT + SC) | 2023-05-16 |
| Towards Expert-Level Medical Question Answering with Large Language Models | ✓ | 79.7 | Med-PaLM 2 (5-shot) | 2023-05-16 |
| MedMobile: A mobile-sized language model with expert-level clinical capabilities | ✓ | 75.7 | MedMobile (3.8B) | 2024-10-11 |
| Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks | | 74.3 | Meerkat-7B | 2024-03-30 |
| Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks | | 70.6 | Meerkat-7B (Single) | 2024-03-30 |
| MEDITRON-70B: Scaling Medical Pretraining for Large Language Models | ✓ | 70.2 | Meditron-70B (CoT + SC) | 2023-11-27 |
| Large Language Models Encode Clinical Knowledge | ✓ | 67.6 | Flan-PaLM (540 B) | 2022-12-26 |
| MEDITRON-70B: Scaling Medical Pretraining for Large Language Models | ✓ | 61.5 | LLAMA-2 (70B SC CoT) | 2023-11-27 |
| SHAKTI: A 2.5 Billion Parameter Small Language Model Optimized for Edge AI and Low-Resource Environments | | 60.3 | Shakti-LLM (2.5B) | 2024-10-15 |
| Can large language models reason about medical questions? | ✓ | 60.2 | Codex 5-shot CoT | 2022-07-17 |
| MEDITRON-70B: Scaling Medical Pretraining for Large Language Models | ✓ | 59.2 | LLAMA-2 (70B) | 2023-11-27 |
| Variational Open-Domain Question Answering | ✓ | 55.0 | VOD (BioLinkBERT) | 2022-09-23 |
| BioMedGPT: Open Multimodal Generative Pre-trained Transformer for BioMedicine | ✓ | 50.4 | BioMedGPT-10B | 2023-08-18 |
| Large Language Models Encode Clinical Knowledge | ✓ | 50.3 | PubMedGPT (2.7 B) | 2022-12-26 |
| Deep Bidirectional Language-Knowledge Graph Pretraining | ✓ | 47.5 | DRAGON + BioLinkBERT | 2022-10-17 |
| Large Language Models Encode Clinical Knowledge | ✓ | 45.1 | BioLinkBERT (340 M) | 2022-12-26 |
| Galactica: A Large Language Model for Science | ✓ | 44.4 | GAL 120B (zero-shot) | 2022-11-16 |
| LinkBERT: Pretraining Language Models with Document Links | ✓ | 40.0 | BioLinkBERT (base) | 2022-03-29 |
| GrapeQA: GRaph Augmentation and Pruning to Enhance Question-Answering | | 39.51 | GrapeQA: PEGA | 2023-03-22 |
| BioBERT: a pre-trained biomedical language representation model for biomedical text mining | ✓ | 36.7 | BioBERT (large) | 2019-01-25 |
| BioBERT: a pre-trained biomedical language representation model for biomedical text mining | ✓ | 34.1 | BioBERT (base) | 2019-01-25 |
| Large Language Models Encode Clinical Knowledge | ✓ | 33.3 | GPT-Neo (2.7 B) | 2022-12-26 |
| Galactica: A Large Language Model for Science | ✓ | 23.3 | BLOOM (few-shot, k=5) | 2022-11-16 |
| Galactica: A Large Language Model for Science | ✓ | 22.8 | OPT (few-shot, k=5) | 2022-11-16 |