chart-question-answering-on-chartqa

Chart Question Answering

Leaderboard

Paper	Code	1:1 Accuracy	Model Name	Release Date
Chart-based Reasoning: Transferring Capabilities from LLMs to VLMs		81.3	ChartPaLI-5B + PaLM 2-S	2024-03-19
Gemini: A Family of Highly Capable Multimodal Models	✓ Link	80.8	Gemini Ultra	2023-12-19
DePlot: One-shot visual language reasoning by plot-to-table translation	✓ Link	79.3	DePlot+FlanPaLM+Codex (PoT Self-Consistency)	2022-12-20
Chart-based Reasoning: Transferring Capabilities from LLMs to VLMs		77.3	ChartPaLI-5B	2024-03-19
DePlot: One-shot visual language reasoning by plot-to-table translation	✓ Link	76.7	DePlot+Codex (PoT Self-Consistency)	2022-12-20
ScreenAI: A Vision-Language Model for UI and Infographics Understanding	✓ Link	76.7	ScreenAI 5B (4.62 B params, w/ OCR)	2024-02-07
Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts		74.6	SMoLA-PaLI-X Specialist Model	2023-12-01
Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts		73.8	SMoLA-PaLI-X Generalist Model	2023-12-01
Synthesize Step-by-Step: Tools Templates and LLMs as Data Generators for Reasoning-Based Chart VQA		72.64	MatCha4096 + LaMenDa	2024-01-01
PaLI-X: On Scaling up a Multilingual Vision and Language Model	✓ Link	72.3	PaLI-X (Single-task FT w/ OCR)	2023-05-29
PaLI-X: On Scaling up a Multilingual Vision and Language Model	✓ Link	70.9	PaLI-X (Single-task FT)	2023-05-29
PaLI-X: On Scaling up a Multilingual Vision and Language Model	✓ Link	70.6	PaLI-X (Multi-task FT)	2023-05-29
DePlot: One-shot visual language reasoning by plot-to-table translation	✓ Link	70.5	DePlot+FlanPaLM (Self-Consistency)	2022-12-20
PaLI-3 Vision Language Models: Smaller, Faster, Stronger	✓ Link	70.0	PaLI-3	2023-10-13
PaLI-3 Vision Language Models: Smaller, Faster, Stronger	✓ Link	69.5	PaLI-3 (w/ OCR)	2023-10-13
DePlot: One-shot visual language reasoning by plot-to-table translation	✓ Link	67.3	DePlot+FlanPaLM (CoT)	2022-12-20
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond	✓ Link	66.3	Qwen-VL-Chat	2023-08-24
UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning	✓ Link	66.24	UniChart	2023-05-24
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond	✓ Link	65.7	Qwen-VL	2023-08-24
StructChart: On the Schema, Metric, and Augmentation for Visual Chart Understanding	✓ Link	65.3	StructChart+GPT3.5 (STR ChartQA+SimChart9K)	2023-09-20
MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering	✓ Link	64.2	MatCha	2022-12-19
StructChart: On the Schema, Metric, and Augmentation for Visual Chart Understanding	✓ Link	60.7	StructChart+GPT3.5 (STR)	2023-09-20
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding	✓ Link	58.6	Pix2Struct-large	2022-10-07
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding	✓ Link	56.0	Pix2Struct-base	2022-10-07
ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning	✓ Link	45.5	VisionTapas-OCR	2022-03-19
DePlot: One-shot visual language reasoning by plot-to-table translation	✓ Link	42.3	DePlot+GPT3 (Self-Consistency)	2022-12-20
DePlot: One-shot visual language reasoning by plot-to-table translation	✓ Link	36.9	DePlot+GPT3 (CoT)	2022-12-20

OpenCodePapers
