OpenCodePapers

Visual Question Answering on VizWiz 2020 VQA

Visual Question Answering (VQA)

Leaderboard

Paper	Code	Overall	Yes/No	Number	Other	Unanswerable	Model Name	Release Date
PaLI: A Jointly-Scaled Multilingual Language-Image Model	✓ Link	73.3					PaLI	2022-09-14
Less Is More: Linear Layers on CLIP Features as Powerful VizWiz Model		61.64					CLIP-Ensemble	2022-06-10
Less Is More: Linear Layers on CLIP Features as Powerful VizWiz Model		60.66					CLIP-Single	2022-06-10
N/A		56.33	78.89	27.1	42.3	89.49	HSSLab
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization	✓ Link	56.0					Video-LaVIT	2024-02-05
N/A		55.93	73.45	26.83	42.29	88.95	sudoku
N/A		54.76	80.52	27.37	40.92	86.82	Katya
N/A		49.58	59.79	20.6	34.14	88.26	Modified Attention
N/A		48.39	60.65	22.22	34.21	83.43	shaunakh
N/A		44.9	60.08	18.16	28.88	84.13	e50
N/A		44.62	63.8	18.97	28.12	84.32	SKP
N/A		44.01	53.01	17.34	27.34	85.86	knight777
N/A		41.92	49.86	18.7	26.13	81.54	pk
N/A		34.96	60.08	23.04	19.05	71.45	Tartans
N/A		34.13	25.31	14.09	17.57	78.2	VWTest1
N/A		6.25	79.85	2.71	1.21	7.13	BERT-RG