OpenCodePapers

Lipreading on LRW-1000

Natural Language Transduction, Lipreading

Leaderboard

Paper	Code	Top-1 Accuracy	ModelName	ReleaseDate
SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization	✓ Link	58.2%	SyncVSR (Word Boundary)	2024-06-18
Learn an Effective Lip Reading Model without Pains	✓ Link	55.7%	3D-ResNet + Bi-GRU + MixUp + Label Smooth + Cosine LR (Word Boundary)	2020-11-15
Distinguishing Homophenes Using Multi-Head Visual-Audio Memory for Lip Reading	✓ Link	53.8%	3D Conv + ResNet-18 + MS-TCN + Multi-Head Visual-Audio Memory	2022-04-04
Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video	✓ Link	50.82%	3D Conv + ResNet-18 + Bi-GRU + Visual-Audio Memory	2022-04-04
Learn an Effective Lip Reading Model without Pains	✓ Link	48.3%	3D-ResNet + Bi-GRU + MixUp + Label Smooth + Cosine LR	2020-11-15
Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition	✓ Link	45.24%	3D Conv + ResNet-18 + Bi-GRU (Face Cutout)	2020-03-06
Deformation Flow Based Two-Stream Network for Lip Reading	✓ Link	41.93%	DFTN	2020-03-12
Mutual Information Maximization for Effective Lip Reading	✓ Link	38.79%	GLMIM	2020-03-13
Pseudo-Convolutional Policy Gradient for Sequence-to-Sequence Lip-Reading		38.7%	PCPG	2020-03-09