OpenCodePapers

Lipreading on LRW-1000

Natural Language Transduction / Lipreading
Leaderboard
| Paper | Code | Top-1 Accuracy (%) | Model | Release Date |
|---|---|---|---|---|
| SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization | ✓ | 58.2 | SyncVSR (Word Boundary) | 2024-06-18 |
| Learn an Effective Lip Reading Model without Pains | ✓ | 55.7 | 3D-ResNet + Bi-GRU + MixUp + Label Smooth + Cosine LR (Word Boundary) | 2020-11-15 |
| Distinguishing Homophenes Using Multi-Head Visual-Audio Memory for Lip Reading | ✓ | 53.8 | 3D Conv + ResNet-18 + MS-TCN + Multi-Head Visual-Audio Memory | 2022-04-04 |
| Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video | ✓ | 50.82 | 3D Conv + ResNet-18 + Bi-GRU + Visual-Audio Memory | 2022-04-04 |
| Learn an Effective Lip Reading Model without Pains | ✓ | 48.3 | 3D-ResNet + Bi-GRU + MixUp + Label Smooth + Cosine LR | 2020-11-15 |
| Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition | ✓ | 45.24 | 3D Conv + ResNet-18 + Bi-GRU (Face Cutout) | 2020-03-06 |
| Deformation Flow Based Two-Stream Network for Lip Reading | ✓ | 41.93 | DFTN | 2020-03-12 |
| Mutual Information Maximization for Effective Lip Reading | ✓ | 38.79 | GLMIM | 2020-03-13 |
| Pseudo-Convolutional Policy Gradient for Sequence-to-Sequence Lip-Reading | | 38.7 | PCPG | 2020-03-09 |