Paper | Code | Word Error Rate (WER) | Model Name | Release Date |
---|---|---|---|---|
Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels | ✓ Link | 19.1 | CTC/Attention | 2023-03-25 |
Sub-word Level Lip Reading With Visual Attention | | 30.7 | VTP with more data | 2021-10-14 |
Sub-word Level Lip Reading With Visual Attention | | 40.6 | VTP | 2021-10-14 |