OpenCodePapers

speech-recognition-on-lrs3-ted

Speech Recognition
Dataset Link
Results over time
Leaderboard
| Paper | Code | Word Error Rate (WER) | Model Name | Release Date |
| --- | --- | --- | --- | --- |
| Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation | ✓ Link | 0.68 | Whisper | 2024-06-14 |
| Large Language Models are Strong Audio-Visual Speech Recognition Learners | ✓ Link | 0.81 | Llama-AVSR | 2024-09-18 |
| Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction | ✓ Link | 1.3 | AV-HuBERT Large | 2022-01-05 |
| Jointly Learning Visual and Auditory Speech Representations from Raw Data | ✓ Link | 1.4 | RAVEn Large | 2022-12-12 |
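
For readers unfamiliar with the metric, the following is a minimal sketch of how Word Error Rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words. It assumes the leaderboard values are percentages and that words are split on whitespace; the function name `word_error_rate` is hypothetical, and the listed papers may apply their own text normalization before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage, using whitespace tokenization (an assumption)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 100.0 * prev[-1] / len(ref)
```

Under these assumptions, a hypothesis that drops one word from a six-word reference scores 100 / 6, roughly 16.7, so a value such as 0.68 corresponds to fewer than one word error per hundred reference words.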