OpenCodePapers

Automatic Speech Recognition on LRS2

Automatic Speech Recognition (ASR)
Leaderboard
| Paper | Code | Test WER (%) | Model Name | Release Date |
| --- | --- | --- | --- | --- |
| Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation | ✓ | 1.3 | Whisper | 2024-06-14 |
| Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels | ✓ | 1.5 | CTC/Attention | 2023-03-25 |
| Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition | ✓ | 2.7 | MoCo + wav2vec (w/o extLM) | 2022-02-24 |
| End-to-end Audio-visual Speech Recognition with Conformers | ✓ | 3.9 | End2end Conformer | 2021-02-12 |
| Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition | ✓ | 6.6 | Whisper-LLaMA | 2023-10-10 |
| Audio-visual Recognition of Overlapped speech for the LRS2 dataset | | 6.7 | LF-MMI TDNN | 2020-01-06 |
| Audio-Visual Speech Recognition With A Hybrid CTC/Attention Architecture | | 8.2 | CTC/attention | 2018-09-28 |
| Deep Audio-Visual Speech Recognition | ✓ | 9.7 | TM-seq2seq | 2018-09-06 |
| Deep Audio-Visual Speech Recognition | ✓ | 10.1 | TM-CTC | 2018-09-06 |
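
The leaderboard ranks models by test WER (word error rate), where lower is better. As a rough illustration of the metric, the sketch below assumes the standard definition: the word-level edit distance (substitutions, deletions, and insertions) between reference and hypothesis, divided by the number of reference words. The function name `word_error_rate` is hypothetical, and the listed papers may apply their own text normalization (casing, punctuation) before scoring, which this sketch does not attempt.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a fraction: word-level edit distance / reference length.

    Assumption: words are separated by whitespace and compared exactly;
    no casing or punctuation normalization is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Illustrative sentences only, not taken from LRS2.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {100 * wer:.1f}%")
```

In the illustrative call, the hypothesis has one substituted word and one missing word against a six-word reference, so the function returns 2/6. Multiplying by 100 gives a percentage on the same scale as the table. WER can exceed 100% when the hypothesis contains many insertions.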