OpenCodePapers

speech-recognition-on-switchboard-hub500

Speech Recognition
Leaderboard
| Paper | Code | Percentage error (%) | Model Name | Release Date |
|---|---|---|---|---|
| On the limit of English conversational speech recognition | | 4.3 | IBM (LSTM+Conformer encoder-decoder) | 2021-05-03 |
| Single headed attention based sequence-to-sequence model for state-of-the-art results on Switchboard | | 4.7 | IBM (LSTM encoder-decoder) | 2020-01-20 |
| English Conversational Telephone Speech Recognition by Humans and Machines | | 5.5 | ResNet + BiLSTMs acoustic model | 2017-03-06 |
| Achieving Human Parity in Conversational Speech Recognition | | 5.8 | Microsoft 2016b | 2016-10-17 |
| The Microsoft 2016 Conversational Speech Recognition System | | 6.2 | Microsoft 2016 | 2016-09-12 |
| The Microsoft 2016 Conversational Speech Recognition System | | 6.3 | VGG/Resnet/LACE/BiLSTM acoustic model trained on SWB+Fisher+CH, N-gram + RNNLM language model trained on Switchboard+Fisher+Gigaword+Broadcast | 2016-09-12 |
| The IBM 2016 English Conversational Telephone Speech Recognition System | | 6.6 | RNN + VGG + LSTM acoustic model trained on SWB+Fisher+CH, N-gram + "model M" + NNLM language model | 2016-04-27 |
| Achieving Human Parity in Conversational Speech Recognition | | 6.6 | CNN-LSTM | 2016-10-17 |
| The IBM 2016 English Conversational Telephone Speech Recognition System | | 6.9 | IBM 2016 | 2016-04-27 |
| The Microsoft 2016 Conversational Speech Recognition System | | 6.9 | RNNLM | 2016-09-12 |
| The IBM 2015 English Conversational Telephone Speech Recognition System | | 8.0 | IBM 2015 | 2015-05-21 |
| | | 8.5 | HMM-BLSTM trained with MMI + data augmentation (speed) + iVectors + 3 regularizations + Fisher | |
| | | 9.2 | HMM-TDNN trained with MMI + data augmentation (speed) + iVectors + 3 regularizations + Fisher (10% / 15.1% respectively trained on SWBD only) | |
| | | 10.4 | CNN on MFSC/fbanks + 1 non-conv layer for FMLLR/I-Vectors concatenated in a DNN | |
| | | 11 | HMM-TDNN + iVectors | |
| | | 11.5 | CNN | |
| Very Deep Multilingual Convolutional Neural Networks for LVCSR | | 12.2 | Deep CNN (10 conv, 4 FC layers), multi-scale feature maps | 2015-09-29 |
| | | 12.6 | HMM-DNN + sMBR | |
| | | 12.6 | DNN sMBR | |
| Deep Speech: Scaling up end-to-end speech recognition | ✓ | 12.6 | Deep Speech + FSH | 2014-12-17 |
| Deep Speech: Scaling up end-to-end speech recognition | ✓ | 12.6 | CNN + Bi-RNN + CTC (speech to letters), 25.9% WER if trained only on SWB | 2014-12-17 |
| | | 12.9 | DNN MMI | |
| | | 12.9 | DNN MPE | |
| | | 12.9 | DNN BMMI | |
| | | 12.9 | HMM-TDNN + pNorm + speed up/down speech | |
| Building DNN Acoustic Models for Large Vocabulary Speech Recognition | ✓ | 15 | DNN + Dropout | 2014-06-30 |
| Building DNN Acoustic Models for Large Vocabulary Speech Recognition | ✓ | 16 | DNN | 2014-06-30 |
| | | 16.1 | CD-DNN | |
| | | 18.5 | DNN-HMM | |
| Deep Speech: Scaling up end-to-end speech recognition | ✓ | 20 | Deep Speech | 2014-12-17 |