Paper | Code | Test WER | Model Name | Release Date |
---|---|---|---|---|
Step-Audio 2 Technical Report | | 5.95% | Step-Audio 2 | 2025-08-27 |
Step-Audio 2 Technical Report | ✓ Link | 6.76% | Step-Audio 2 mini | 2025-08-27 |
NeMo: a toolkit for building AI applications using Neural Modules | ✓ Link | 6.99% | canary-1b-flash | 2025-03-07 |
Step-Audio 2 Technical Report | ✓ Link | 7.83% | Kimi-Audio | 2025-04-25 |
NeMo: a toolkit for building AI applications using Neural Modules | ✓ Link | 7.97% | canary-1b | 2024-02-08 |
NeMo: a toolkit for building AI applications using Neural Modules | ✓ Link | 8.0% | ConformerCTC-L | 2019-09-14 |
Step-Audio 2 Technical Report | | 8.33% | Qwen Omni | 2025-08-27 |
Scribosermo: Fast Speech-to-Text models for German and other Languages | ✓ Link | 9.06% | ConformerCTC-L (5-gram) | 2021-10-15 |
Step-Audio 2 Technical Report | | 9.20% | Doubao LLM ASR | 2025-08-27 |
Step-Audio 2 Technical Report | | 9.30% | GPT-4o Transcribe | 2025-08-27 |
Scribosermo: Fast Speech-to-Text models for German and other Languages | ✓ Link | 14.38% | ConformerCTC-L (5-gram, charbased) | 2021-10-15 |