OpenCodePapers

Text-To-Speech Synthesis on LJSpeech

Leaderboard

Paper	Code	Audio Quality MOS	Pleasantness MOS	Word Error Rate (WER)	MOS	WER (%)	ModelName	ReleaseDate
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality	✓ Link	4.56					NaturalSpeech	2022-05-09
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality	✓ Link	4.43					VITS	2022-05-09
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech	✓ Link	4.37					Grad-TTS + HiFiGAN (1000 steps)	2021-05-13
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search	✓ Link	4.34					Glow-TTS + HiFiGAN	2020-05-22
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality	✓ Link	4.34					FastSpeech 2 + HiFiGAN	2022-05-09
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech	✓ Link	4.32					FastSpeech 2 + HiFiGAN	2020-06-08
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis	✓ Link	4.28					FastDiff (4 steps)	2022-04-21
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis	✓ Link	4.03					FastDiff-TTS	2022-04-21
Neural Speech Synthesis with Transformer Network	✓ Link	3.88					Transformer TTS (Mel + WaveGlow)	2018-09-19
FastSpeech: Fast, Robust and Controllable Text to Speech	✓ Link	3.84					FastSpeech (Mel + WaveGlow)	2019-05-22
OverFlow: Putting flows on top of neural transducers for better TTS	✓ Link	3.37		2.30			OverFlow	2022-11-13
FastSpeech: Fast, Robust and Controllable Text to Speech	✓ Link	2.4					Merlin	2019-05-22
(no paper linked)		1.25					temp
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis	✓ Link		3.665				Flowtron	2020-05-12
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis	✓ Link		3.521				Tacotron 2	2020-05-12
Matcha-TTS: A fast TTS architecture with conditional flow matching	✓ Link				3.84	2.09	Matcha-TTS	2023-09-06