OpenCodePapers

speech-separation-on-wsj0-2mix

Speech Separation
Leaderboard
| Paper | Code | SI-SDRi | SDRi | Number of parameters (M) | MACs (G) | Model Name | Release Date |
|---|---|---|---|---|---|---|---|
| TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement | ✓ Link | 25.1 | 25.2 | 22.5 | | TF-Locoformer (L) + DM | 2024-08-06 |
| Separate and Reconstruct: Asymmetric Encoder-Decoder for Speech Separation | ✓ Link | 25.1 | 25.2 | 59.4 | 155.5 | SepReformer-L | 2024-06-10 |
| TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement | ✓ Link | 24.6 | 24.7 | 15.0 | | TF-Locoformer (M) + DM | 2024-08-06 |
| TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement | ✓ Link | 24.2 | 24.3 | 22.5 | | TF-Locoformer (L) | 2024-08-06 |
| MossFormer2: Combining Transformer and RNN-Free Recurrent Network for Enhanced Time-Domain Monaural Speech Separation | ✓ Link | 24.1 | | 55.7 | | MossFormer2 (L) | 2023-12-19 |
| Boosting Unknown-number Speaker Separation with Transformer Decoder-based Attractor | | 24.0 | | | | SepTDA (L=12) | 2024-01-23 |
| Separate And Diffuse: Using a Pretrained Diffusion Model for Improving Source Separation | | 23.9 | | | | Separate And Diffuse | 2023-01-25 |
| TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement | ✓ Link | 23.6 | 23.8 | 15.0 | | TF-Locoformer (M) | 2024-08-06 |
| TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement | ✓ Link | 22.8 | 23 | 5.0 | | TF-Locoformer (S) + DM | 2024-08-06 |
| MossFormer: Pushing the Performance Limit of Monaural Speech Separation using Gated Single-Head Transformer with Convolution-Augmented Joint Self-Attentions | ✓ Link | 22.8 | | 42.1 | 86.1 | MossFormer (L) + DM | 2023-02-23 |
| SepMamba: State-space models for speaker separation using Mamba | ✓ Link | 22.7 | 22.9 | | | SepMamba + DM (M) | 2024-10-28 |
| SPGM: Prioritizing Local Features for enhanced speech separation performance | ✓ Link | 22.7 | | 26.2 | 77 | SPGM + DM | 2023-09-22 |
| MossFormer: Pushing the Performance Limit of Monaural Speech Separation using Gated Single-Head Transformer with Convolution-Augmented Joint Self-Attentions | ✓ Link | 22.5 | | | | MossFormer (M) + DM | 2023-02-23 |
| SepIt: Approaching a Single Channel Speech Separation Bound | | 22.4 | | | | SepIt | 2022-05-24 |
| Attention is All You Need in Speech Separation | ✓ Link | 22.3 | 22.4 | | | SepFormer | 2020-10-25 |
| Wavesplit: End-to-End Speech Separation by Speaker Clustering | | 22.2 | 22.3 | | | Wavesplit v2 | 2020-02-20 |
| SPGM: Prioritizing Local Features for enhanced speech separation performance | ✓ Link | 22.1 | | 26.2 | 77 | SPGM | 2023-09-22 |
| TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement | ✓ Link | 22 | 22.1 | 5.0 | | TF-Locoformer (S) | 2024-08-06 |
| Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training | ✓ Link | 21.3 | 21.5 | | | DPTNet (Libri1Mix speech enhancement pre-trained) | 2020-10-29 |
| SepMamba: State-space models for speaker separation using Mamba | ✓ Link | 21.2 | 21.4 | | | SepMamba + DM (S) | 2024-10-28 |
| On Time Domain Conformer Models for Monaural Speech Separation in Noisy Reverberant Acoustic Environments | ✓ Link | 21.2 | | | | TD-Conformer (XL) + DM | 2023-10-09 |
| Sandglasset: A Light Multi-Granularity Self-attentive Network For Time-Domain Speech Separation | ✓ Link | 21.0 | | | | Sandglasset | 2021-03-01 |
| Effective Low-Cost Time-Domain Audio Separation Using Globally Attentive Locally Recurrent Networks | ✓ Link | 20.3 | | | | GALR | 2021-01-13 |
| Dual-Path Transformer Network: Direct Context-Aware Modeling for End-to-End Monaural Speech Separation | ✓ Link | 20.2 | | | | DPTNet | 2020-07-28 |
| Voice Separation with an Unknown Number of Multiple Speakers | ✓ Link | 20.12 | | | | Gated DualPathRNN | 2020-02-29 |
| Compute and memory efficient universal sound source separation | ✓ Link | 19.5 | | | | Sudo rm -rf (U=36) | 2021-03-03 |
| Wavesplit: End-to-End Speech Separation by Speaker Clustering | | 19.0 | | | | Wavesplit v1 | 2020-02-20 |
| Sudo rm -rf: Efficient Networks for Universal Audio Source Separation | ✓ Link | 18.9 | | | | Sudo rm -rf XL | 2020-07-14 |
| Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation | ✓ Link | 18.8 | | | | Dual-path RNN | 2019-10-14 |
| Divide and Conquer: A Deep CASA Approach to Talker-independent Monaural Speaker Separation | ✓ Link | 17.7 | | | | DeepCASA | 2019-04-25 |
| Interrupted and cascaded permutation invariant training for speech separation | ✓ Link | 17.5 | | | | IAC-PIT Tasnet | 2019-10-28 |
| Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation | ✓ Link | 17.2 | 17.4 | 3.6 | 3.7 | Deformable TCN + Dynamic Mixing | 2022-10-27 |
| Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering | ✓ Link | 16.6 | | | | Hybrid-Tasnet | 2019-04-16 |
| Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation | ✓ Link | 16.1 | 16.3 | 1.3 | 3.7 | Deformable TCN + Shared Weights + Dynamic Mixing | 2022-10-27 |
| Two-Step Sound Source Separation: Training on Learned Latent Targets | ✓ Link | 16.1 | | | | Two-step Conv-TasNet | 2019-10-22 |
| Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation | ✓ Link | 15.3 | 15.6 | 5.1 | | Conv-TasNet | 2018-09-20 |
| Real-time Single-channel Dereverberation and Separation with Time-domain Audio Separation Network | ✓ Link | 13.2 | | | | TasNet v2 | 2018-09-02 |
| Alternative Objective Functions for Deep Clustering | ✓ Link | 11.5 | | | | Chimera++ | 2018-04-01 |
| TasNet: time-domain audio separation network for real-time, single-channel speech separation | ✓ Link | 10.8 | | | | TasNet | 2017-11-01 |
| Deep clustering: Discriminative embeddings for segmentation and separation | ✓ Link | 10.8 | | | | Deep Clustering ++ | 2015-08-18 |
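
For readers unfamiliar with the SI-SDRi column, the following is a minimal sketch of how a scale-invariant SDR improvement is commonly computed: the SI-SDR of the separated estimate minus the SI-SDR of the unprocessed mixture, both measured against the clean reference, in dB. The function names `si_sdr` and `si_sdr_improvement`, the zero-mean normalization, and the small `eps` constant are assumptions made for illustration; the leaderboard does not state how each paper computed its numbers, and the SDRi column typically comes from a different (BSS Eval style) metric.

```python
import math


def si_sdr(estimate, target, eps=1e-12):
    """Scale-invariant SDR in dB for two equal-length sample sequences.

    Illustrative sketch only: zero-mean normalization and eps are assumed
    conventions, not taken from the leaderboard.
    """
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")

    # Remove the mean of each signal.
    est_mean = sum(estimate) / len(estimate)
    tgt_mean = sum(target) / len(target)
    est = [e - est_mean for e in estimate]
    tgt = [t - tgt_mean for t in target]

    # Project the estimate onto the target to find the optimal scaling.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt)
    alpha = dot / (tgt_energy + eps)

    # Split the estimate into a scaled-target part and a residual part.
    scaled_target = [alpha * t for t in tgt]
    residual = [e - s for e, s in zip(est, scaled_target)]

    signal_energy = sum(s * s for s in scaled_target)
    residual_energy = sum(r * r for r in residual)
    return 10.0 * math.log10((signal_energy + eps) / (residual_energy + eps))


def si_sdr_improvement(estimate, mixture, target):
    """SI-SDRi: gain of the estimate over the unprocessed mixture, in dB."""
    return si_sdr(estimate, target) - si_sdr(mixture, target)


# Toy usage: two sinusoids stand in for two speakers.
n = 1000
speaker_a = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
speaker_b = [math.sin(2 * math.pi * 11 * i / n) for i in range(n)]
mixture = [a + b for a, b in zip(speaker_a, speaker_b)]
# A hypothetical separator output that suppresses speaker_b by a factor of 10.
estimate = [a + 0.1 * b for a, b in zip(speaker_a, speaker_b)]
improvement = si_sdr_improvement(estimate, mixture, speaker_a)
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate leaves the score unchanged, which is the property the "SI" prefix refers to.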