Singing-Voice Separation From Monaural Recordings 
Using Robust Principal Component Analysis (ICASSP 2012)


Separating singing voices from music accompaniment is an important task in many applications, such as music information retrieval, lyric recognition and alignment. Music accompaniment can be assumed to be in a low-rank subspace, because of its repetition structure; on the other hand, singing voices can be regarded as relatively sparse within songs. In this paper, based on this assumption, we propose using robust principal component analysis for singing-voice separation from music accompaniment. Moreover, we examine the separation result by using a binary time-frequency masking method. Evaluations on the MIR-1K dataset show that this method can achieve around 1∼1.4 dB higher GNSDR compared with two state-of-the-art approaches without using prior training or requiring particular features.


This package contains the MATLAB code implementing the RPCA source separation algorithm described in

"Singing-Voice Separation From Monaural Recordings Using Robust Principal Component Analysis," ICASSP 2012.

Our algorithm is composed of the following parts:

a. STFT, masking

b. Robust Principal Component Analysis (RPCA), solved using the inexact augmented Lagrange multiplier (ALM) method



d. BSS Eval toolbox Version 2.1
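
For orientation, the low-rank plus sparse decomposition behind parts (a) and (b) can be written in the standard RPCA form below. This is a sketch of the usual formulation, not a transcription of the package code: M denotes the magnitude spectrogram of the mixture (size m x n), L the low-rank accompaniment, S the sparse singing voice, and the default weight shown for lambda is the common choice from the RPCA literature. The gain g in the binary mask is an assumed tuning parameter; g = 1 simply keeps the time-frequency bins where the sparse part dominates.

```latex
% RPCA: split the mixture spectrogram M into low-rank L and sparse S
\min_{L,\,S} \; \|L\|_{*} + \lambda \,\|S\|_{1}
\quad \text{subject to} \quad M = L + S,
\qquad \lambda = \frac{1}{\sqrt{\max(m, n)}}

% Binary time-frequency mask (g is an assumed gain parameter)
M_b(i, j) =
\begin{cases}
1, & |S(i, j)| > g \, |L(i, j)| \\
0, & \text{otherwise}
\end{cases}
```

Here the nuclear norm \|L\|_{*} is the sum of the singular values of L, and \|S\|_{1} is the sum of the absolute values of the entries of S. The mask is applied to the mixture STFT, and the inverse STFT then gives the separated voice.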


The algorithm achieves state-of-the-art performance on the MIR-1K dataset in an unsupervised way.


Mix signals with different SNR values
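
As an illustration of what mixing at a given SNR means, here is a minimal Python sketch (it is not part of the MATLAB package). It treats the singing voice as the signal and the accompaniment as the noise, and rescales the accompaniment so that the voice-to-music power ratio matches the requested value. The function names `mix_at_snr` and `measure_snr_db` are hypothetical, and the SNR values in the usage note are example settings only.

```python
import math


def measure_snr_db(signal, noise):
    """Return the signal-to-noise power ratio in dB."""
    signal_power = sum(s * s for s in signal)
    noise_power = sum(n * n for n in noise)
    return 10 * math.log10(signal_power / noise_power)


def mix_at_snr(voice, music, target_snr_db):
    """Scale `music` so that voice/music power equals target_snr_db, then add.

    Returns (mixture, scaled_music). Hypothetical helper for illustration.
    """
    if len(voice) != len(music):
        raise ValueError("voice and music must have the same length")
    voice_power = sum(v * v for v in voice)
    music_power = sum(m * m for m in music)
    if voice_power == 0 or music_power == 0:
        raise ValueError("both signals must have non-zero energy")
    # Solve 10*log10(voice_power / (gain^2 * music_power)) = target_snr_db
    gain = math.sqrt(voice_power / (music_power * 10 ** (target_snr_db / 10)))
    scaled_music = [gain * m for m in music]
    mixture = [v + m for v, m in zip(voice, scaled_music)]
    return mixture, scaled_music
```

For example, calling `mix_at_snr(voice, music, -5)`, `mix_at_snr(voice, music, 0)` and `mix_at_snr(voice, music, 5)` on two equal-length lists of samples produces mixtures in which the voice is progressively louder relative to the accompaniment.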

Run rpca_mask_run.m to see how the functions are called. 

Separation examples (from MIR-1K Dataset) 

Application to lyric synchronization


MATLAB code


Related work

Deep learning for monaural source separation


Email me if you have any questions.
