Separating singing voices from music accompaniment is an important task in many applications, such as music information retrieval, lyric recognition and alignment. Music accompaniment can be assumed to be in a low-rank subspace, because of its repetition structure; on the other hand, singing voices can be regarded as relatively sparse within songs. In this paper, based on this assumption, we propose using robust principal component analysis for singing-voice separation from music accompaniment. Moreover, we examine the separation result by using a binary time-frequency masking method. Evaluations on the MIR-1K dataset show that this method can achieve around 1∼1.4 dB higher GNSDR compared with two state-of-the-art approaches without using prior training or requiring particular features.
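The low-rank plus sparse assumption above can be written as the standard RPCA convex program, followed by the binary time-frequency mask. This is a sketch in notation chosen here, not necessarily the paper's: M is the magnitude spectrogram of the mixture, L the low-rank part (accompaniment), S the sparse part (singing voice), \|\cdot\|_* the nuclear norm, \lambda > 0 a trade-off parameter, and "gain" a threshold parameter assumed for illustration.

```latex
\begin{aligned}
&\min_{L,\,S}\ \|L\|_{*} + \lambda \|S\|_{1}
  \quad \text{subject to} \quad M = L + S, \\[4pt]
&M_b(m,n) =
  \begin{cases}
    1, & |S(m,n)| > \text{gain} \cdot |L(m,n)|, \\
    0, & \text{otherwise.}
  \end{cases}
\end{aligned}
```

Multiplying the mixture STFT elementwise by M_b gives an estimate of the singing voice, and multiplying by 1 - M_b gives an estimate of the accompaniment.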
This package contains the MATLAB code implementing the RPCA source separation algorithm described in
"Singing-Voice Separation From Monaural Recordings Using Robust Principal Component Analysis," ICASSP 2012.
The package is composed of the following parts:
a. STFT and masking
b. Robust principal component analysis (RPCA), solved using the inexact augmented Lagrange multiplier (ALM) method
d. BSS Eval toolbox Version 2.1
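To make parts (a) and (b) above more concrete, here is a minimal sketch written in Python rather than the package's MATLAB. It is not the package's code. It shows only the elementwise shrinkage (soft-thresholding) operator that inexact ALM solvers apply to matrix entries and singular values, and the binary time-frequency masking step; the SVD-based low-rank update and the STFT itself are omitted. The function names and the `gain` parameter are assumptions made for illustration.

```python
def soft_threshold(x, tau):
    """Shrink a real value toward zero by tau: sign(x) * max(|x| - tau, 0)."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def binary_mask(low_rank_mag, sparse_mag, gain=1.0):
    """Return a 0/1 mask that is 1 where the sparse (voice) magnitude
    exceeds gain times the low-rank (accompaniment) magnitude."""
    return [
        [1 if s > gain * l else 0 for l, s in zip(l_row, s_row)]
        for l_row, s_row in zip(low_rank_mag, sparse_mag)
    ]


def apply_mask(mixture_stft, mask):
    """Split a complex mixture STFT into voice and accompaniment estimates
    by elementwise multiplication with the mask and its complement."""
    voice = [[x * b for x, b in zip(x_row, b_row)]
             for x_row, b_row in zip(mixture_stft, mask)]
    music = [[x * (1 - b) for x, b in zip(x_row, b_row)]
             for x_row, b_row in zip(mixture_stft, mask)]
    return voice, music


# Toy 2x2 example: magnitudes of the low-rank and sparse parts,
# and the complex STFT of the mixture they were estimated from.
L_mag = [[1.0, 0.2], [0.5, 0.5]]
S_mag = [[0.5, 0.9], [0.5, 2.0]]
mixture = [[1 + 1j, 2j], [3 + 0j, 4 - 1j]]

mask = binary_mask(L_mag, S_mag, gain=1.0)
voice, music = apply_mask(mixture, mask)
```

Because the mask is binary, every time-frequency bin of the mixture is assigned to exactly one of the two estimates, so the two estimates always sum back to the mixture.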
The algorithm achieves state-of-the-art performance on the MIR-1K dataset in an unsupervised way.
To mix signals at different SNR values, see http://labrosa.ee.columbia.edu/projects/renoiser/
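As a rough illustration of what mixing at a given SNR involves, the following Python sketch rescales the accompaniment so that the voice-to-accompaniment power ratio matches a target value in dB. It is an assumption-laden example and not the method used by the tool linked above; the function names are hypothetical.

```python
import math


def power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def measure_snr_db(signal, noise):
    """Power ratio of signal to noise, in decibels."""
    return 10 * math.log10(power(signal) / power(noise))


def mix_at_snr(voice, music, target_snr_db):
    """Rescale music so that voice/music power equals target_snr_db,
    then return (mixture, scaled_music)."""
    if len(voice) != len(music):
        raise ValueError("voice and music must have the same length")
    p_voice, p_music = power(voice), power(music)
    if p_voice == 0 or p_music == 0:
        raise ValueError("both signals must have non-zero power")
    # Solve p_voice / (gain^2 * p_music) = 10^(target / 10) for gain.
    gain = math.sqrt(p_voice / (p_music * 10 ** (target_snr_db / 10)))
    scaled_music = [gain * m for m in music]
    mixture = [v + m for v, m in zip(voice, scaled_music)]
    return mixture, scaled_music


# Toy signals: two sinusoids standing in for voice and accompaniment.
voice = [math.sin(0.10 * n) for n in range(1000)]
music = [0.3 * math.sin(0.37 * n + 1.0) for n in range(1000)]
mixture, scaled_music = mix_at_snr(voice, music, target_snr_db=-5.0)
```

Only the accompaniment is rescaled here, so the voice keeps its original level across all SNR settings.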
Run rpca_mask_run.m to see how the functions are called.
Separation examples (from MIR-1K Dataset)
Application to lyric synchronization
Deep learning for monaural source separation
Email me if you have any questions.