5/1/2018 Tue lab meeting
Post date: May 2, 2018 10:34:46 AM
We discussed what to do for the CeSLea (digital companion) project.
Text-dependent speaker recognition
needs keyword spotting using WordNet
background model for rejection - research topic
Text-independent speaker recognition
i-vector
ergodic (fully connected) HMM
RNN / LSTM
needs variety of training sentences
Speaker segmentation
finds when the enrolled speakers are talking
finds when there is speaker change (not enrolled)
Speech separation
using extra information, as Google did recently on YouTube videos
general source separation
Rejection
likelihood ratio test (log-likelihood difference test) - difference between the 1st- and 2nd-ranked speakers' scores, or between the 1st score and the average score
normalized Z-test: accept if z = (score - mean) / std > threshold; thresholds by two-sided confidence level:
90% CI 1.645
95% CI 1.960
99% CI 2.576
99.5% CI 2.807
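
The two rejection rules above can be sketched in Python as follows. This is only an illustration under assumptions not settled in the meeting: scores are per-speaker log-likelihoods, the mean and std of the Z-test are taken over the non-winning (cohort) speakers, and the names z_critical, margin_test, and z_test are hypothetical. The listed thresholds are two-sided critical values of the standard normal, so a one-sided test z > 1.645 corresponds to a 95% one-sided level.

```python
from statistics import NormalDist, mean, stdev
from typing import Dict, Optional


def z_critical(confidence: float) -> float:
    """Two-sided critical value of the standard normal for a confidence level."""
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)


def margin_test(scores: Dict[str, float], threshold: float,
                use_average: bool = False) -> Optional[str]:
    """Log-likelihood difference test.

    Compares the best score with the runner-up, or with the average of all
    other scores when use_average is True. Returns the best speaker's name,
    or None to reject.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_name, best_score = ranked[0]
    rest = [s for _, s in ranked[1:]]
    reference = mean(rest) if use_average else rest[0]
    return best_name if best_score - reference > threshold else None


def z_test(scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Normalized Z-test: z = (score - mean) / std > threshold.

    Assumption: mean and std come from the other speakers' scores, so at
    least two other speakers are needed for the sample std.
    """
    best_name = max(scores, key=lambda name: scores[name])
    others = [s for name, s in scores.items() if name != best_name]
    z = (scores[best_name] - mean(others)) / stdev(others)
    return best_name if z > threshold else None


if __name__ == "__main__":
    # Made-up log-likelihood scores, for illustration only.
    clear = {"alice": -10.0, "bob": -20.0, "carol": -22.0, "dave": -24.0}
    close = {"alice": -19.0, "bob": -20.0, "carol": -22.0, "dave": -24.0}
    for level in (0.90, 0.95, 0.99, 0.995):
        print(level, round(z_critical(level), 3))
    print(margin_test(clear, threshold=5.0), margin_test(close, threshold=5.0))
    print(z_test(clear, z_critical(0.99)), z_test(close, z_critical(0.90)))
```

The margin test needs a threshold tuned on held-out data, while the Z-test threshold can be read from the table above; which one works better for the background model is the open research question.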