We propose a method for clustering data in which every data point has missing feature values.
We extend the sum-of-norms clustering technique to account for missing entries using a non-convex l0 fusion penalty, and we obtain theoretical guarantees for successful clustering with this formulation. We then relax the resulting optimization problem using an H1 penalty and solve it efficiently with an iteratively reweighted least squares (IRLS) approach.
The proposed scheme can successfully recover the original clustering for simulated and real datasets with a large fraction of missing feature values.
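To make the IRLS relaxation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a saturating fusion penalty of the form 1 - exp(-d^2 / (2 sigma^2)) on pairwise centroid distances, which may differ from the H1 penalty used in the paper. The function name irls_fusion_cluster, the parameters lam, sigma, n_iter and ridge, and the mean-based initialization are all hypothetical choices made for illustration. Only observed entries enter the data-fidelity term; each IRLS step reweights the pairwise differences and solves one linear system per feature.

```python
import numpy as np

def irls_fusion_cluster(X, mask, lam=1.0, sigma=1.0, n_iter=20, ridge=1e-8):
    """Illustrative IRLS sketch for fusion-penalized clustering with missing entries.

    X    : (n, p) data; entries where mask is False are ignored (may be NaN).
    mask : (n, p) boolean array, True where the feature value is observed.
    Returns U, an (n, p) array of per-point centroid estimates; points whose
    rows (nearly) coincide are read as belonging to the same cluster.
    """
    X = np.asarray(X, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    Xz = np.where(mask, X, 0.0)               # zero out missing entries
    n, p = Xz.shape

    # Assumed initialization: fill missing entries with the observed feature mean.
    counts = np.maximum(mask.sum(axis=0), 1)
    col_mean = Xz.sum(axis=0) / counts
    U = np.where(mask, Xz, col_mean)

    for _ in range(n_iter):
        # Pairwise squared distances between the current centroid estimates.
        diff = U[:, None, :] - U[None, :, :]
        d2 = np.sum(diff ** 2, axis=2)

        # IRLS weights: derivative of the assumed penalty with respect to d2.
        W = np.exp(-d2 / (2.0 * sigma ** 2)) / (2.0 * sigma ** 2)
        np.fill_diagonal(W, 0.0)
        L = np.diag(W.sum(axis=1)) - W        # weighted graph Laplacian

        # The weighted least-squares step decouples across features.
        for k in range(p):
            D = np.diag(mask[:, k].astype(float))
            # The tiny ridge only guards against a singular system.
            A = D + lam * L + ridge * np.eye(n)
            U[:, k] = np.linalg.solve(A, D @ Xz[:, k])
    return U
```

Because the assumed penalty is non-convex, the result depends on the initialization and on sigma and lam; the values above are placeholders rather than recommended settings. Cluster labels can be read off by grouping rows of the returned array that lie within a small tolerance of each other.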
S. Poddar, M. Jacob. (To be submitted)