Mathematical Foundations of Private Synthetic Data
Speaker: Roman Vershynin
Abstract:
A number of companies and researchers are trying to find a way to generate synthetic data privately, so that private information contained in the true data is protected. How should we measure the privacy and utility of synthetic data, and what is the fundamental privacy-utility tradeoff? I will describe our efforts to create differentially private synthetic data that has "universal utility". Working with such data, the user can accurately answer any "global" statistical query about the true data, even queries not originally known to the creator of the data. This leads to a host of fascinating mathematical problems in probability and combinatorics. The talk is based on a series of papers joint with March Boedihardjo, Thomas Strohmer, and others.