Session X (May 17, 10:30am-12:00pm): Sequential Design, Active Learning, and Bayesian Optimization, organized by Qiong Zhang
Title: Subsampling for Big Rare Events Data Beyond Binary Responses
Speaker: HaiYing Wang, UConn
Abstract: Rare events data arise when certain types of events occur with very small probabilities. Subsampling is very effective at reducing the computational cost of analyzing rare events data without losing significant estimation efficiency. Existing investigations of subsampling with rare events data focus on binary response models. We investigate rare events data beyond binary responses. If sufficient data points are sampled from the non-rare observations, there is no loss of statistical efficiency. For the scenario in which downsampling does cause a loss of estimation efficiency, we develop an optimal sampling design to minimize the information loss.