Projects

Exploration in Reinforcement Learning

Joint work with Chen-Yu Wei, Liyu Chen, Haipeng Luo, Hiteshi Sharma, Rahul Jain and Ashutosh Nayyar

Reinforcement learning studies the problem of learning to interact with an unknown environment while maximizing cumulative reward. The learner faces a fundamental exploration-exploitation trade-off: should it explore the environment to gather information for future decisions, or exploit the information already available to maximize reward? We have proposed several algorithms with provable guarantees for efficient exploration. Here is a list of publications on this topic: [C3, C4, C5, C6, C8, P1, P2]

Adversarial Examples in Deep Learning

Joint work with Tasmin Chowdhury, Hsin-Tai Wu and Sayandev Mukherjee

Small adversarial perturbations of images can mislead advanced classifiers such as AlexNet and VGG. We proposed Permutation Phase Defense (PPD), a novel method to resist adversarial attacks. PPD combines a random permutation of the image with the phase component of its Fourier transform. The basic idea is to make adversarial defense analogous to symmetric cryptography, where security relies solely on safekeeping of the keys. In PPD, safekeeping of the selected permutation ensures effectiveness against adversarial attacks. [C1][code]
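The preprocessing described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the released code: the function name ppd_transform, the integer key used to seed the permutation, the order of operations (permute first, then take the Fourier phase) and the 28x28 stand-in image are all assumptions made for the example.

```python
import numpy as np

def ppd_transform(image, key):
    """Permute pixels with a secret key, then keep only the 2D Fourier phase.

    Hypothetical helper for illustration; `key` plays the role of the secret.
    """
    h, w = image.shape
    rng = np.random.default_rng(key)       # the key seeds the secret permutation
    perm = rng.permutation(h * w)          # same key -> same permutation
    shuffled = image.reshape(-1)[perm].reshape(h, w)
    spectrum = np.fft.fft2(shuffled)       # 2D discrete Fourier transform
    return np.angle(spectrum)              # phase only; magnitude is discarded

# Stand-in grayscale image (assumption: real inputs would be dataset images).
image = np.random.default_rng(0).random((28, 28))
features = ppd_transform(image, key=1234)
```

In a sketch like this, the classifier would be trained and evaluated on the phase features rather than on raw pixels, and an attacker who does not know the key cannot reproduce the same permutation.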

Recommender Systems

Joint work with Milad Marvian and Amirhossein Mohajerin Ariaei

Consider a Q&A portal (e.g., Stack Overflow, Quora) with registered experts who answer questions. Each expert has a descriptive summary of their skills as well as some tags about their domain of expertise. The challenge is to match a given question to the experts who are most likely to answer it. We developed a similarity-based algorithm and applied recent recommender-system techniques, such as ranking-based and popularity-based matrix factorization, to place in the top 10% of the Bytecup International Machine Learning Competition.

Scheduling in Healthcare

Joint work with Naumaan Nayyar and Rahul Jain

Scheduling surgeries in operating rooms is challenging because surgery durations are uncertain. A tight schedule leads to long waiting times for patients, while a loose schedule causes underutilization of resources. For an efficient schedule, one needs to determine both the sequence of surgeries and the appointment times so as to minimize operating room idle time and service time delay. Theoretically, we proved that no indexing policy (e.g., least variance first) yields the optimal sequence. However, we developed a practical heuristic algorithm for large hospitals (more than 20 rooms) that improves on the current practice of hospitals by 70%. [J1]
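To make the trade-off concrete, here is a minimal Monte Carlo sketch for a single room. It is not the heuristic from [J1]. The function name expected_cost, the Gaussian duration model, the rule of setting appointment times to cumulative mean durations, the equal weighting of idle and waiting time, and the example numbers are all assumptions chosen for illustration.

```python
import random

def expected_cost(surgeries, n_sims=2000, seed=0):
    """Estimate the mean total (idle + waiting) minutes for one room.

    surgeries: list of (mean, std) durations in minutes, in scheduled order.
    Appointment times are cumulative mean durations (an assumption).
    """
    rng = random.Random(seed)
    starts = [0.0]
    for mean, _ in surgeries[:-1]:
        starts.append(starts[-1] + mean)
    total = 0.0
    for _ in range(n_sims):
        free_at = 0.0  # time at which the room becomes free
        for (mean, std), appt in zip(surgeries, starts):
            idle = max(0.0, appt - free_at)   # room waits for the appointment
            wait = max(0.0, free_at - appt)   # patient waits for the room
            begin = max(appt, free_at)
            free_at = begin + max(0.0, rng.gauss(mean, std))
            total += idle + wait
    return total / n_sims

surgeries = [(60, 30), (90, 5), (45, 15)]       # hypothetical (mean, std) pairs
lvf = sorted(surgeries, key=lambda s: s[1])     # least variance first ordering
cost_original = expected_cost(surgeries)
cost_lvf = expected_cost(lvf)
```

Comparing cost_original and cost_lvf for different inputs gives a feel for how much the sequence matters. As noted above, an index rule such as least variance first is not guaranteed to be optimal.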