I am currently a Data Science Fellow at UCSD, affiliated with the CSE department, the ITA, and the Qualcomm Institute.
Prior to this, I was a visiting Research Scientist at Facebook Reality Labs, exploring the intersection of Distributed Computing, Social Networks, and Augmented Reality. In June 2018, I graduated with my PhD in Computer Science & Engineering from the University of Washington (UW). My advisor was Paul Beame, and I was a part of both the Theory and MISL groups. I also spent time at Microsoft Research, working with the DNA Storage and the Theory groups. Previously, I obtained a BS from the Department of Computer Science at the University of Illinois Urbana-Champaign (UIUC).
My research explores the algorithmic foundations of fundamental data science problems. Recently, I've been very interested in adversarial robustness, edit distance, trace reconstruction, and explainable AI. Over the past few years, I have also worked on efficient clustering, similarity search, graph algorithms, and parameter estimation. Overall, my goal is to provide deep insight into complex tasks, invoking diverse tools from theoretical computer science, statistics, and discrete mathematics.
For more information, take a look at my Publications. See also the UCSD Machine Learning Blog for expository articles about my papers and more.
I am on the 2020-2021 academic job market! Please contact me if you have questions or offers!
News
(Jan 2021) Happy new year! Our paper on reconstructing trees from traces has been substantially updated with new figures and algorithms, and it has been officially accepted to the Annals of Applied Probability!
(Dec 2020) New paper on Approximate Trace Reconstruction, where we study the complexity of approximately reconstructing a string in edit distance, given a small number of samples from the deletion channel. Check it out for new upper and lower bounds!
(Nov 2020) Fresh paper on a theoretical analysis showing how to print synthetic DNA as efficiently as possible, with some nice upper and lower bounds based on non-trivial concentration inequalities and combinatorial arguments about random walks.
(Nov 2020) My third deep learning paper: showing how to improve linkage-based hierarchical clustering by using variational autoencoders with a mixture model prior!
(Nov 2020) My second deep learning paper concerning a new type of generalization that we call Close Category Generalization. We study a potentially controversial but effective variant of generalization based on outputting the class of the nearest neighbor when faced with out-of-distribution data.
(Oct 2020) New survey with lots of new models and open questions: Trace Reconstruction Problems in Computational Biology, to appear in the Special Issue of IEEE Transactions on Information Theory Dedicated to the Memory of Vladimir I. Levenshtein.
(Aug 2020) Check out my new animated video about vector matrix vector queries! Watch the first minute and you'll be hooked!
(June 2020) Ever wondered what you could learn about matrix M by querying <u,Mv> for different vectors u,v? See our new paper for algorithms and lower bounds!
(May 2020) We started a new UCSD Machine Learning Blog. Stay tuned for many interesting posts this summer!
(April 2020) Two papers accepted to ISIT 2020! The first on making DNA storage more efficient, and the second on covering codes for insertions and deletions (see below).
(March 2020) My first deep learning paper! On the connection between robustness and local smoothness for separated data (e.g., images). Accepted to NeurIPS 2020.
(March 2020) Very excited about our new paper on Explainable k-means and k-medians clustering, taking a stab at explainable AI with provable guarantees!
(March 2020) New paper on Distributed All-Pairs Similarity Search, with real experiments on real data, handling skew (e.g., popular YouTube videos or Tweets) better than previous approaches! Accepted to WWW 2020.
(Nov 2019) Yet another new paper on Covering Codes for Insertions and Deletions, where we provide new upper and lower bounds for an asymmetric covering problem regarding words of different lengths. Accepted to IEEE Transactions on Information Theory.
(Nov 2019) New paper on the Equivalence of Systematic Linear Data Structures and Matrix Rigidity, including new cell probe lower bounds for the vector-matrix-vector problem. Accepted to ITCS 2020.
(Oct 2019) New paper on Edge Isoperimetric Inequalities for Powers of the Hypercube, including a new bound on the Kleitman-West Problem.
(Oct 2019) New paper on Adversarial Examples for kNN and Random Forests. Accepted to AISTATS 2020.
Contact
Atkinson Hall #4111
University of California, San Diego
9500 Gilman Dr, La Jolla, CA 92093