2023 TCS Job Market

This page lists Theoretical Computer Science (TCS) job candidates: PhD students expecting to graduate by September 1, 2023, current postdocs, and current faculty.

Listed Research Areas (use "find" to search by area): 

Algorithms

Coding & Information Theory

Complexity

Computational Geometry

Cryptography

Economics & Game Theory

Machine Learning Theory

Parallel & Distributed Computing

Privacy & Security

Quantum Computing

Listed Roles of Interest:

Faculty, Postdoc, Industry research, Industry non-research


PDF File of candidates 

Maryam Aliakbarpour

Research Areas: Algorithms; Machine Learning Theory; Privacy & Security

PhD Institution/year: Massachusetts Institute of Technology (MIT) 09/2020 

Current Position: Postdoc, Boston University & Northeastern University 


Preferred Role(s): Faculty 

Keywords: Statistical inference, Property testing, Streaming algorithms 


Links: Website   CV  DBLP  Google Scholar 



Brief Bio 

Maryam Aliakbarpour is a postdoctoral researcher at Boston University and Northeastern University, where she is hosted by Prof. Adam Smith and Prof. Jonathan Ullman. Before that, she was a postdoctoral research associate at the University of Massachusetts Amherst, hosted by Prof. Andrew McGregor. In Fall 2020, she was a visiting participant in the Probability, Geometry, and Computation in High Dimensions Program at the Simons Institute at Berkeley. Maryam received her Ph.D. from MIT, where she was advised by Prof. Ronitt Rubinfeld.

Research Summary 

The vast amount of digital data we collect has revolutionized many scientific and industrial fields. Despite our success in harnessing this transformative power of data, computational and societal trends emerging from current data-science practice (e.g., privacy concerns, data too large to fit in memory) necessitate upgrading our toolkit for data analysis. My research focuses on designing algorithms with provable guarantees that answer foundational statistical questions about the population from which the data is drawn, while addressing these modern challenges. The questions I work on are fundamental abstractions of real-world data-analysis applications, for example estimating parameters of the data distribution (e.g., the mean) and learning this distribution. Another thread of my work revolves around property testing of data distributions, in which we aim to determine whether a distribution has a certain property or is "far" from any distribution with that property. An important example of a statistical property is a correlation in the data, with questions such as "Is there a significant link between anxiety and social media use?" Within this theme, I have explored the following lines of questions:

- Can we analyze sensitive data without compromising the privacy of the individuals in that data? I have developed private testers for several properties of distributions with optimal data usage.

- How do restrictions on the available computational resources, such as time and memory, impact our ability to learn or estimate? I have developed an algorithm that estimates the entropy of a distribution with limited memory, using a nearly optimal number of data points from the distribution.

- How do we overcome the statistical challenges arising from sequential, non-i.i.d. data, such as snapshots of an evolving social network? I have developed an algorithm that estimates the parameter of the fundamental balls-and-bins randomized process using only a snapshot of the final state of the bins.
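The last example can be made concrete with a toy sketch (purely illustrative and hypothetical; it is not her estimator): throw n balls into m bins uniformly at random, observe only the final loads, and estimate n from the fraction of empty bins, which concentrates around e^(-n/m).

import math
import random

def snapshot_estimate(num_balls, num_bins, seed=0):
    """Throw num_balls into num_bins uniformly, then estimate num_balls
    from the final bin loads alone (a single snapshot of the process)."""
    rng = random.Random(seed)
    loads = [0] * num_bins
    for _ in range(num_balls):
        loads[rng.randrange(num_bins)] += 1
    # The fraction of empty bins concentrates around exp(-num_balls / num_bins),
    # so inverting that relation gives a plug-in estimate of num_balls.
    frac_empty = sum(1 for load in loads if load == 0) / num_bins
    return -num_bins * math.log(frac_empty)

print(snapshot_estimate(num_balls=50_000, num_bins=100_000))  # roughly 50,000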


Zongchen Chen

Research Areas: Algorithms; Machine Learning Theory

PhD Institution/year: Georgia Institute of Technology 08/2021 

Current Position: Postdoc, Massachusetts Institute of Technology 


Preferred Role(s): Faculty; Industry research (full-time) 

Keywords: MCMC methods; sampling and counting; learning and testing 


Links: Website   CV  DBLP  Google Scholar 



Brief Bio 

I am an instructor (postdoc) in Mathematics at MIT. I am also involved with the Collaboration on the Theoretical Foundations of Deep Learning. Before that, I received my PhD degree in Algorithms, Combinatorics and Optimization (ACO) from the School of Computer Science at Georgia Tech in 2021. I was very fortunate to be advised by Eric Vigoda. I received my BS degree in Mathematics & Applied Mathematics from Zhiyuan Honor College at Shanghai Jiao Tong University in 2016. I have broad interests in randomized algorithms, discrete probability, and machine learning. Currently, my research focuses on Markov chain Monte Carlo (MCMC) methods for sampling from Gibbs distributions, and machine learning problems related to undirected graphical models.

Research Summary 

In many scientific settings, we use a statistical model to describe the dependencies among a set of variables. Such a model can be represented by a graph whose nodes correspond to the variables of the system and whose edges connect pairs of variables that interact; these models are known as graphical models. Graphical models arise in a wide variety of scientific fields and, more recently, have been applied extensively in machine learning and data science, for example as restricted Boltzmann machines and Bayesian networks.

One fundamental task in the study of graphical models is to simulate the equilibrium state of the system; more specifically, to sample from the corresponding Gibbs distribution of the model. The Markov chain Monte Carlo (MCMC) method is a standard and popular approach to sampling from the equilibrium distribution: in every step, the algorithm randomly updates the current configuration under appropriate rules so that its distribution eventually converges to the desired Gibbs distribution. Sampling is also an important subroutine for many other advanced applications of graphical models, such as performing inference or fitting parameters.

Despite the wide use of graphical models and MCMC methods, in practice they are often applied without theoretical justification, and many fundamental questions remain open even for basic models. My research goal, at a high level, is to understand the mathematical nature of these probabilistic models, provide strong convergence results for MCMC methods, and apply these structural properties to other problems such as parameter learning and model fitting. More specifically, my research focuses on the following topics: (1) developing universal tools for proving optimal convergence of MCMC; (2) designing efficient sampling algorithms when MCMC fails; (3) establishing connections among sampling, learning, and testing for high-dimensional distributions.
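To make the MCMC step concrete, here is a minimal sketch of Glauber dynamics for the Ising model, one of the simplest Gibbs distributions on a graph (an illustrative example only, assuming a simple adjacency-list representation; it is not code from his papers): at each step a uniformly random vertex is chosen and its spin is resampled conditioned on its neighbors.

import math
import random

def glauber_ising(neighbors, beta, steps, seed=0):
    """Glauber dynamics for the Ising model on a graph.

    neighbors: adjacency list, neighbors[v] = vertices adjacent to v.
    beta: inverse temperature; the Gibbs distribution is
          pi(sigma) proportional to exp(beta * sum over edges (u,v) of sigma_u * sigma_v).
    """
    rng = random.Random(seed)
    n = len(neighbors)
    spins = [rng.choice([-1, 1]) for _ in range(n)]
    for _ in range(steps):
        v = rng.randrange(n)                             # pick a uniformly random vertex
        field = sum(spins[u] for u in neighbors[v])
        p_plus = 1.0 / (1.0 + math.exp(-2.0 * beta * field))
        spins[v] = 1 if rng.random() < p_plus else -1    # resample spin given neighbors
    return spins

# Example: a 4-cycle at moderate temperature.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(glauber_ising(cycle, beta=0.4, steps=10_000))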

Albert Cheu

Research Areas: Algorithms; Privacy & Security

PhD Institution/year:  Northeastern University 5/2021

Current Position:  Postdoc, Georgetown University


Preferred Role(s): Faculty; Postdoc

Keywords: Differential privacy; distributed protocols


Links:  Website    CV    DBLP   Google Scholar



Brief Bio


Albert grew up in New York City and attended Stuyvesant High School. He earned an undergraduate degree at New York University and a PhD at Northeastern University, under the supervision of Jonathan Ullman. He spent a semester at the Simons Institute during the Data Privacy program (2019).

Albert has two sisters and two nephews. Because he has no pets, he has enough free time for creative writing. He has some understanding of spoken Spanish and Cantonese.


Research Summary:


I study an important problem in statistics and machine learning (ML): how can we learn about populations without violating the privacy of individuals? For example, a website analyst wants to count a video's views but they should respect the privacy of a user's view history. The analyst could add Gaussian noise to the count and send only the noised count for further processing. The noise masks whether a given user watched the video. This property is differential privacy (DP). Other DP algorithms exist for a great variety of ML tasks.
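A minimal sketch of that counting example, using the textbook Gaussian mechanism (the function and parameter names are illustrative; the noise scale below is the standard calibration for (epsilon, delta)-DP when epsilon is below 1):

import math
import random

def dp_view_count(view_bits, epsilon, delta, seed=None):
    """Release a noisy view count with (epsilon, delta)-differential privacy.

    view_bits: one 0/1 entry per user ("did this user watch the video?").
    Changing one user's bit changes the count by at most 1 (sensitivity 1),
    so Gaussian noise with the standard deviation below suffices.
    """
    rng = random.Random(seed)
    true_count = sum(view_bits)
    sigma = math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon  # sensitivity = 1
    return true_count + rng.gauss(0.0, sigma)

views = [random.randint(0, 1) for _ in range(10_000)]
print(sum(views), dp_view_count(views, epsilon=0.5, delta=1e-6))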


Although an analyst can claim to run a DP algorithm, users may not trust them. I design protocols where users do not directly hand over their data to the analyst but instead communicate via a shuffler. This service randomizes the order of their messages. In the counting example, each user can send their bit ("did I watch the video?") flipped with small probability. The analyst only observes a noisy & anonymized set of bits; I proved this suffices for DP. This is a *shuffle protocol* and I made others for more complex ML tasks like high-dimensional sums and histograms.
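The bit-flipping protocol described above is essentially randomized response routed through a shuffler. Here is a toy sketch of the pipeline, with hypothetical names and a standard debiasing step (not code from his papers):

import random

def local_randomizer(bit, flip_prob, rng):
    """Each user flips their true bit with probability flip_prob before sending it."""
    return bit ^ 1 if rng.random() < flip_prob else bit

def shuffle_protocol(true_bits, flip_prob=0.1, seed=0):
    """Users randomize locally, the shuffler discards order, the analyst debiases."""
    rng = random.Random(seed)
    messages = [local_randomizer(b, flip_prob, rng) for b in true_bits]
    rng.shuffle(messages)            # the shuffler: the analyst never sees who sent what
    n = len(messages)
    noisy_count = sum(messages)
    # E[noisy_count] = count*(1 - p) + (n - count)*p, so invert that relation.
    # The flip probability controls the privacy/accuracy trade-off.
    return (noisy_count - n * flip_prob) / (1.0 - 2.0 * flip_prob)

true_bits = [random.randint(0, 1) for _ in range(100_000)]
print(sum(true_bits), round(shuffle_protocol(true_bits)))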


Not all my results are so positive: I have found that DP without trusted analysts can incur high costs. Suppose we want to know the most common attribute in a population ("which video is most likely to be shared?"). Any shuffle protocol needs exponentially more samples than a DP algorithm run by a trusted analyst, which translates to lost time and money. I recently showed the same is true for *multi-server* protocols, where users speak directly with two servers and suspect one is corrupt but do not know which.


Looking forward, I want to expand our knowledge of DP without trusted analysts. What can we do with simple communication patterns and practical cryptographic tools like the shuffler? Is there a family of protocols that solves many ML problems with low bandwidth and few samples? How much do users' devices need to compute? I hope to answer these questions with the help of researchers in academia and industry.

William Hoza

Research Areas: Complexity

PhD Institution/year: University of Texas at Austin 08/2021 

Current Position: Postdoc, Simons Institute for the Theory of Computing at the University of California, Berkeley 


Preferred Role(s): Faculty; Industry research (full-time) 

Keywords: pseudorandomness; derandomization 


Links: Website  DBLP  Google Scholar 



Brief Bio 

I grew up near Olympia, WA. I spent four wonderful years as an undergraduate student at the California Institute of Technology, where I majored in Mathematics and Computer Science. I attended the University of Texas at Austin for graduate school. I had the good fortune to be advised by David Zuckerman; I got my Ph.D. in 2021. Now I'm working in Avishay Tal's group at UC Berkeley as a "Simons-Berkeley Postdoctoral Fellow." My two-year appointment will end in the summer of 2023. I'm happily married and I have a delightful nine-month-old daughter.

Research Summary 

I'm interested in the role of randomness in computing. Random bits can be thought of as a scarce computational resource, just like time or space. It is therefore desirable to understand when randomness is necessary for efficient computation and when we can get rid of it without significant penalties. I'm especially interested in *unconditional* derandomization.

A powerful approach for conserving random bits is to use a *pseudorandom generator* (PRG). With collaborators, I've designed new, efficient PRGs that provably fool interesting models of computation such as threshold circuits and De Morgan formulas (Hatami, Hoza, Tal, and Tell, FOCS 2021), bounded-depth read-once formulas (Doron, Hatami, and Hoza, CCC 2019 and CCC 2020), and unbounded-width read-once permutation branching programs (Hoza, Pyne, and Vadhan, ITCS 2021). In each case, either the seed length of the PRG is near-optimal, or else significantly improving the seed length would require a breakthrough in circuit lower bounds. Going forward, I plan to continue designing and analyzing unconditional PRGs that fool various important models of computation.

One of my main research objectives is to prove that L = BPL, meaning that every randomized decision algorithm can be fully derandomized without increasing its space complexity by more than a constant factor. The L vs. BPL problem is the primary subject of my Ph.D. dissertation. It is the motivation behind much of my PRG research, but I also study more "creative" approaches to proving L = BPL that circumvent the challenge of constructing better PRGs. For example, with collaborators, I have developed new constructions and applications of certain relaxations of PRGs such as *hitting set generators* (Hoza and Zuckerman, SICOMP 2020; Cheng and Hoza, ToC 2022; Bogdanov, Hoza, Prakriya, and Pyne, CCC 2022) and *weighted PRGs* (Hoza, RANDOM 2021). I plan to continue working on L vs. BPL using a broad variety of techniques and approaches. I'm also interested in circuit lower bounds and other areas in complexity theory.
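For readers less familiar with pseudorandomness, the standard definition of "fooling" used above is the following (stated for reference with generic notation, not taken from his papers): a generator $G \colon \{0,1\}^s \to \{0,1\}^n$ with seed length $s$ $\varepsilon$-fools a class $\mathcal{C}$ of functions if, for every $f \in \mathcal{C}$,

\[ \Bigl| \Pr_{x \sim \{0,1\}^s}\bigl[f(G(x)) = 1\bigr] \;-\; \Pr_{y \sim \{0,1\}^n}\bigl[f(y) = 1\bigr] \Bigr| \le \varepsilon. \]

In other words, $s$ truly random bits stretched through $G$ are indistinguishable from $n$ truly random bits as far as $\mathcal{C}$ can tell, and the goal is to make the seed length $s$ as small as possible.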

Victor Reis

Research Areas: Algorithms

PhD Institution/year: University of Washington 06/2023 

Current Position: Student (graduating), University of Washington 


Preferred Role(s): Postdoc; Industry research (full-time); Industry research (postdoc) 

Keywords: Discrepancy theory; convex geometry; approximation algorithms 


Links: Website  CV  DBLP  Google Scholar 



Brief Bio 

I'm a final year PhD student in the theory group at the University of Washington and have the honor to be advised by Thomas Rothvoss. My work has mostly focused on discrepancy theory, developing new algorithms to balance vectors in several classes of convex bodies. Before coming to UW, I earned my B.A. in Math and CS from Cornell, where I worked with Bobby Kleinberg and David Williamson. Outside of research, I enjoy organizing programming contests at UW and coaching competitors, having advanced to the ICPC World Finals twice as a contestant (2015, 2016) and three times as a coach (2017, 2018, 2022). I'm from Brazil and enjoy practicing Brazilian jiu-jitsu as a blue belt.

Research Summary 

I am broadly interested in algorithmic questions in high-dimensional geometry. My work during my PhD has mostly focused on balancing vectors in convex bodies and its applications. In my first paper (with Thomas Rothvoss), I showed the existence of good spectral partial colorings: given vectors $v_1, \dots, v_m \in \mathbb{R}^n$ with $\sum_{i=1}^m v_i v_i^\top \preceq I_n$, there is an algorithm to compute $x \in [-1,1]^m$ with a linear number of coordinates in $\pm 1$ satisfying $\|\sum_{i=1}^m x_i v_i v_i^\top\|_{\mathrm{op}} \lesssim \sqrt{\frac{n}{m}}$, which yields a new algorithm for linear-size spectral sparsification.

I have since (with Thomas Rothvoss) found tight partial colorings for balancing vectors in $\ell_p$ balls with respect to the $\ell_q$ norm for all $p, q$, extending Spencer's theorem (the case $p = q = \infty$). I have made progress (with Daniel Dadush and Haotian Jiang) towards understanding matrix discrepancy in Schatten norms, generalizing Spencer's theorem to block-diagonal matrices, and, very recently, towards yet another generalization of Spencer's theorem, improving the vector balancing constant for all zonotopes. I have also shown a higher-order version of Pisier's inequality for vector-valued functions on the hypercube, using tools from real-rooted polynomials, which led to the first learning algorithm for bounded-degree-$d$ functions on the hypercube $\{\pm 1\}^n$ with $o(n^d)$ queries.

Most of my research in vector balancing has had the theme of showing measure lower bounds for convex bodies, using tools from probability, optimization, and convex geometry such as random walks, covering numbers, mirror descent, and log-concavity. Moving forward, I hope to develop new tools to attack some long-standing conjectures in the field, such as zonotope sparsification with a linear number of segments and the matrix Spencer conjecture.
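For context, the case $p = q = \infty$ referred to above is Spencer's "six standard deviations suffice" theorem. In vector-balancing form (stated here for reference, with generic notation): given $v_1, \dots, v_n \in [-1,1]^n$, there exist signs $x \in \{\pm 1\}^n$ such that

\[ \Bigl\| \sum_{i=1}^{n} x_i v_i \Bigr\|_{\infty} \le K\sqrt{n} \]

for a universal constant $K$ (with $K = 6$ in the classical set-system formulation, where each $v_i \in \{0,1\}^n$), whereas uniformly random signs only guarantee $O(\sqrt{n \log n})$.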

João Ribeiro

Research Areas: Coding & Information Theory; Cryptography

PhD Institution/year: Imperial College London 06/2021 

Current Position: Postdoc, Carnegie Mellon University 


Preferred Role(s): Faculty; Postdoc; Industry research (full-time); Industry research (postdoc) 

Keywords: coding theory; cryptography; pseudorandomness; randomness extraction 


Links: Website  CV  DBLP  Google Scholar 


Brief Bio 

I am currently a Postdoctoral Fellow in the Computer Science Department at Carnegie Mellon University, working with Vipul Goyal and Venkatesan Guruswami. I received my PhD from the Department of Computing at Imperial College London, where I was advised by Mahdi Cheraghchi. During my PhD, I also spent some time at CQT (hosted by Divesh Aggarwal), UIUC (hosted by Olgica Milenkovic), and UMich (hosted by Mahdi Cheraghchi). Before that, I received an MSc in Computer Science from ETH Zurich and a BSc in Applied Mathematics and Computation from Instituto Superior Técnico - University of Lisbon.


Research Summary 

My research focuses on coding theory and pseudorandomness along with the application of tools from these fields to tackle problems in cryptography and reliable data storage. I have studied several problems in cryptography where codes and other pseudorandom objects, such as randomness extractors, play a key role. These include leakage-resilient and non-malleable cryptography, where we aim to design cryptographic schemes resilient against physical attacks which allow an adversary to learn noisy information about private components (such as secret keys) or to tamper with their values. My co-authors and I have designed improved schemes with such properties for important cryptographic tasks, such as secret sharing and privacy amplification.

Besides using randomness extractors as important tools in my research, I have also explored various facets of randomness extraction itself. These include designing improved extractors for multiple independent weak sources of randomness with additional properties relevant for cryptographic applications, such as non-malleability, improved extractors for low-complexity sources, and distributed algorithms for generating shared randomness among many mutually distrusting parties in various adversarial settings -- an important resource in cryptography.

Motivated by recent developments in DNA-based data storage systems, I have studied reliable coding against errors causing loss of synchronization between sender and receiver, such as deletions and insertions. Most standard approaches, which are tailored towards bit-flips and erasures, fail when applied to such errors. Therefore, correcting deletions and insertions requires combining standard coding theory with new insights.
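A tiny illustration (not from his work) of why synchronization errors defeat approaches tailored to bit-flips: a single substitution changes one position, while a single deletion shifts every later symbol, so position-wise comparisons break down.

def hamming_distance(a, b):
    """Position-wise disagreements; only meaningful for equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

original = "1011001110100101"

flipped = original[:7] + ("0" if original[7] == "1" else "1") + original[8:]
print(hamming_distance(original, flipped))     # 1: a bit-flip is a purely local change

deleted = original[:7] + original[8:]          # delete the symbol at position 7
realigned = deleted + "?"                      # pad so a position-wise comparison is possible
print(hamming_distance(original, realigned))   # large: the whole suffix is misaligned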

Vikrant Singhal

Research Areas: Algorithms; Machine Learning Theory; Privacy & Security

PhD Institution/year: Northeastern University 08/2021 

Current Position: Postdoc, University of Waterloo 


Preferred Role(s): Faculty; Postdoc; Industry research (full-time); Industry research (postdoc) 

Keywords: Differential privacy; machine learning; learning theory; robustness; statistics; algorithms 


Links: Website  DBLP  Google Scholar 



Brief Bio 

I am a postdoc at the University of Waterloo, supervised by Gautam Kamath. My research broadly lies within theoretical computer science, with a primary focus on privacy-preserving data analysis, in particular differential privacy (DP) [DMNS06]. My work has involved determining both upper and lower bounds for many important, fundamental problems in private statistical estimation. (1) To develop private learning algorithms, I have used numerous tools from areas like statistics and learning theory. (2) For hardness results, I have contributed to strengthening popular existing ideas in the DP literature and have solved multiple open problems on proving lower bounds in private estimation, which also directly implied the optimality of many of the upper bounds that I showed.

Research Summary 

To summarise my contributions in the field of private statistical estimation: I have worked on fundamental problems such as mean estimation for various families of distributions [KSU20, KLSU19], covariance estimation for high-dimensional Gaussians [KLSU19, KMSSU22], subspace estimation (a sub-problem of principal component analysis, or PCA) under distributional assumptions [SS21], and parameter estimation for mixtures of Gaussians [KSSU19]. These problems are fundamental in statistics and learning, but had not been systematically explored under privacy constraints. Besides studying them in the standard DP setting, I have also investigated the role of public data in private estimation, for example in estimating high-dimensional Gaussians and mixtures of Gaussians [BKS22]. Additionally, I have worked on strengthening known techniques for proving lower bounds in DP and using them to prove new results for folklore problems [KMS22]. I wish to continue this line of work to make further advances in private statistics and learning, while exploring new directions such as user-level DP. I would also like to explore other practical problems in DP that could make existing work more readily implementable or directly deployable for machine learning purposes.

Nicole Wein

Research Areas: Algorithms

PhD Institution/year: MIT 09/2021 

Current Position: Postdoc, DIMACS, Rutgers University 


Preferred Role(s): Faculty 

Keywords: graph algorithms; fine-grained complexity; dynamic algorithms; parameterized algorithms 


Links: Website DBLP  Google Scholar 



Brief Bio 

I am a Simons Postdoctoral Leader at DIMACS at Rutgers University working with Sepehr Assadi, Aaron Bernstein, and Martin Farach-Colton. I graduated with my PhD in 2021 from MIT, where I was advised by Virginia Vassilevska Williams. I also have a Masters in Computer Science from Stanford University and a B.S. in Computer Science/Math from Harvey Mudd College.

Research Summary 

One of my central research goals is to understand classical graph problems in the context of modern datasets with large-scale or evolving data.

To illustrate my research on large-scale datasets, consider the following example. Given a large graph, we would simply like to compute the single largest distance in the graph (the diameter). Interestingly, the best-known way to compute the diameter of a graph is to perform a brute-force computation of every distance in the graph. Such a computation can be prohibitively slow for massive graphs. Instead, we would like to perform very fast computation (ideally in linear time) while still gaining useful information. To do so, we may need to settle for an approximate answer instead of an exact one. One example of my work is establishing optimal time vs. accuracy trade-offs for diameter and related problems (conditioned on well-established hypotheses in theoretical computer science). More generally, my work asks the following question:

Question 1. What graph problems can be approximated very efficiently?

My research in the context of evolving data focuses on dynamic algorithms, where the goal is to maintain the solution to a problem as the underlying data changes. The trivial solution to any dynamic problem is to rerun the algorithm from scratch every time the data changes. Again, this could be prohibitively expensive, especially for large datasets. Thus, the central question in the area of dynamic algorithms is:

Question 2. Can we solve problems on evolving data without rerunning the algorithm from scratch after every change?

I approach both of the above questions from the standpoint of both possibility and impossibility. That is, I develop both algorithms and (conditional or structural) lower bounds.
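As a concrete instance of the exact-versus-fast trade-off described above, the sketch below contrasts the brute-force diameter computation (a BFS from every vertex) with a linear-time 2-approximation (a single BFS, whose largest distance always lies between half the diameter and the diameter). The graph representation and names are illustrative.

from collections import deque

def bfs_eccentricity(adj, source):
    """Largest shortest-path distance from source (graph assumed connected)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())

def exact_diameter(adj):
    """Brute force: one BFS per vertex, roughly quadratic time on sparse graphs."""
    return max(bfs_eccentricity(adj, s) for s in adj)

def approx_diameter(adj):
    """One BFS from an arbitrary vertex: the answer is within a factor 2 of the diameter."""
    return bfs_eccentricity(adj, next(iter(adj)))

path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}  # a 10-vertex path
print(exact_diameter(path), approx_diameter(path))  # exact: 9; approximation: at least 5 (here 9)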

Lydia Zakynthinou

Research Areas: Machine Learning Theory; Privacy & Security

PhD Institution/year: Northeastern University 07/2023 

Current Position: Student (graduating), Northeastern University 


Preferred Role(s): Postdoc; Industry research (full-time); Industry research (postdoc) 

Keywords: differential privacy; generalization; robustness; algorithmic stability 


Links: Website  CV  DBLP  Google Scholar 



Brief Bio 

I am a PhD student at the Khoury College of Computer Sciences at Northeastern University, advised by Huy Nguyen and Jonathan Ullman. I received my Electrical and Computer Engineering diploma from the National Technical University of Athens (2015) and my MS in Logic, Algorithms, and Theory of Computation from the National and Kapodistrian University of Athens (2017). During my studies in Greece, I was advised by Dimitris Fotakis. I interned at IBM Research during the summers of 2019 and 2020, working with Thomas Steinke on the generalization properties of learning algorithms, and at Apple Research during summer 2022, working with Audra McMillan on the shuffle model of differential privacy. My research has been supported by a Facebook Fellowship (2020-2022) and by a Khoury PhD Research Award.

Research Summary 

My research lies in the theoretical foundations of machine learning and data privacy and is focused on two threads.

Differentially private data analysis: Differential privacy (DP) is a rigorous mathematical notion that has been widely adopted as the standard criterion for individuals' privacy by academia, industry, and government. Thus far, my collaborators and I have designed DP algorithms for fundamental tasks which are sample-efficient under assumptions on the data distribution; examples include testing and learning the mean of a Gaussian distribution and learning a large-margin halfspace classifier. My drive in this area is the pursuit of general principles that are broadly applicable, as they form the basis of a more widespread understanding of private statistics. I am particularly interested in establishing formal connections between DP and robust statistics. The connection between these different types of algorithmic stability is an opportunity to import results from one field to the other, and I also find it exciting to study on a conceptual level.

Generalization: I study conditions under which learning algorithms generalize well, meaning that their performance on the training sample and on unseen data is close. There is a variety of diverse reasons why an algorithm might generalize well. Thomas Steinke and I proposed a new stability notion that unifies a seemingly incomparable set of these conditions. My approach in this area has been inspired by DP, and I believe new techniques from DP will continue to advance the study of generalization. I am interested in determining the generalization gap of practical learning algorithms, but also in establishing a deeper understanding of the connections between this diverse set of stability notions (with a framework that incorporates a wider variety of them), which will help us gain better insight into good practices for algorithm design and inform the right choice of algorithm for a given learning task.