Series on adversarially robust streaming algorithms

Mar 16, 2023: Vladimir Braverman (Rice University)
"Adversarial Robustness of Streaming Algorithms through Importance Sampling"

Abstract: Robustness against adversarial attacks has recently been at the forefront of algorithmic design for machine learning tasks. In the adversarial streaming model, an adversary gives an algorithm a sequence of adaptively chosen updates u1, ... ,un as a data stream. The goal of the algorithm is to compute or approximate some predetermined function for every prefix of the adversarial stream, but the adversary may generate future updates based on previous outputs of the algorithm. In particular, the adversary may gradually learn the random bits internally used by an algorithm to manipulate dependencies in the input. This is especially problematic as many important problems in the streaming model require randomized algorithms, as they are known to not admit any deterministic algorithms that use sublinear space. In this paper, we introduce adversarially robust streaming algorithms for central machine learning and algorithmic tasks, such as regression and clustering, as well as their more general counterparts, subspace embedding, low-rank approximation, and coreset construction. For regression and other numerical linear algebra related tasks, we consider the row arrival streaming model. Our results are based on a simple, but powerful, observation that many importance sampling-based algorithms give rise to adversarial robustness which is in contrast to sketching based algorithms, which are very prevalent in the streaming literature but suffer from adversarial attacks. In addition, we show that the well-known merge and reduce paradigm in streaming is adversarially robust. Since the merge and reduce paradigm allows coreset constructions in the streaming setting, we thus obtain robust algorithms for k-means, k-median, k-center, Bregman clustering, projective clustering, principal component analysis (PCA) and non-negative matrix factorization. To the best of our knowledge, these are the first adversarially robust results for these problems yet require no new algorithmic implementations. Finally, we empirically confirm the robustness of our algorithms on various adversarial attacks and demonstrate that by contrast, some common existing algorithms are not robust.

This is a joint work with Avinatan Hassidim, Yossi Matias, Mariano Schain, Sandeep Silwal, Samson Zhou. 

This result has appeared in NeurIPS 2021.

Feb 09, 2023: Amit Chakrabarti (Dartmouth)
"How to color your adversary's graph"

Abstract: An n-vertex graph with maximum degree D is (D+1)-colorable: an almost trivial combinatorial result, with an equally simple greedy algorithm to produce a (D+1)-coloring. However, given a stream of edges of such a graph, can we maintain a valid (D+1)-coloring as the edges arrive, while using not much more than O(n) space? What if the edges are chosen by an adversary who can look at our current coloring and add additional edges to try to confound us? This is the newly-popular setting of adversarially robust streaming algorithms and this talk is about the coloring problem in this setting.


We obtain upper and lower bound results for this problem. In O(n polylog n) space, we can maintain an O(D^(5/2))-coloring of such an adversarial graph. On the other hand, every adversarially robust coloring algorithm under the same space limitation must spend Omega(D^2) colors. We in fact prove more general. results that trade off the space usage against the color budget.  One interesting by-product of our work is that in combination with the celebrated Assadi-Chen-Khanna algorithm (SODA 2019), it provides the first separation between randomized and deterministic algorithms for the (ordinary, non-robust) streaming graph coloring problem.


Based on joint works [C.-Ghosh-Stoeckl] and [Assadi-C.-Ghosh-Stoeckl].

Dec 14, 2022: Omri Ben-Eliezer (MIT)
"Robust sampling and online learning"

Abstract: Random sampling is a fundamental primitive in modern algorithms, statistics and machine learning, used as a generic method to obtain a small yet “representative” subset of the data. In this talk we shall explore to what extent random sampling is robust to adaptive inputs in a streaming setting, proposing and studying a model for sampling against a white-box adaptive adversary (where future inputs generated by the adversary are allowed to depend on the current internal state of the sampler). I will demonstrate scenarios where the adaptive sample complexity can be much larger than in the static world, but argue that these scenarios are not entirely realistic. I will then reveal a deep connection between our setting and online learning. As it turns out, the sample complexity in our sampling setting is captured by a sequential version of Rademacher complexity, a notion that is also known to capture the regret in online classification problems. Leveraging this connection, one can bound the sample complexity using the Littlestone dimension of the relevant concept class. The obtained bounds are tight in many regimes, and lead us to resolve a classical open problem on optimal regret bounds in online learning.

Based on joint works with Noga Alon, Yuval Dagan, Shay Moran, Moni Naor, and Eylon Yogev.

Omri-robust-sampling.pdf

Nov 12, 2022: Edith Cohen (Google Research / Tel-Aviv University)
"On Robustness to Adaptive Inputs: A Case Study of CountSketch"

Abstract: The performance of randomized algorithms can be considered both in adaptive settings, where inputs may depend on prior outputs of the algorithm, and in non-adaptive settings. Adaptive settings arise when the algorithm is a component of a system with feedback or when an adversary attempts to construct an input on which the algorithm fails. The last decade saw impressive progress that included the development of black-box methods of obtaining algorithms that are robust to adaptive inputs from non-robust counterparts, most notably using differential privacy, and established lower bounds through "attacks." But intriguing questions remain on the practical implications of adaptivity and potential to develop tailored algorithms for specific problems and "nicer" inputs. In my talk I will review key ideas from the literature and then present our recent work that explored these questions for CountSketch, a popular dimensionality reduction method (joint with Xin Lyu, Jelani Nelson, Tamas Sarlos, Moshe Shechner, and Uri Stemmer).

Oct 05, 2022: David Woodruff (CMU)
"Adversarially Robust Streaming Algorithms"

Abstract: A streaming algorithm is given a sequence of items and seeks to compute or approximate some function of this sequence using a small amount of memory. A body of work has been developed over the last two decades, resulting in optimal streaming algorithms for a number of problems. I will start by surveying some of these problems. I will then investigate the adversarial robustness of streaming algorithms. An algorithm is considered robust if its performance guarantees hold even if the stream is chosen adaptively by an adversary that observes the outputs of the algorithm along the stream and can react in an online manner. While deterministic streaming algorithms are inherently robust, many central problems do not admit sublinear-space deterministic algorithms; on the other hand, space-efficient randomized algorithms for these problems are generally not adversarially robust. This raises the question of whether there exist efficient adversarially robust (randomized) streaming algorithms for these problems. I will survey work showing for a number of streaming problems, one can obtain algorithms that are adversarially robust with a small overhead in their memory requirements.

adversarySurvey.pdf

See more talks from the FDS Virtual Talk series...