The Communication Complexity of Set Intersection and Multiple Equality Testing

In this paper we explore fundamental problems in randomized communication complexity such as computing Set Intersection on sets of size k and Equality Testing between vectors of length k. Sağlam and Tardos [FOCS 2013] and Brody et al. [Algorithmica 2016] showed that for these types of problems, one can achieve optimal communication volume of O(k) bits, with a randomized protocol that takes O(log^* k) rounds. They also proved that this is one point along the optimal round-communication trade-off curve.

Aside from rounds and communication volume, there is a third parameter of interest, namely the error probability p_{err}, which we write 2^{-E}. It is straightforward to show that protocols for Set Intersection or Equality Testing need to send at least Ω(k+E) bits, regardless of the number of rounds. Is it possible to simultaneously achieve optimality in all three parameters, namely O(k+E) communication and O(log^* k) rounds?

In this paper we prove that there is no universally optimal algorithm, and we complement the existing round-communication trade-offs [FOCS 2013, Algorithmica 2016] with a new trade-off between rounds, communication, and probability of error. In particular, any protocol for solving Multiple Equality Testing in r rounds with failure probability p_{err}=2^{-E} has communication volume Ω(Ek^{1/r}). We present several algorithms for Multiple Equality Testing (and its variants) that match or nearly match our lower bound and the lower bound of [FOCS 2013, Algorithmica 2016]. Lower bounds on Equality Testing extend to Set Intersection, for every r, k, and p_{err} (which is trivial); in the reverse direction, we prove that upper bounds on Equality Testing for r, k, p_{err} imply similar upper bounds on Set Intersection with parameters r+1, k, and p_{err}.

Our original motivation for considering p_{err} as an independent parameter came from the problem of enumerating triangles in distributed (CONGEST) networks having maximum degree Δ. We prove that this problem can be solved in O(Δ / log n + log log Δ) time with high probability 1-1/poly(n). This beats the trivial (deterministic) O(Δ)-time algorithm and is superior to the Õ(n^{1/3}) algorithm of [SODA 2019, PODC 2019] when Δ=Õ(n^{1/3}).