Research

My research interests center on the privacy, scalability, security, and reliability of distributed systems. The applications may differ, but the goal remains the same: to study the theoretical and fundamental limits of innovative storage and computing systems and to design codes that achieve those limits. My current work is motivated by federated learning, private and secure distributed computing, distributed storage, DNA-based storage systems, and private and secure network coding. A brief description of each topic is given next.

Federated Learning

A tremendous amount of data is generated every minute by a vast collection of electronic devices owned by different users. Federated learning emerged as a natural mechanism to process and analyze the users' data without moving it to a central server. A federator asks the users to run a gradient descent algorithm on their local data and to share the resulting gradient with the federator at every iteration. The challenges of federated learning include reducing the exorbitant communication cost, maintaining the privacy of the processed data, and ensuring robustness against users dropping out and against malicious users actively trying to corrupt the computation. Those challenges can be tackled by using gradient compression techniques and by designing new secret sharing schemes tailored to this specific problem; understanding the fundamental limits that arise is key to designing the best such schemes.
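As a minimal illustration of one such mechanism, the sketch below implements pairwise additive masking, a classical building block of secure aggregation: every pair of users agrees on a shared random mask that cancels in the sum, so the federator learns only the aggregate gradient and not any individual one. The field size and gradient dimensions are assumptions made for this toy example.

```python
import random

P = 2**31 - 1  # prime modulus; an illustrative choice for this sketch


def mask_gradients(gradients):
    """Pairwise additive masking: each pair (i, j) shares a random mask
    that user i adds and user j subtracts, so masks cancel in the sum."""
    n = len(gradients)
    d = len(gradients[0])
    masked = [list(g) for g in gradients]
    for i in range(n):
        for j in range(i + 1, n):
            # in practice users i and j would derive this mask from a
            # shared key; here we just draw it directly
            r = [random.randrange(P) for _ in range(d)]
            for k in range(d):
                masked[i][k] = (masked[i][k] + r[k]) % P
                masked[j][k] = (masked[j][k] - r[k]) % P
    return masked


def aggregate(masked):
    """The federator sums the masked gradients; the masks cancel."""
    d = len(masked[0])
    return [sum(m[k] for m in masked) % P for k in range(d)]


# toy example: three users with 2-dimensional gradients
grads = [[1, 2], [3, 4], [5, 6]]
assert aggregate(mask_gradients(grads)) == [9, 12]
```

Each masked gradient on its own is uniformly random, yet their sum equals the sum of the true gradients; handling dropped-out users on top of this is exactly where the tailored secret sharing schemes mentioned above come in.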



Private and Secure Coded Computing

A parameter server (referred to as the main node or master node) wants to offload intensive computations on sensitive data to cheap commodity nodes (referred to as worker nodes) to parallelize and speed up the computation. Two main challenges arise in this setting: 1) mitigating the effect of slow or unresponsive worker nodes, known as stragglers; and 2) maintaining the privacy of the sensitive data while offloading computations to the workers. When the computation is a linear or polynomial function of the input data, those challenges can be tackled by adding redundancy and using tools from coding theory and secret sharing. However, adding redundancy artificially increases the computational load of the workers. Hence the need to understand the limits on the redundancy required to tolerate a given number of stragglers and to design schemes that match those limits. We tackle those challenges for several types of computations and for homogeneous and heterogeneous time-varying computing systems.
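A minimal sketch of the secret sharing side of this setting, using Shamir's classical scheme over a prime field (the prime and the parameters are illustrative): the data is hidden in the constant term of a random polynomial, any k of the n shares suffice to reconstruct it, and so the master can tolerate up to n - k stragglers while any k - 1 colluding workers learn nothing.

```python
import random

P = 2**61 - 1  # a prime field size; an illustrative choice for this sketch


def share(secret, k, n):
    """Split `secret` into n Shamir shares; any k of them reconstruct it."""
    # random polynomial of degree k-1 with the secret as constant term
    coeffs = [secret] + [random.randrange(P) for _ in range(k - 1)]
    return [(x, sum(c * pow(x, e, P) for e, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]


def reconstruct(shares):
    """Lagrange interpolation at x = 0 from any k shares."""
    secret = 0
    for x_i, y_i in shares:
        num, den = 1, 1
        for x_j, _ in shares:
            if x_j != x_i:
                num = num * (-x_j) % P
                den = den * (x_i - x_j) % P
        # pow(den, P - 2, P) is the modular inverse since P is prime
        secret = (secret + y_i * num * pow(den, P - 2, P)) % P
    return secret


shares = share(42, k=3, n=5)        # tolerates up to 2 stragglers
assert reconstruct(shares[:3]) == 42
assert reconstruct(shares[2:]) == 42
```

For linear computations, workers can apply the computation directly to their shares and the master interpolates the result; the redundancy cost discussed above is exactly the gap between n and k.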



Distributed Stochastic Gradient Descent

In this setting, a parameter server (referred to as the main node or master node) wants to offload the computations required to run a machine learning algorithm on sensitive data to cheap commodity nodes (referred to as worker nodes). The focus of this research direction is to mitigate the effect of stragglers without resorting to redundancy. The main computational burden of machine learning algorithms is evaluating the gradient of a loss function on the input data. Under mild assumptions, stragglers can simply be ignored, at the cost of slower convergence of the machine learning algorithm. Hence the need to understand the effect of the number of ignored stragglers on the convergence of the algorithm and to design mechanisms that gradually decrease the straggler tolerance to reduce the overall runtime. Further, assigning tasks to all workers and then ignoring the stragglers incurs extra costs. It is thus more efficient to assign tasks only to the subset of workers deemed to be the fastest. We tackle the challenge of learning the speed of the workers while assigning them useful computations and keeping the overhead of the learning process small.
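A toy sketch of the straggler-ignoring step: the server averages the gradients of only the `wait_for` fastest workers, and the round finishes as soon as the last of those workers does. The function names and the timing model are hypothetical, chosen only to make the tradeoff concrete: a smaller `wait_for` shortens the round but averages fewer gradients.

```python
def sgd_round(worker_grads, times, wait_for):
    """One round of distributed SGD that ignores stragglers.

    worker_grads : list of gradient vectors, one per worker
    times        : completion time of each worker for this round
    wait_for     : number of fastest workers the server waits for
    Returns the averaged gradient and the round's wall-clock time.
    """
    # sort workers by completion time and keep the fastest `wait_for`
    order = sorted(range(len(times)), key=lambda i: times[i])
    fastest = order[:wait_for]

    d = len(worker_grads[0])
    avg = [sum(worker_grads[i][k] for i in fastest) / wait_for
           for k in range(d)]

    # the round ends when the slowest of the awaited workers finishes
    round_time = times[order[wait_for - 1]]
    return avg, round_time


# three workers; the server waits for the two fastest (workers 0 and 2)
avg, t = sgd_round([[2.0, 0.0], [4.0, 0.0], [6.0, 0.0]],
                   times=[1.0, 3.0, 2.0], wait_for=2)
assert avg == [4.0, 0.0]
assert t == 2.0
```

A schedule that gradually increases `wait_for` over the iterations mirrors the mechanisms described above: loose straggler tolerance early for speed, tighter tolerance later for accurate convergence.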



Privacy and Security in Distributed Storage

With cloud storage and large-scale data centers, data storage is becoming distributed by nature. Despite its great benefits, distributed storage is prone to node failures and adversarial attacks. Node failure is now the norm rather than the exception. Traditional error-correcting codes used to recover from node failures incur huge communication overheads. To reduce the communication overhead, regenerating codes have been proposed. Those codes provide a tradeoff between the storage overhead (redundancy) and the communication overhead. However, regenerating codes are a dynamic family of codes that repair node failures sequentially. Hence, if care is not taken, a malicious attacker jamming a single node in the system can corrupt the whole stored data through what is known as a pollution attack. In addition, guaranteeing strong information-theoretic privacy of the stored data requires the use of secret sharing schemes, which are equivalent to traditional error-correcting codes. We study the interplay between the storage overhead, the communication overhead, and the achievable privacy and security (against malicious nodes) guarantees of private and secure distributed storage schemes. A crucial observation we make here is that, in contrast to traditional secret sharing, novel secret sharing schemes can be designed to encode sparse input data into sparse output codewords if the privacy requirement is relaxed. We study this fundamental tradeoff between privacy and sparsity, which reduces the storage overhead of the scheme.
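The contrast with traditional secret sharing can be seen in a toy example: additive 2-out-of-2 sharing of a sparse vector produces shares that are uniformly random, hence dense, which is exactly the storage cost that sparsity-preserving schemes aim to avoid. The field size and vector length are illustrative assumptions.

```python
import random

P = 2**31 - 1  # prime modulus; an illustrative choice for this sketch


def additive_shares(vec):
    """Traditional 2-out-of-2 additive secret sharing of a vector:
    one share is uniformly random, the other is (vec - r) mod P."""
    r = [random.randrange(P) for _ in vec]
    return r, [(v - ri) % P for v, ri in zip(vec, r)]


# a 2-sparse input vector of length 100
sparse = [0] * 98 + [5, 7]
s1, s2 = additive_shares(sparse)

# both shares together recover the data ...
assert [(a + b) % P for a, b in zip(s1, s2)] == sparse

# ... but each share alone is uniformly random, so its density
# (fraction of nonzero entries) is close to 1 with high probability
density = sum(x != 0 for x in s1) / len(s1)
```

Perfect privacy forces the shares to look uniform, destroying the input's sparsity; relaxing the privacy requirement is what opens the door to the sparse codewords studied above.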



Insertion/Deletion-Correcting Codes

Traditional error-correcting codes such as Reed-Solomon codes tackle the correction of substitution errors, which arise in almost all current storage media. Due to the exponential increase in the amount of stored data, researchers and practitioners are seeking a novel dense and sustainable long-term storage medium. DNA-based storage is emerging as a competitive candidate that satisfies the required properties. Many practical and theoretical challenges must be addressed before DNA-based storage becomes viable. From the theoretical perspective, the errors that arise in DNA-based storage are a mix of insertions, deletions, and substitutions. The study of codes correcting insertions and deletions dates back to the 1960s, when they were introduced to correct synchronization errors, and has now regained attention due to DNA-based storage. We study the fundamental limits of codes able to correct insertions and deletions in d-dimensional arrays and construct such codes. We also investigate the correction of insertions and deletions in many particular settings of interest.
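As a concrete classical example, the sketch below works with the Varshamov-Tenengolts (VT) codes from the 1960s, the standard single-deletion-correcting binary codes: a codeword must satisfy a weighted checksum, and one deletion can be corrected here by brute-force reinsertion, since the VT property guarantees the recovered codeword is unique.

```python
def vt_syndrome(x):
    """Varshamov-Tenengolts checksum: sum of i * x_i (1-indexed) mod (n+1)."""
    return sum((i + 1) * b for i, b in enumerate(x)) % (len(x) + 1)


def vt_codewords(n, a=0):
    """All binary words of length n whose VT checksum equals a."""
    return [w for w in ([(m >> i) & 1 for i in range(n)]
                        for m in range(2 ** n))
            if vt_syndrome(w) == a]


def correct_deletion(y, n, a=0):
    """Recover a VT codeword of length n from y, which lost one bit.

    Brute force: try every single-bit insertion and keep the one whose
    checksum matches. The VT property ensures at most one codeword can
    produce y by a single deletion, so the answer is unique.
    """
    for pos in range(n):
        for bit in (0, 1):
            cand = y[:pos] + [bit] + y[pos:]
            if vt_syndrome(cand) == a:
                return cand
    return None


x = [1, 1, 0, 1, 0, 0]          # checksum 1 + 2 + 4 = 7 ≡ 0 (mod 7)
assert vt_syndrome(x) == 0
assert correct_deletion(x[:1] + x[2:], 6) == x   # delete the 2nd bit
```

The same membership-checking logic extends, with much more intricate constructions, to the d-dimensional arrays and mixed insertion/deletion/substitution settings mentioned above.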



Private and Secure Network Coding

Network coding consists of using error-correcting codes for reliable data transmission through communication networks. The errors that arise in this setting can be corrected by shifting from the Hamming metric considered in traditional error-correcting codes to the rank metric, which views codewords as arrays and measures the rank of the error. We consider private and secure network codes that allow reliable transmission of private data over networks corrupted by an attacker who can eavesdrop on and jam a subset of the communication links. We leverage the limitations of the adversary to guarantee higher transmission rates through the use of subspace codes.
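A small sketch of the rank metric over GF(2): the distance between two binary matrices is the rank of their difference (which over GF(2) is their entrywise XOR), computed here with a standard XOR-basis elimination. The matrix sizes are illustrative.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a binary matrix given as lists of 0/1 entries,
    computed by maintaining an XOR basis keyed by leading-bit position."""
    basis = {}  # leading-bit position -> basis vector (as an integer)
    for r in rows:
        v = int("".join(map(str, r)), 2)
        while v:
            lead = v.bit_length() - 1
            if lead not in basis:
                basis[lead] = v
                break
            v ^= basis[lead]   # eliminate the leading bit and continue
    return len(basis)


def rank_distance(A, B):
    """Rank-metric distance: the rank of A - B (= A XOR B over GF(2))."""
    diff = [[a ^ b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
    return gf2_rank(diff)


# an error hitting a full row differs in two entries (Hamming weight 2)
# but has rank only 1, so the rank metric counts it as a single error
assert rank_distance([[1, 0], [0, 1]], [[1, 0], [1, 1]]) == 1
assert gf2_rank([[1, 0, 0], [0, 1, 0], [0, 0, 1]]) == 3
```

This is precisely why the rank metric suits network errors: an adversary jamming one link corrupts many symbols at once, yet the resulting error matrix has low rank.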