Recent Research

My primary research interests lie in determining the fundamental limits of communication network architectures, designing algorithms that have provably good performance, and instantiating these algorithms and measuring their performance on reconfigurable hardware platforms. I am also interested in the interplay between economics and technology as observed in large-scale network systems composed of many competing agents. I have studied caching in content delivery networks and peer-to-peer networks, scheduling and resource allocation in wireless networks, as well as new game theoretical methods for agent coordination and resource pricing.

I am a recipient of the Defense Threat Reduction Agency Young Investigator Award (2009) and the NSF CAREER Award (2012), as well as research awards from Cisco (2008) and Google (2010). I also received the Dept. of ECE Outstanding Professor Award (2013) and was selected as a TEES (College of Engineering) Select Young Faculty Fellow (2014) at Texas A&M University.

My Google Scholar profile is available here.

Below, I highlight some research problems that I am currently working on.

Figure 1. Software reconfigurable network

Caching and Content Distribution

More than half of the traffic carried on the Internet goes through content distribution networks, with some estimates putting the fraction as high as 85%. A major fraction of this traffic is in the form of streaming stored videos. We seek a systematic and principled approach to designing coordination methods in content distribution systems. Our contributions have been in the space of hybrid P2P and CDN approaches, content streaming algorithms, and content caching. Below, we discuss a representative problem that I have worked on recently.

Accuracy vs. Mixing Time of Caching Algorithms

Typical analysis of content caching algorithms using the metric of hit probability under a stationary request process does not account for performance loss under a variable request arrival process. In our work, we consider adaptability of caching algorithms from two perspectives: (a) the accuracy of learning a fixed popularity distribution; and (b) the speed of learning items’ popularity.

We first consider a genie-aided algorithm that has knowledge of the true popularity ranking. Given a cache of size M, the genie-aided algorithm simply places the top M most popular items in the cache to maximize the hit probability. We can think of the state of the cache as the vector of items it holds, denoted by x, and we call the fixed (optimal) state engendered by the genie-aided algorithm c*. We compute the distance between c* and the stationary distributions of several popular algorithms, such as LRU, k-LRU, LRU(m), RANDOM, and ARC, as well as that of our own algorithm, which we call Adaptive LRU (A-LRU).

We then characterize the mixing time of each algorithm, i.e., the time needed to attain the stationary distribution, which we use as a measure of learning efficiency. We combine both measures to obtain a "learning error" that captures both how quickly and how accurately an algorithm learns the optimal caching distribution:

e_A(t) = d(π_A, c*) + d(μ_A(t), π_A),

where π_A is the stationary distribution of cache states under algorithm A, μ_A(t) is the distribution of cache states after t requests, d(·,·) is a distance between probability distributions, and t is the total number of requests seen by algorithm A. The first term on the right measures the distance between the stationary distribution of algorithm A and c*, giving a measure of the accuracy of learning. The second term is the error due to the time lag of learning (efficiency), which is governed by the mixing time of the algorithm. Our main analytical contribution is to obtain quantitative bounds on the learning error.
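
As a rough illustration of these two ingredients (and not the formal definitions used in the paper), the following Python sketch simulates an LRU cache under an assumed Zipf popularity model, computes the genie-aided state c* as the top-M items, and tracks how far the empirical per-item occupancy is from c* as the number of requests grows. The Zipf parameters, the occupancy-marginal proxy for the full state distribution, and the normalization are all illustrative choices.

```python
import numpy as np
from collections import OrderedDict

rng = np.random.default_rng(0)

def zipf_popularity(n, alpha=0.8):
    """Assumed Zipf(alpha) popularity over n items, most popular first."""
    w = 1.0 / np.arange(1, n + 1) ** alpha
    return w / w.sum()

def simulate_lru(popularity, cache_size, n_requests, snapshot_every=1000):
    """Simulate an LRU cache under i.i.d. requests drawn from `popularity`.

    Returns, at each snapshot time t, a proxy for the learning error: the
    normalized L1 gap between the empirical per-item occupancy (averaged over
    the first t requests) and the genie-aided indicator of the top-M items.
    """
    n = len(popularity)
    c_star = np.zeros(n)
    c_star[np.argsort(popularity)[::-1][:cache_size]] = 1.0

    cache = OrderedDict()          # keys = item ids, in LRU order
    occupancy = np.zeros(n)        # running sum of in-cache indicators
    errors = []
    requests = rng.choice(n, size=n_requests, p=popularity)
    for t, item in enumerate(requests, start=1):
        if item in cache:
            cache.move_to_end(item)          # hit: refresh recency
        else:
            cache[item] = True               # miss: insert new item
            if len(cache) > cache_size:
                cache.popitem(last=False)    # evict least recently used
        snap = np.zeros(n)
        snap[list(cache.keys())] = 1.0
        occupancy += snap
        if t % snapshot_every == 0:
            gap = np.abs(occupancy / t - c_star).sum() / (2 * cache_size)
            errors.append((t, gap))
    return errors

if __name__ == "__main__":
    p = zipf_popularity(500, alpha=0.8)
    for t, err in simulate_lru(p, cache_size=50, n_requests=20000, snapshot_every=5000):
        print(f"t={t:6d}  occupancy gap ≈ {err:.3f}")
```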

Our approach yields insights into the value of the different dimensions of caching shown in Figure 2. Most conventional caching algorithms, such as LRU, RANDOM, and FIFO, have been designed and analyzed for a single (isolated) cache, as shown in Figure 2(a) (left). Newer caching algorithms either divide the memory block into two or more levels, with a hierarchical algorithm such as LRU(m) attempting to ensure that more popular content items are cached at the higher levels (Figure 2(b) (left)), or use a meta-cache that stores only content identities, so that popularity can be learned without spending memory on caching the actual data items (Figure 2(c) (left)).
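
To make the meta-cache idea concrete, here is a minimal Python sketch in the spirit of 2-LRU: an item's identity must already be present in a virtual LRU of identities before the item's data is admitted to the real cache. The sizes, the `fetch` callback, and the exact admission rule are illustrative assumptions rather than the precise algorithm analyzed in the paper.

```python
from collections import OrderedDict

class TwoLRU:
    """Minimal sketch of the 2-LRU idea: a meta-cache of item identities
    gates admission into the data cache, so one-off requests cannot evict
    items that have been seen repeatedly."""

    def __init__(self, cache_size, meta_size):
        self.cache = OrderedDict()   # item id -> data (real cache)
        self.meta = OrderedDict()    # item id -> True (identities only)
        self.cache_size = cache_size
        self.meta_size = meta_size

    def request(self, item, fetch):
        if item in self.cache:                       # data hit
            self.cache.move_to_end(item)
            return self.cache[item], True
        data = fetch(item)                           # miss: fetch from origin
        if item in self.meta:
            # Seen recently before: admit the data to the real cache.
            self.meta.pop(item)
            self.cache[item] = data
            if len(self.cache) > self.cache_size:
                self.cache.popitem(last=False)
        else:
            # First sighting: remember only the identity.
            self.meta[item] = True
            if len(self.meta) > self.meta_size:
                self.meta.popitem(last=False)
        return data, False
```

A request for a never-before-seen item thus costs only a meta-cache entry, which is how the popularity of items can be learned without wasting cache memory on their data.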

Figure 2. Left: Dimensions of caching. Middle: Learning errors of single-level algorithms. Right: Learning errors of multi-level algorithms.

The learning error of these algorithms as a function of the number of requests received is illustrated in Figure 2 (middle) for single-level caching algorithms and Figure 2 (right) for multi-level caching algorithms, with the y-axis on a logarithmic scale. We see immediately that FIFO and RANDOM have a higher learning error than the other algorithms regardless of the number of requests, corresponding to the smallest stationary hit probabilities; this explains why their performance is poor no matter how long they are trained. The learning error of LRU decreases quickly at first and then levels off, whereas 2-LRU and LRU(m) decay more slowly but reach a lower eventual error. This corresponds to faster mixing for LRU but poorer eventual accuracy (τ-distance) compared to 2-LRU and LRU(m). ARC performs well initially, but it too levels off at an error larger than that of 2-LRU. CLIMB eventually performs well, but its decay rate is much slower. Finally, A-LRU, which uses the architecture shown in Figure 2(d) (left) to dynamically adjust the allocation made to the different levels, attains the lowest learning error with the smallest number of requests.

The full paper is available here.

Smartphones and Wireless Networking

Current wireless edge networks, with tightly coupled PHY/MAC layers that cater to worst- or average-case performance, lack the agility to best serve legions of heterogeneous applications. Simultaneously, software reconfigurable infrastructure has become increasingly mainstream, to the point that per-packet and per-flow decisions can be dynamically controlled at multiple layers of the communications stack. Our contributions have been in the space of device-to-device (D2D) networks on the Android platform, auction-based scheduling, and enabling ultra-low latency wireless networks. Below, we discuss an example of work on platforms and implementation that I have worked on recently.

Reconfigurable Wireless Networks

The goal of our work is to design, develop and demonstrate FlowBazaar, an open market-based approach to create a value chain from the end-user of an application on one side, to algorithms operating over reconfigurable infrastructure on the other, so as to enable an ecosystem wherein disparate applications are able to obtain necessary resources for optimal performance.

Our vision for FlowBazaar is to create an agile communications paradigm at this last-hop wireless edge that accounts for the entire chain of value progression illustrated in Figure 3. Per-packet mechanisms employed at the millisecond timescale at the MAC layer result in a certain Quality of Service (QoS), defined as a vector of statistical connection properties [throughput, delay, jitter, drop rate]. Selection between available mechanisms on a per-flow basis is enabled by a QoS policy operating at a larger timescale (seconds); the QoS policy thus determines which flows or packet aggregates receive which available QoS. The end-user perception of a particular QoS depends on the application being used; for instance, the impact of latency and loss rate on YouTube is unlike that on Web browsing. The relation between QoS and perceived performance is the application Quality of Experience (QoE), defined as an application-specific map from the received QoS vector to an element of the set {0, 1, 2, 3, 4, 5}, with 0 indicating the lowest and 5 the highest satisfaction. Yet end-users might have contrasting priorities for applications, and an application that one user finds important might be irrelevant to another. Thus, there must be a means for articulating end-user value on a per-application basis.

The FlowBazaar architecture is illustrated in Figure 4, in which the different elements of the architecture are shown in a color-coded manner. The three main units of our system are: (i) an off-the-shelf Wi-Fi access point running the OpenWrt operating system, fitted with custom antennas; (ii) a centralized controller hosted on a Linux workstation; and (iii) multiple wireless stations (Windows/Linux/Android) enabled with our Client Middleware and running four applications: Skype, YouTube, Web browsing, and (large) file download.

Figure 3. Chain of value progression.

Per-Packet Mechanisms (blue tiles): At the level of data packets, we develop and utilize two layers of software defined infrastructure, namely (i) custom-made reconfigurable antennas, and (ii) reconfigurable queueing.

QoS Policy (orange tiles): Policy decisions that result in different QoS vectors via mechanism selection are made at a centralized controller that communicates using the OpenFlow protocol. We use a custom set of messages meant for Layer 1 and Layer 2 functionalities (labeled L1 Flow and L2 Flow, respectively).

Figure 4. FlowBazaar Architecture

Application QoE (beige tiles): A smart middleware layer at the clients is used to interface with our system, so that applications (such as YouTube or Skype) do not need to be aware of its existence. Machine learning tools are used offline to determine the map between QoS and QoE on a per-application basis, creating lookup tables.
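
The sketch below illustrates what such a lookup might look like: a measured QoS vector is mapped to a QoE score by finding the nearest entry in a per-application table. The table entries, the nearest-neighbour rule, and the normalization are hypothetical stand-ins for the offline-learned maps, not the actual tables used in FlowBazaar.

```python
import numpy as np

# Hypothetical per-application lookup table learned offline: each row is a
# QoS vector [throughput (Mbps), delay (ms), jitter (ms), drop rate] and the
# corresponding QoE score in {0,...,5} observed for that operating point.
QOE_TABLE = {
    "youtube": (
        np.array([[5.0,  30.0,  5.0, 0.00],
                  [2.0,  80.0, 15.0, 0.01],
                  [0.5, 200.0, 40.0, 0.05]]),
        np.array([5, 3, 1]),
    ),
}

def qoe_lookup(app, qos, table=QOE_TABLE):
    """Map a measured QoS vector to a QoE score via the nearest entry in the
    (normalized) lookup table for the given application."""
    points, scores = table[app]
    scale = points.max(axis=0) - points.min(axis=0) + 1e-9
    dists = np.linalg.norm((points - qos) / scale, axis=1)
    return int(scores[np.argmin(dists)])

print(qoe_lookup("youtube", np.array([1.8, 90.0, 12.0, 0.01])))  # -> 3
```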

End-user Value (pink tiles): Clients are offered feasible QoS vectors under a market framework. The decision on which N flows to admit to the high-priority queue is taken via an (N+1)th-price auction using a local currency (a token allowance), conducted every 10 seconds. The Client Middleware can either solve a forward-looking value function (an MDP) to determine its bid, or bid myopically.
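
A minimal sketch of the admission auction follows, assuming the standard rule that the N highest bidders win and each pays the (N+1)th-highest bid; the flow names and token values are made up for illustration.

```python
def nplus1_price_auction(bids, n_slots):
    """Sketch of an (N+1)th-price auction for admitting flows to the
    high-priority queue: the n_slots highest bidders win, and each pays the
    (n_slots+1)th highest bid (0 if there is no losing bid).
    `bids` maps flow id to its declared value in tokens."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winners = [flow for flow, _ in ranked[:n_slots]]
    price = ranked[n_slots][1] if len(ranked) > n_slots else 0
    return winners, price

winners, price = nplus1_price_auction(
    {"skype": 7, "youtube": 5, "web": 3, "ftp": 1}, n_slots=2)
print(winners, price)   # ['skype', 'youtube'] 3
```

Because a winner's payment does not depend on its own bid, this charging rule is what makes truthful value declaration the natural strategy for the middleware.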

The novelty of this design is that it enables a platform for experimentation with custom policies and configurations at the MAC and PHY (electromagnetic) layers, using the sample statistics as feedback to inform reconfiguration decisions. The auction enables us to elicit true value declaration from the middleware, ensuring that the flows selected for prioritization are the ones that truly need higher QoS.

We compared the performance of three policies, namely no differentiation (vanilla), FlowBazaar with the MDP-based auction, and greedy maximization of system utility, in the presence of different types of traffic. We present only results pertaining to YouTube on this page, in Figure 5. Each group of bars shows the average QoE value (right y-axis) along with the QoS values (left y-axis). Both the winner of the MDP-based auction and the greedy approach experience much better QoE than the no-differentiation (vanilla) case. The winners in the MDP-based auction experience far fewer rebuffering events, and QoE is correlated with the fraction of such events.

The full paper is available here.

Figure 5. YouTube performance statistics.

Game Theory, Markets and Multiagent Systems

Game theory is a powerful tool for studying coordination problems in large-scale distributed systems. The agents in these systems could be humans or algorithms trying to maximize their individual value, and they often have some common information that provides knowledge about one another. A certain collective behavior could emerge from such self-interested decision making, and the question is whether this outcome is preferable from a system perspective. We have been working on several problems of coordination and market design, with a recent focus on the theory and application of mean field games (MFG).

EnergyCoupon: Theory and Practice of a Nudge System for Demand Response in a Smart Electricity Grid

We consider the general problem of resource sharing in societal networks, consisting of interconnected communication, transportation, and energy networks important to the functioning of society. In our example application, we consider a Load Serving Entity (LSE) trying to reduce its exposure to electricity market volatility by incentivizing demand response at the customer level. We assume that the LSE has a good estimate of the statistics of the wholesale price of electricity at different hours of the day, and wishes its customers to move a part of their consumption to times of low estimated prices. The LSE provides incentives via lottery tickets called "EnergyCoupons" to each customer based on the time and amount of electricity used. A lottery is held periodically using these coupons.
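
The sketch below shows one plausible way such a coupon-and-lottery scheme could be implemented; the coupon rate, the definition of savings relative to a baseline, and the single-prize lottery are illustrative assumptions rather than the exact rules used in EnergyCoupon.

```python
import numpy as np

rng = np.random.default_rng(1)

def award_coupons(usage_kwh, baseline_kwh, high_price_slots, coupons_per_kwh=10):
    """Hypothetical coupon rule: award coupons in proportion to the energy a
    customer saves, relative to their baseline, during the LSE's predicted
    high-price slots. Rates and slot definitions are illustrative."""
    saved = sum(max(baseline_kwh[s] - usage_kwh[s], 0.0) for s in high_price_slots)
    return int(coupons_per_kwh * saved)

def run_lottery(coupons_by_user):
    """Periodic lottery: each user's winning probability is proportional to
    the number of coupons they hold."""
    users = list(coupons_by_user)
    weights = np.array([coupons_by_user[u] for u in users], dtype=float)
    if weights.sum() == 0:
        return None
    return users[rng.choice(len(users), p=weights / weights.sum())]

coupons = {"alice": award_coupons({36: 0.2, 37: 0.3}, {36: 0.8, 37: 0.7},
                                  high_price_slots=[36, 37]),
           "bob": 0}
print(run_lottery(coupons))   # alice holds all coupons and wins
```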

In our theoretical model, we pose the user decision problem as a mean field game (MFG), and the incentives question as one of selecting a good mean field equilibrium (MFE). In this framework, each agent (a participant in the societal network) makes a decision based on an assumed distribution of the actions of their competitors and the incentives provided by the social planner. The system is said to be at an MFE if each agent's action is a sample drawn from the assumed distribution. We show the existence of such an MFE under general settings, and also illustrate how to choose an attractive equilibrium in the context of demand response in the (smart) electricity network.
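
The following toy computation illustrates the fixed-point structure of an MFE in a drastically simplified version of this setting: a single representative agent assumes every competitor exerts a common mean effort, best-responds to the resulting lottery odds, and the assumed mean is updated until it matches the best response. The payoff model, parameters, and point-mass action distribution are illustrative assumptions, not the model analyzed in our paper.

```python
import numpy as np

def best_response(assumed_mean, n_agents=30, prize=100.0, cost=40.0,
                  efforts=np.linspace(0.0, 1.0, 101)):
    """Best response of one agent in a stylized coupon lottery: it assumes
    every competitor exerts `assumed_mean` effort (and holds that many
    coupons), wins the prize with probability equal to its coupon share,
    and pays a quadratic effort cost."""
    others = (n_agents - 1) * assumed_mean
    win_prob = efforts / np.maximum(efforts + others, 1e-9)
    payoff = prize * win_prob - cost * efforts ** 2
    return efforts[np.argmax(payoff)]

def find_mfe(initial_mean=0.5, damping=0.5, tol=1e-4, max_iter=200):
    """Damped fixed-point iteration: the assumed mean effort is updated until
    the agent's best response reproduces it, i.e., an MFE of the toy model."""
    mean = initial_mean
    for _ in range(max_iter):
        new_mean = (1 - damping) * mean + damping * best_response(mean)
        if abs(new_mean - mean) < tol:
            return new_mean
        mean = new_mean
    return mean

print(f"equilibrium effort ≈ {find_mfe():.3f}")
```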

We designed and implemented the EnergyCoupon system to provide coupon incentives to users and collect their responses.

The system architecture shown in Figure 6 consists of five parts: three classes of functionality, namely data acquisition (blue), algorithms (green), and incentives/lottery (pink), together with an SQL database and an Android/iOS app that forms the user interface.

Figure 6. EnergyCoupon Architecture

Data Acquisition: Consists of (i) customer usage data pulled from smart meters at 15-minute resolution, (ii) demand and price data from the Electric Reliability Council of Texas (ERCOT), and (iii) weather data from Weather Underground.

Algorithms: (i) Peak-Time Estimation: machine learning techniques are used to predict high-price intervals; (ii) Baseline Estimation: past usage statistics and weather data are used to predict each customer's usage in each 30-minute interval; (iii) Personalized Tips, which are provided to each user. A simplified sketch of (i) and (ii) appears below.
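
As a simplified sketch of items (i) and (ii), the following Python snippet flags high-price 30-minute slots using a quantile threshold on a price forecast and computes a naive usage baseline from recent history. The deployed system uses machine-learned predictors and also conditions on weather, so the rules and parameters here are illustrative only.

```python
import numpy as np

def baseline_estimate(history, slot, k_days=5):
    """Naive baseline: average the customer's usage in the same 30-minute
    slot over the last k_days days. `history` is a (days x 48) array of kWh.
    (The deployed system also conditions on weather; omitted here.)"""
    return history[-k_days:, slot].mean()

def high_price_slots(price_forecast, quantile=0.9):
    """Flag the slots whose forecast wholesale price lies in the top decile;
    a stand-in for the machine-learned peak-time estimator."""
    threshold = np.quantile(price_forecast, quantile)
    return np.where(price_forecast >= threshold)[0]

# Illustrative synthetic data: 30 days of half-hourly usage and one day of prices.
history = np.abs(np.random.default_rng(2).normal(0.6, 0.2, size=(30, 48)))
prices = np.random.default_rng(3).gamma(2.0, 20.0, size=48)
print(baseline_estimate(history, slot=36), high_price_slots(prices))
```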

The experiment was run with residential participants over the summers of 2016 and 2017. A promotional YouTube video provided users with information on how to participate.

Figure 7. Behavior of active vs. inactive users.

We found that about a third of the 29 users that signed up were active (they participated in more than 5 lotteries), and such users reduced their peak demand by about 24% as shown in Figure 7.

The full paper containing the theory of mean field games is available here. The paper on the experimental trial is available here.