Indian Institute of Technology Bombay
Date: July 27, 2021
We consider a model of dynamic choice over items identified with the nodes of a graph, with positive reinforcement modeled by a transition probability proportional to an α-homogeneous (α > 0) positive function of the cumulative reward for the item, but under graphical constraints that restrict the choice at the next instant to the neighbors of the current node. The corresponding empirical process can be written as a stochastic approximation with Markov noise and has dynamics similar to those of the empirical process of a vertex-reinforced random walk. We analyze its asymptotic behavior using a limiting differential equation, both for fixed α and as α ↑ ∞ slowly, in what we dub the 'annealed dynamics'. In particular, we show convergence to a local, resp. global, maximum of a certain 'potential'. (Joint with Konstantin Avrachenkov (INRIA Sophia Antipolis), Sharayu Moharir (IIT Bombay) and Suhail Mohmad Shah (Hong Kong University of Science and Technology))
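A minimal simulation sketch of this reinforcement scheme, taking the simplest α-homogeneous choice f(x) = x^α and a hypothetical 4-node cycle with illustrative rewards (the graph, rewards, and initialization below are not from the talk):

```python
import random

def reinforced_walk(adj, rewards, alpha, steps, seed=0):
    """Graphically constrained reinforced walk: from the current node, move
    to a neighbor with probability proportional to the alpha-th power of
    that neighbor's cumulative reward (the simplest alpha-homogeneous
    function, f(x) = x^alpha). The chosen node's cumulative reward is then
    reinforced by its per-visit reward."""
    rng = random.Random(seed)
    cum = {v: 1.0 for v in adj}        # cumulative reward per node, start at 1
    visits = {v: 0 for v in adj}
    node = next(iter(adj))
    for _ in range(steps):
        nbrs = adj[node]
        weights = [cum[v] ** alpha for v in nbrs]
        node = rng.choices(nbrs, weights=weights)[0]
        cum[node] += rewards[node]     # positive reinforcement of chosen item
        visits[node] += 1
    return visits

# Cycle graph on 4 nodes; node 2 pays the highest reward and should dominate.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
rewards = {0: 0.1, 1: 0.2, 2: 1.0, 3: 0.2}
visits = reinforced_walk(adj, rewards, alpha=2.0, steps=5000)
```

The empirical frequencies `visits` play the role of the empirical process whose limiting ODE the talk analyzes.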
Columbia University
Date: July 19, 2021
We consider a novel formulation of the dynamic pricing and demand learning problem, where the evolution of demand in response to posted prices is governed by a stochastic variant of the popular Bass model with parameters (α, β) that are linked to the so-called "innovation" and "imitation" effects. Unlike the more commonly used i.i.d. demand models, in this model the price posted not only affects the demand and the revenue in the current round but also the evolution of demand, and hence the fraction of market potential that can be captured, in future rounds. Finding a revenue-maximizing dynamic pricing policy in this model is non-trivial even when model parameters are known, and requires solving for the optimal non-stationary policy of a continuous-time, continuous-state MDP. In this paper, we consider the problem in which dynamic pricing is used in conjunction with learning the model parameters, with the objective of optimizing the cumulative revenues over a given selling horizon. Our main contribution is an algorithm with a regret guarantee of O(m^{2/3}), where m is mnemonic for the (known) market size. Moreover, we show that no algorithm can incur a smaller order of loss by deriving a matching lower bound. We observe that in this problem the market size m, and not the time horizon T, is the fundamental driver of the complexity; our lower bound in fact indicates that for any fixed α, β, most non-trivial instances of the problem have constant T and large m. This insight sets the problem setting considered here uniquely apart from the MAB-type formulations typically considered in the learning-to-price literature.
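A deterministic discrete-time caricature of how posted prices shape future demand in a Bass-type model (the stochastic variant in the talk is richer): adoption intensity follows the classical Bass form (α + βF)(1 − F) on a market of size m, damped by a hypothetical price response exp(−γ·price). The damping factor γ and the price path are purely illustrative.

```python
import math

def bass_demand(alpha, beta, m, prices, gamma=1.0, dt=0.1):
    """Discrete-time sketch of Bass-type demand under posted prices.
    F is the fraction of the market of size m already captured; the
    adoption intensity (alpha + beta * F) * (1 - F) is damped by a
    hypothetical price response exp(-gamma * price)."""
    F, revenue, demand_path = 0.0, 0.0, []
    for price in prices:
        rate = (alpha + beta * F) * (1.0 - F) * math.exp(-gamma * price)
        dF = min(rate * dt, 1.0 - F)      # cannot exceed remaining market
        F += dF
        sold = m * dF
        revenue += price * sold
        demand_path.append(sold)
    return F, revenue, demand_path

# Constant price posted over 200 periods on a market of size 10,000.
F, rev, path = bass_demand(alpha=0.03, beta=0.4, m=10000, prices=[1.0] * 200)
```

Note how an early low price accelerates imitation (the βF term) and hence the demand available in later rounds — the intertemporal coupling that makes this pricing problem different from i.i.d. demand models.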
Stanford University
Date: Aug 16, 2021
We propose a new diffusion-asymptotic analysis for sequentially randomized experiments, including those that arise in solving multi-armed bandit problems. In an experiment with n time steps, we let the mean reward gaps between actions scale to the order 1/\sqrt{n} so as to preserve the difficulty of the learning task as n grows. In this regime, we show that the behavior of a class of sequentially randomized Markov experiments converges to a diffusion limit, given as the solution of a stochastic differential equation. The diffusion limit thus enables us to derive a refined, instance-specific characterization of the stochastic dynamics of adaptive experiments. As an application of this framework, we use the diffusion limit to obtain several new insights on the regret and belief evolution of Thompson sampling. We show that a version of Thompson sampling with an asymptotically uninformative prior variance achieves nearly-optimal instance-specific regret scaling when the reward gaps are relatively large. We also demonstrate that, in this regime, the posterior beliefs underlying Thompson sampling are highly unstable over time.
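To make the 1/\sqrt{n} scaling concrete, here is a sketch of Gaussian Thompson sampling on two arms whose mean gap is gap_scale/\sqrt{n}, with a flat (asymptotically uninformative) prior; all parameter values are illustrative, not those of the paper:

```python
import math
import random

def thompson_two_arms(n, gap_scale=1.0, seed=1):
    """Gaussian Thompson sampling over n rounds on two arms whose mean
    reward gap is gap_scale / sqrt(n) -- the diffusion scaling above.
    Rewards have unit variance; a flat prior gives posterior
    N(sample mean, 1/count) for each arm."""
    rng = random.Random(seed)
    mus = [0.0, gap_scale / math.sqrt(n)]   # arm 1 is better by the scaled gap
    counts, sums = [0, 0], [0.0, 0.0]
    regret = 0.0
    for _ in range(n):
        samples = []
        for a in range(2):
            if counts[a] == 0:
                samples.append(rng.gauss(0.0, 1.0))      # force exploration
            else:
                post_mean = sums[a] / counts[a]
                post_sd = 1.0 / math.sqrt(counts[a])     # flat-prior posterior
                samples.append(rng.gauss(post_mean, post_sd))
        a = 0 if samples[0] > samples[1] else 1
        reward = rng.gauss(mus[a], 1.0)
        counts[a] += 1
        sums[a] += reward
        regret += mus[1] - mus[a]
    return regret, counts

regret, counts = thompson_two_arms(n=2000, gap_scale=5.0)
```

Under this scaling the per-round sampling probabilities evolve on the same time scale as the horizon, which is what makes the diffusion limit informative.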
Cornell University
Date: Aug 23, 2021
Driverless vehicles promise a host of societal benefits including dramatically improved safety, increased accessibility, greater productivity, and higher quality of life. As this new technology approaches widespread deployment, both industry and government are making provisions for teleoperations systems, in which remote human agents provide assistance to driverless vehicles. This assistance can involve real-time remote operation and even ahead-of-time input via human-in-the-loop artificial intelligence systems. In this paper, we address the problem of staffing such a remote support center. Our analysis focuses on the tradeoffs between the total number of remote agents, the reliability of the remote support system, and the resulting safety of the driverless vehicles. By establishing a novel connection between queues with large batch arrivals and storage processes, we determine the probability of the system exceeding its service capacity. This connection drives our staffing methodology. We also develop a numerical method to compute the exact staffing level needed to achieve various performance measures. Our overall staffing analysis may be of use in other applications that combine human expertise and automated systems.
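A toy slot-based caricature of the staffing question (Bernoulli batch arrivals and deterministic per-slot service are simplifying assumptions of this sketch, not the batch-arrival/storage-process model of the talk):

```python
import random

def exceedance_fraction(batch_prob, batch_size, agents, horizon, seed=9):
    """In each time slot, a batch of `batch_size` assistance requests
    arrives with probability `batch_prob`; each of the `agents` remote
    operators resolves one request per slot. Returns the estimated
    long-run fraction of slots in which the backlog exceeds the agent
    pool -- a crude proxy for the exceedance probability."""
    rng = random.Random(seed)
    backlog, exceed = 0, 0
    for _ in range(horizon):
        if rng.random() < batch_prob:
            backlog += batch_size
        if backlog > agents:
            exceed += 1
        backlog = max(backlog - agents, 0)
    return exceed / horizon

# Mean arrival rate 3 requests/slot vs. 4 agents: stable, but bursty batches
# still overwhelm the pool in a sizable fraction of slots.
p_exceed = exceedance_fraction(batch_prob=0.3, batch_size=10,
                               agents=4, horizon=100000)
```

Even this toy version shows the qualitative point: with batchy demand, average-load staffing is not enough, and one must size the agent pool against the exceedance probability.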
University of North Carolina at Chapel Hill
Date: Aug 30, 2021
We consider the supermarket model in the usual Markovian setting where jobs arrive at a rate that scales with the size of the system. An arriving job joins the shortest among d randomly selected service queues. The goal is to study fluctuations of the state process about its near equilibrium state in the critical regime, when the number of selections increases with system size. We characterize the different types of possible behavior depending on the manner in which the system approaches criticality and the number of selections increases with system size. In particular, we identify three canonical regimes in which fluctuations of the state process about its near equilibrium are of the order of the square root of the system size and are governed asymptotically by a one-dimensional Brownian motion. The forms of the limit processes in the three regimes are quite different: in the first case we get a linear diffusion; in the second case we get a diffusion with an exponential drift; and in the third case we obtain a reflected diffusion in a half space. This is joint work with Shankar Bhamidi and Miheer Dewaskar.
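A quick simulation sketch of the join-the-shortest-of-d mechanism itself (fixed d and illustrative subcritical parameters, not the growing-d critical regime analyzed in the talk):

```python
import random

def supermarket(n, lam, d, horizon, seed=3):
    """Event-by-event sketch of the supermarket model: jobs arrive at total
    rate lam*n; each arrival samples d queues uniformly without replacement
    and joins the shortest; each busy server completes work at rate 1.
    Events are drawn by comparing the competing exponential rates."""
    rng = random.Random(seed)
    q = [0] * n
    for _ in range(horizon):
        busy = sum(1 for x in q if x > 0)
        # next event is an arrival w.p. lam*n / (lam*n + number busy)
        if rng.random() < (lam * n) / (lam * n + busy):
            sampled = rng.sample(range(n), d)
            q[min(sampled, key=lambda i: q[i])] += 1
        else:
            q[rng.choice([i for i in range(n) if q[i] > 0])] -= 1
    return q

q2 = supermarket(n=100, lam=0.9, d=2, horizon=50000)
```

The state process studied in the talk is the empirical measure of such queue lengths, centered at its equilibrium and rescaled by the square root of the system size.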
New York University
Date: Sep 13, 2021
Previous research has shown that early discharge of patients may hurt their medical outcomes. However, in many cases, the "optimal" length of stay (LOS) and the best location for treatment of the patient are not obvious. A case in point is hematology patients, for whom this is a critical decision. These patients are hospitalized on a regular basis for chemotherapy treatments and it is debated whether following these treatments the patients should stay at the hospital for an observation period or be sent home instead. Patients with hematological malignancies are susceptible to life-threatening infections after chemotherapy. Hence, LOS optimization for hematology patients must balance the risks of patient infection and mortality. The former is reduced by minimizing hospital stay, while the latter is reduced by maximizing hospital stay, whereby infections can be identified and treated earlier. We develop a Markov decision process formulation to explore the impact of the infection and mortality risks on the optimal LOS from a single-patient perspective. We further consider the social optimization problem in which capacity constraints limit the ability of hospitals to keep patients for the entirety of their optimal LOS. We find that the optimal policy under this constraint takes the form of a two-threshold policy. This policy may block some patients and immediately route them to home care, or speed up some patients' LOS and send them to home care early after an observation period in the hospital. Joint work with Prof. Galit Yom-Tov from the Technion.
Institut de Mathématiques de Marseille
Date: Sep 20, 2021
This is joint work with Ágnes Backhausz and Balázs Szegedy. We define a natural notion of micro-state entropy associated with a random process on a unimodular random tree. This entropy is closely related to Bowen's sofic entropy in dynamical systems. It allows one to compute the asymptotic free energy of factor models on random graphs and gives a variational framework for solving many combinatorial optimization problems on random graphs. We give a formula for this entropy for a large class of processes.
Carnegie Mellon University
Date: Sep 27, 2021
Multiserver jobs, which are jobs that occupy multiple servers simultaneously during service, are prevalent in today's computing clusters. However, little is known about the delay performance of systems with multiserver jobs. In this talk, to capture the large scale of modern computing clusters, we consider the scheduling problem of multiserver jobs in systems where the total number of servers becomes large. The multiserver job model opens up new scaling regimes where both the number of servers that a job needs and the system load scale with the total number of servers. Within these scaling regimes, we first characterize when the queueing probability diminishes as the system becomes large, and then turn our focus to the mean waiting time, i.e., the time a job spends queueing rather than in service. In particular, we derive order-wise sharp bounds on the mean waiting time under various policies. The sharpness of the bounds allows us to establish the asymptotic optimality of a priority policy that we call P-Priority, and the strict suboptimality of the commonly used First-Come-First-Serve (FCFS) policy. This talk is based on joint work with Yige Hong, Qiaomin Xie, and Mor Harchol-Balter.
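The head-of-line blocking that penalizes FCFS with multiserver jobs shows up already in a four-job example. In the sketch below, jobs are (arrival_slot, servers_needed, service_slots) tuples, and a hypothetical small-jobs-first rule stands in for a priority policy; it is not the exact P-Priority policy from the talk.

```python
def mean_wait(policy, jobs, total_servers):
    """Slot-based sketch. 'fcfs' serves the queue in arrival order and
    blocks at the head when the head job does not fit; 'priority' lets
    smaller-demand jobs jump ahead and packs greedily. Returns the mean
    waiting time (start slot minus arrival slot)."""
    jobs = sorted(jobs)                          # by arrival slot
    queue, running, waits = [], [], []
    free, t, i = total_servers, 0, 0
    while i < len(jobs) or queue or running:
        still = []
        for end, need in running:                # departures at slot start
            if end <= t:
                free += need
            else:
                still.append((end, need))
        running = still
        while i < len(jobs) and jobs[i][0] == t:
            queue.append(jobs[i])                # arrivals this slot
            i += 1
        order = (sorted(queue, key=lambda j: j[1])
                 if policy == "priority" else list(queue))
        for job in order:
            arr, need, dur = job
            if need <= free:
                free -= need
                running.append((t + dur, need))
                waits.append(t - arr)
                queue.remove(job)
            elif policy == "fcfs":
                break                            # head-of-line blocking
        t += 1
    return sum(waits) / len(waits)

jobs = [(0, 4, 3), (0, 1, 1), (1, 1, 1), (1, 4, 3)]
w_fcfs = mean_wait("fcfs", jobs, total_servers=4)
w_prio = mean_wait("priority", jobs, total_servers=4)
```

On this instance the small-jobs-first rule already beats FCFS, illustrating (in miniature) why FCFS is strictly suboptimal in these regimes.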
INRIA
Date: Oct 04, 2021
In neuroscience, learning and memory are usually associated with long-term changes of neuronal connectivity. Synaptic plasticity refers to the set of mechanisms driving the dynamics of neuronal connections, called synapses and represented by a scalar value, the synaptic weight. Spike-Timing Dependent Plasticity (STDP) is a biologically-based model representing the time evolution of the synaptic weight as a functional of the past spiking activity of adjacent neurons.
In this talk we present a new, general, mathematical framework to study synaptic plasticity associated with different STDP rules. The system composed of two neurons connected by a single synapse is investigated and a stochastic process describing its dynamical behavior is presented and analyzed. We show that a large number of STDP rules from neuroscience and physics can be represented by this formalism. Several aspects of these models are discussed and compared to canonical models of computational neuroscience. An important sub-class of plasticity kernels with a Markovian formulation is also defined and investigated via averaging principles. Joint work with Gaëtan Vignoud.
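For concreteness, here is the classical all-to-all pair-based STDP rule with exponential kernels, one member of the large class of rules the framework covers; the amplitudes, time constant, and spike trains below are illustrative.

```python
import math

def stdp_weight(pre_spikes, post_spikes, a_plus=0.1, a_minus=0.12,
                tau=20.0, w0=0.5):
    """All-to-all pair-based STDP: every (pre, post) spike pair changes
    the synaptic weight by +a_plus * exp(-dt/tau) when the pre spike
    precedes the post spike (dt = t_post - t_pre > 0, potentiation) and
    by -a_minus * exp(-|dt|/tau) when it follows it (depression)."""
    w = w0
    for tp in pre_spikes:
        for tq in post_spikes:
            dt = tq - tp
            if dt > 0:
                w += a_plus * math.exp(-dt / tau)
            elif dt < 0:
                w -= a_minus * math.exp(dt / tau)
    return w

# pre fires just before post: net potentiation expected
w_up = stdp_weight([0.0, 40.0], [5.0, 45.0])
# pre fires just after post: net depression expected
w_down = stdp_weight([5.0, 45.0], [0.0, 40.0])
```

In the talk's formalism, the synaptic weight is such a functional of the joint past spiking activity, and rules of this exponential-kernel type admit the Markovian formulation analyzed via averaging principles.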
Columbia Business School
Date: Oct 11, 2021
Motivated by a variety of online matching markets, we consider demand and supply units that arise i.i.d. uniformly in [0,1]^d and need to be matched with each other while minimizing the expected average distance between matched pairs (the "cost"). We characterize the achievable cost in three models as a function of the dimension d and the amount of excess supply (M or m): (i) Static matching of N demand units with N+M supply units. (ii) A semi-dynamic model where N+M supply units are present beforehand and N demand units arrive sequentially and must be matched immediately. (iii) A fully dynamic model where there are always m supply units present in the system, one supply and one demand unit arrive in each period, and the demand unit must be matched immediately. We show that cost nearly as small as the distance to the nearest neighbor is achievable in all cases *except* models (i) and (ii) for d=1 and M = o(N). Moreover, the latter is the only case in models (i) and (ii) where excess supply significantly reduces the achievable cost.
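A quick simulation of model (i) in dimension d = 1 (illustrative N and seed), comparing the optimal sorted matching at M = 0 against the nearest-neighbor benchmark referred to above:

```python
import random

def matching_cost_line(N, M, seed=4):
    """d = 1 sketch: N demand and N + M supply units, all i.i.d.
    Uniform[0,1]. With no excess supply (M = 0), pairing the sorted
    sequences is the optimal matching; with M > 0, matching to the first
    N sorted supply units is merely one feasible matching. The mean
    nearest-neighbor distance is a lower bound no matching can beat."""
    rng = random.Random(seed)
    demand = sorted(rng.random() for _ in range(N))
    supply = sorted(rng.random() for _ in range(N + M))
    cost = sum(abs(d - s) for d, s in zip(demand, supply[:N])) / N
    nn = sum(min(abs(d - s) for s in supply) for d in demand) / N
    return cost, nn

cost0, nn0 = matching_cost_line(N=500, M=0)
```

The gap between `cost0` and `nn0` at d = 1, M = 0 reflects exactly the exceptional case in the result above: on the line with no excess supply, matching cost is substantially larger than the nearest-neighbor distance.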
Tata Institute of Fundamental Research
Date: Oct 18, 2021
In this talk we consider two classic and popular problems in the stochastic multi-armed bandit setting. 1) Regret minimization, where, given a finite set of unknown reward distributions (or arms) that can be sequentially sampled, the aim is to sample in a manner that maximizes the expected reward, or equivalently, minimizes the expected regret, over a large sampling horizon. 2) Best arm identification, where in a similar sequential setting the aim is to select the arm with the largest mean using a minimum number of samples on average, while keeping the probability of false selection below a pre-specified small value. Both these problems, with a myriad of variations, have been well studied in the literature. The analysis techniques for the two problems are typically different, and the arm distributions are typically restricted to a small class such as a single-parameter exponential family or distributions that are bounded with known bounds. In practice, such restrictions often do not hold, and the arm distributions may even be heavy-tailed. In this talk we discuss how to optimally solve both of the above problems when minimal restrictions are imposed on the arm distributions. Further, we highlight the commonalities of the techniques used in solving the two problems.
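One classical ingredient for estimating means of heavy-tailed reward distributions is the median-of-means estimator sketched below; the sample, block count, and tail index are illustrative, and the talk's optimal algorithms are considerably more refined than this building block.

```python
import random
import statistics

def median_of_means(xs, k):
    """Median-of-means: split the sample into k equal blocks, average each
    block, and return the median of the block means. Unlike the empirical
    mean, this concentrates well even when the variance barely exists."""
    n = len(xs) // k
    means = [sum(xs[i * n:(i + 1) * n]) / n for i in range(k)]
    return statistics.median(means)

rng = random.Random(5)
# heavy-tailed sample: Pareto(1.5) via inverse transform (infinite variance)
sample = [(1.0 - rng.random()) ** (-1.0 / 1.5) for _ in range(9000)]
est = median_of_means(sample, k=9)
```

Estimators of this robust flavor are what allow bandit algorithms to drop the usual bounded-support or exponential-family assumptions.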
Amherst College
Date: Nov 8, 2021
Intelligent dispatching is crucial to obtaining low response times in large-scale systems. The bulk of "power-of-d" policies studied in the literature assume that the system is homogeneous, meaning that all servers have the same speed; meanwhile, real-world systems often exhibit server speed heterogeneity. We introduce a general framework for describing and analyzing heterogeneity-aware power-of-d policies. The key idea behind our framework is that dispatching policies can make use of server speed information at two decision points: when choosing which d servers to query, and when assigning a job to one of those servers. Our framework explicitly separates the dispatching policy into a querying rule and an assignment rule; we consider general families of both rule types. In this talk, we will focus on heterogeneity-aware assignment rules that ignore queue length information beyond idleness status. In this setting, we analyze mean response time and formulate novel optimization problems for the joint optimization of querying and assignment. We build upon our optimized policies to develop heuristic queue length-aware dispatching policies. We will also discuss extensions, our ongoing work, and open problems. Based on joint work with Jazeem Abdul Jaleel, Sherwin Doroudi, and Alexander Wickeham.
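The querying/assignment split can be made concrete in a small simulation. The specific rules below (query d_fast fast and d_slow slow servers; assign to an idle queried fast server, else an idle queried slow server, else the shortest queried queue) are one illustrative instance of the framework, not its optimized policies; all parameters are made up.

```python
import random

def het_power_of_d(speeds, lam, d_fast, d_slow, horizon, seed=6):
    """Event-by-event sketch of a heterogeneity-aware power-of-d
    dispatcher. speeds[i] is server i's service rate; jobs arrive at
    total rate lam * n, and each busy server i completes at rate
    speeds[i] (events drawn by comparing competing exponential rates)."""
    rng = random.Random(seed)
    n = len(speeds)
    top = max(speeds)
    fast = [i for i in range(n) if speeds[i] == top]
    slow = [i for i in range(n) if speeds[i] < top]
    fast_set = set(fast)
    q = [0] * n
    for _ in range(horizon):
        busy_rate = sum(speeds[i] for i in range(n) if q[i] > 0)
        if rng.random() < (lam * n) / (lam * n + busy_rate):
            # querying rule: d_fast fast servers and d_slow slow servers
            queried = rng.sample(fast, d_fast) + rng.sample(slow, d_slow)
            idle_fast = [i for i in queried if q[i] == 0 and i in fast_set]
            idle_slow = [i for i in queried if q[i] == 0 and i not in fast_set]
            # assignment rule: idle fast > idle slow > shortest queried queue
            if idle_fast:
                target = idle_fast[0]
            elif idle_slow:
                target = idle_slow[0]
            else:
                target = min(queried, key=lambda i: q[i])
            q[target] += 1
        else:
            busy = [i for i in range(n) if q[i] > 0]
            q[rng.choices(busy, weights=[speeds[i] for i in busy])[0]] -= 1
    return q

speeds = [2.0] * 10 + [1.0] * 10      # ten fast and ten slow servers
q = het_power_of_d(speeds, lam=0.8, d_fast=1, d_slow=1, horizon=20000)
```

Separating the two rules in code mirrors the framework's point: the querying rule controls what information is available, while the assignment rule decides how to use it.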
University of Washington
Date: Nov 15, 2021
Consider a scenario where we are training a model or performing statistical analyses on a shared dataset with entries collected from several contributing individuals. Differential privacy provides protection against membership inference attacks that try to reveal sensitive information in the dataset. Robust estimators provide protection against data poisoning attacks where malicious contributors inject corrupted data. Even though both types of attacks are powerful and easy to launch in practice, there is no algorithm providing protection against both simultaneously. In the first half of this talk, I will present the first efficient algorithm that guarantees both differential privacy and robustness to the corruption of a fraction of the data. I will focus on the canonical problem of mean estimation, which is a critical building block in many algorithms including stochastic gradient descent for training deep neural networks. In the second half of this talk, I will present a new framework that bridges differential privacy and robust statistics, which we call High-dimensional Propose-Test-Release (HPTR). This is a computationally intractable approach, but universally applicable to several statistical estimation problems including mean estimation, linear regression, covariance estimation, and principal component analysis. In most of these cases, HPTR achieves a near-optimal sample complexity by exploiting robust statistics in the algorithm, thus characterizing the minimax error rate of the corresponding private estimation problems for the first time. This talk is based on two recent papers (https://arxiv.org/abs/2104.11315 and https://arxiv.org/abs/2102.09159) and an unpublished ongoing work.
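As a toy baseline illustrating the two requirements side by side, the sketch below combines trimming (robustness to a small fraction of corrupted points) with clipping and Laplace noise (differential privacy) for one-dimensional mean estimation. The sensitivity bound is crude, the parameters are illustrative, and this is emphatically not the paper's algorithm or the HPTR mechanism.

```python
import math
import random

def private_robust_mean(xs, eps, trim, clip, seed=7):
    """Trim the smallest/largest `trim` fraction of points, clip the rest
    to [-clip, clip], average, and add Laplace(sensitivity / eps) noise.
    The sensitivity 2 * clip / len(core) is a crude illustrative bound
    for the clipped trimmed mean, not a tight analysis."""
    rng = random.Random(seed)
    xs = sorted(max(-clip, min(clip, x)) for x in xs)
    k = int(len(xs) * trim)
    core = xs[k:len(xs) - k]
    mean = sum(core) / len(core)
    scale = (2.0 * clip / len(core)) / eps
    u = rng.random() - 0.5                 # Laplace noise via inverse CDF
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return mean + noise

# Five gross outliers are clipped and trimmed away; at this eps the noise
# is negligible, so the estimate stays close to the true mean 0.
r = private_robust_mean([0.0] * 100 + [1000.0] * 5, eps=1e9, trim=0.1, clip=1.0)
```

The tension the talk resolves is visible even here: the clipping that caps sensitivity for privacy and the trimming that defeats poisoning must be analyzed jointly, and naive combinations lose statistical efficiency in high dimensions.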
Georgia Institute of Technology
Date: Nov 22, 2021
In this talk, we will discuss some recent progress on the mean-field analysis of structurally constrained large-scale stochastic systems in the context of load balancing in parallel-server systems. The state-of-the-art load balancing heuristics are predominantly based on full-flexibility models in which any task can be assigned to any server. The classical mean-field approximation has been proven to be highly accurate in this case. However, such full-flexibility systems are infeasible in practice: data locality makes their implementation complexity overwhelming and the storage capacity requirement at the servers prohibitive. Thus, one typically needs to design “sparse” systems where each server can process only a small subset of the possible task types, and the load balancing algorithm is restricted to assigning an incoming task to one of the servers “compatible” with its type.
Empirically, the performance of popular load balancing algorithms on systems with limited flexibility can degrade arbitrarily. A rigorous analysis goes beyond the scope of state-of-the-art mean-field techniques, mainly because the individual queue length processes become non-exchangeable and the system lacks a Markovian characterization as an aggregate of a large number of individual processes. In this talk, we will describe how coupling techniques can be exploited to identify a broad class of sparse systems, characterized in terms of a collection of sparse compatibility graphs, that enjoy the performance benefits of a fully flexible system asymptotically in the large-system limit and for which the mean-field approximation remains valid. Such a system design thus drastically reduces the storage capacity requirement and system complexity without sacrificing delay performance. Both process-level and steady-state asymptotics will be discussed.
California Institute of Technology
Date: Nov 29, 2021
Many inverse problems arising in applications may be cast as optimization or sampling problems in which the parameter-to-data map is provided as a black-box, derivatives may not be readily available and the evaluation of the map itself may be subject to noise. I will describe the derivation of mean-field (stochastic) dynamical systems which address such problems and show how particle approximations lead to derivative-free algorithms. I will overview some of the analysis of the resulting methods, link the work to parallel developments in consensus-based optimization, and describe open problems. The work will be illustrated throughout by examples from the physical sciences.
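One concrete derivative-free, particle-based scheme of the kind described is consensus-based optimization: particles drift toward a Gibbs-weighted consensus point and diffuse with noise proportional to their distance from it, using only black-box evaluations of the objective. The sketch below (all parameters illustrative) minimizes a simple test function.

```python
import math
import random

def cbo_minimize(f, dim, n_particles=50, steps=200, lam=1.0, sigma=0.5,
                 beta=30.0, dt=0.1, seed=8):
    """Minimal consensus-based optimization sketch. The consensus point is
    the average of particle positions weighted by exp(-beta * f(x)); each
    particle contracts toward it at rate lam and diffuses with amplitude
    proportional to its distance from it. No derivatives of f are used."""
    rng = random.Random(seed)
    X = [[rng.uniform(-3, 3) for _ in range(dim)] for _ in range(n_particles)]
    for _ in range(steps):
        vals = [f(x) for x in X]
        m = min(vals)
        w = [math.exp(-beta * (v - m)) for v in vals]   # stabilized weights
        Z = sum(w)
        cons = [sum(w[i] * X[i][k] for i in range(n_particles)) / Z
                for k in range(dim)]
        for i in range(n_particles):
            for k in range(dim):
                diff = X[i][k] - cons[k]
                X[i][k] += (-lam * diff * dt
                            + sigma * abs(diff) * math.sqrt(dt) * rng.gauss(0, 1))
    return cons

# Minimize a convex quadratic with minimizer (1, 1), derivative-free.
sol = cbo_minimize(lambda x: sum((xi - 1.0) ** 2 for xi in x), dim=2)
```

Because the noise scales with the distance to the consensus point, the particle cloud collapses as agreement forms; the mean-field limit of such dynamics is exactly the kind of object the talk's analysis addresses.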