Information shapes decision-making in every decision, control, and learning problem across a wide range of contexts. The main thrust of my research is to investigate the interaction between information and decision-making in large networks operating in unknown and uncertain environments. My research has focused mainly on the theory and applications of large-scale decentralized stochastic systems and control, games, and learning, combining stochastic optimal control theory, game theory, incentive design, and learning theory. Some of my current research projects include:
Large stochastic games, teams, and their mean-field limits
Learning and approximations of near-optimal decisions
Stochastic incentive designs and data analytics
Learning in large games and teams
A class of decision-making problems in which the decision-makers share a common cost is called a team problem. Such problems are quite general, with an abundance of applications across operations research, applied mathematics, and engineering, such as decentralized stochastic control, financial systems, networked control, cooperative systems, sensor networks, and energy markets. Many real-life competitive scenarios also take place between teams. For example, in industries such as technology development and manufacturing, organizing employees into teams is an effective way to improve performance and enhance productivity compared to a single decision-maker arrangement. Such scenarios can be modeled as games among teams. The information structure characterizes the interaction between information and decision-making, and it plays a pivotal role in the existence and approximation of solutions. For example, the celebrated Witsenhausen counterexample demonstrates the difficulty of obtaining optimal solutions that arises from a decentralized information structure. Depending on the information structure, finding a solution can be computationally intractable even for problems with a finite number of decision-makers, and the complexity is often exacerbated as the number of decision-makers grows. My research in this area establishes existence, approximation, and structural results for optimal solutions to these problems.
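To illustrate why decentralized information structures make these problems hard, the following is a minimal Monte Carlo sketch of Witsenhausen's counterexample: two decision-makers with a common cost, where the second observes the first's action only through noise. The parameter values and the two candidate policies below are standard illustrative choices, not results from my papers; a simple nonlinear "signaling" policy already outperforms a natural affine benchmark.

```python
import numpy as np

# Witsenhausen's counterexample (illustrative parameters k = 0.2, sigma = 5):
# DM1 observes x0 and picks u1; DM2 observes y = x0 + u1 + w and picks u2.
# Common cost: k^2 * u1^2 + (x1 - u2)^2, with x1 = x0 + u1.
rng = np.random.default_rng(0)
k, sigma, n = 0.2, 5.0, 200_000

x0 = sigma * rng.standard_normal(n)   # initial state, N(0, sigma^2)
w = rng.standard_normal(n)            # DM2's observation noise, N(0, 1)

def cost(u1, u2_of_y):
    x1 = x0 + u1                      # state after DM1 acts
    y = x1 + w                        # DM2's noisy observation
    return np.mean(k**2 * u1**2 + (x1 - u2_of_y(y))**2)

# Affine benchmark: DM1 does nothing; DM2 applies the linear MMSE estimator.
c_affine = cost(np.zeros(n), lambda y: (sigma**2 / (sigma**2 + 1.0)) * y)

# Nonlinear signaling: DM1 quantizes the state to +/- sigma; DM2 uses the
# corresponding MMSE estimator E[x1 | y] = sigma * tanh(sigma * y).
c_signal = cost(sigma * np.sign(x0) - x0, lambda y: sigma * np.tanh(sigma * y))

print(c_affine, c_signal)   # the nonlinear policy attains a strictly lower cost
```

The nonlinear policy exploits the first decision-maker's action as a signal to the second, which no affine policy can do; this is exactly the phenomenon that makes decentralized problems resist linear-quadratic-Gaussian intuition.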
S. Sanjari, N. Saldi, and S. Yüksel, "Optimality of Independently Randomized Symmetric Policies for Exchangeable Stochastic Teams with Infinitely Many Decision Makers", Mathematics of Operations Research, 2022 (pdf), (arXiv).
S. Sanjari and S. Yüksel, "Optimal Policies for Convex Symmetric Stochastic Dynamic Teams and their Mean-field Limit", SIAM Journal on Control and Optimization, 2021, v. 59, pp. 777-804 (pdf).
S. Sanjari, T. Başar, and S. Yüksel, "Isomorphism Properties of Optimality and Equilibrium Solutions under Equivalent Information Structure Transformations: Stochastic Dynamic Games and Teams", SIAM Journal on Control and Optimization, to appear (2023) (arXiv-Games, arXiv-Teams).
S. Sanjari, N. Saldi, and S. Yüksel, "Nash Equilibria for Exchangeable Team against Team Games, their Mean Field Limit, and Role of Common Randomness", 2022 (arXiv).
In decision-making problems, decision-makers often do not know their cost functions and dynamics exactly, so each decision-maker must learn and act from data on the underlying model. A thread of work on learning studies such a setting: can decision-makers adopt an algorithm that allows them to learn the uncertainties while making "optimal" decisions? My research in this area analyzes decentralized algorithms that lead to computationally efficient methods for learning near-optimal decisions, leveraging infinite-dimensional statistical learning tools such as reproducing kernel Hilbert space (RKHS) techniques, where optimal value functions and/or policies are embedded in an infinite-dimensional Hilbert space whose elements can be represented via a positive semi-definite kernel. My main goal in this direction is to address the challenges of numerical computation in the presence of big data.
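The RKHS representation idea can be sketched in a few lines. This is a generic kernel ridge regression example, not the operator-theoretic methods of the papers below: by the representer theorem, the fitted function is a finite combination of kernel sections centered at the data points. The target function, bandwidth, and regularization level here are hypothetical choices for illustration.

```python
import numpy as np

# Sketch of the RKHS idea: represent an unknown function (e.g. a value
# function) as v_hat(x) = sum_i alpha_i k(x_i, x) for a positive
# semi-definite kernel k, with alpha fit by kernel ridge regression.
rng = np.random.default_rng(1)

def gaussian_kernel(A, B, bw=0.1):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw**2))

# Noisy samples of a hypothetical "value function" on [0, 1].
n = 100
X = rng.uniform(0, 1, (n, 1))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(n)

lam = 1e-3                                    # ridge regularization
K = gaussian_kernel(X, X)
alpha = np.linalg.solve(K + lam * n * np.eye(n), y)

def v_hat(x):                                 # representer-theorem form
    return gaussian_kernel(np.atleast_2d(x), X) @ alpha

print(v_hat([0.25]))                          # close to sin(pi/2) = 1
```

The computation scales with the number of data points rather than the dimension of the function space, which is what makes the infinite-dimensional embedding numerically tractable; sparsification and compression of this representation are exactly the big-data challenges the papers below address.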
B. Hou, S. Sanjari, N. Dahlin, S. Bose, and U. Vaidya, "Sparse Learning of Dynamical Systems in RKHS: An Operator-Theoretic Approach", International Conference on Machine Learning (ICML 2023) (pdf).
B. Hou, S. Sanjari, N. Dahlin, and S. Bose, "Compressed Decentralized Learning of Conditional Mean Embedding Operators in Reproducing Kernel Hilbert Spaces", AAAI Conference on Artificial Intelligence 2022 (pdf).
Another important class of decision-making problems is hierarchical, often involving a leader and a collection of followers with possibly different goals. In these problems, the leader seeks a policy that induces the desired behavior among the followers. As an example, consider a duopoly in which a government regulator designs regulation strategies to incentivize competitive behavior among suppliers toward social welfare. It is not difficult to surmise that the leader can design a threat policy that penalizes the followers heavily for even marginal deviations from the leader's desired response. However, such policies are unsavory for a policy-maker; one instead seeks an incentive policy that levies a penalty on the followers commensurate with the extent of their deviation from the desired response. Does such a policy exist? My research in this area analyzes learning, robustness, and data-driven methods for designing the leader's incentive policies for both finite and large populations of followers.
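A toy quadratic example makes the "commensurate penalty" idea concrete. The cost functions, target, and slope below are my own hypothetical choices for illustration: the leader announces an affine incentive policy whose response grows gently with the follower's deviation, calibrated so that the follower's best response is exactly the leader's target.

```python
import numpy as np

# Toy Stackelberg incentive design (illustrative, hypothetical costs):
# follower minimizes J_F(u_F, u_L) = (u_F - a)^2 + u_F * u_L, and the leader
# announces the affine policy u_L = gamma(u_F) = u_L_bar + theta * (u_F - d),
# which penalizes deviations from the target d only in proportion to them.
a, d, theta = 3.0, 1.0, 0.5        # follower's bliss point, leader's target, slope

def J_F(u_F, u_L):
    return (u_F - a) ** 2 + u_F * u_L

# Calibrate the intercept so the follower's first-order condition holds at d.
u_L_bar = 2 * a - 2 * d - theta * d
gamma = lambda u_F: u_L_bar + theta * (u_F - d)

# Check: under gamma, the follower's best response is the leader's target d.
grid = np.linspace(-5, 5, 100_001)
best = grid[np.argmin(J_F(grid, gamma(grid)))]
print(best)   # approximately d = 1.0
```

Unlike a threat policy, the slope theta here is moderate, so small deviations incur small penalties; the existence question in my research is whether such gently calibrated policies exist in far more general stochastic settings and with many followers.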
S. Sanjari, S. Bose, and T. Başar, "Incentive Designs for Stackelberg Games with a Large Number of Followers and their Mean-Field Limits", 2022 (arXiv).
S. Sanjari, S. Bose, and T. Başar, "Signaling-Based Robust Incentive Designs with Randomized Monitoring", World Congress of the International Federation of Automatic Control (IFAC 2023) (pdf).
Learning in multi-player games and teams is more challenging than in the single-player setting. In multi-player learning, the players do not know the structure of the dynamics and costs and observe only their private cost realizations. Moreover, independent (decentralized) learners do not necessarily share the history of their private actions. This lack of knowledge leads to inefficient or insufficient exploration, and possibly to incomplete learning, since the players cannot coordinate to explore the state-action space thoroughly; as a result, each player faces a non-stationary environment from its own perspective. In addition to this difficulty, which is common to games and teams, independent learners in the team setting face a further challenge in reaching a globally optimal solution, since escaping locally optimal solutions requires coordination among the players. These issues become more prominent in large games and teams. My research in this area lies in two main settings. The first is learning Nash equilibria for large games among teams, where my research studies learning algorithms with guaranteed convergence to a Nash equilibrium. The second is the combination of equilibrium selection and learning for games, where a knowledgeable leader/designer guides the players toward a specific optimal solution by affecting the dynamics and costs through policy selection. This setting is inspired by applications in market and incentive design.
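The coordination difficulty in teams can be seen already in a deterministic two-player example of my own construction (not taken from the papers below): with a common payoff matrix, alternating unilateral best responses can stay forever at a person-by-person-optimal action pair that is not globally optimal.

```python
import numpy as np

# Illustrative team game with a common payoff matrix Q (hypothetical values).
# The global optimum is the action pair (0, 0) with payoff 4, but the matched
# pair (1, 1) with payoff 3 is person-by-person optimal: neither player can
# improve by deviating alone, so uncoordinated best responses never escape it.
Q = np.array([[4.0, 0.0],
              [0.0, 3.0]])

a1, a2 = 1, 1                      # start at the suboptimal matched pair
for _ in range(10):                # alternate unilateral best responses
    a1 = int(np.argmax(Q[:, a2]))  # player 1 best-responds to a2
    a2 = int(np.argmax(Q[a1, :]))  # player 2 best-responds to a1

print(a1, a2, Q[a1, a2])   # stuck at (1, 1) with payoff 3, not the optimum 4
```

Independent learners driven by noisy cost realizations inherit exactly this failure mode, which is why converging to a globally optimal team solution requires coordination, or guidance from a knowledgeable player, as studied in the work below.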
"Learning Incentive Equilibrium for Stackelberg Games'' with A. Guclu, S. Bose, and T. Başar.
''Learning to Cooperate in Stochastic Teams with Guidance of a Knowledgeable Player'' with N. Dahlin, V. Subramanian, S. Bose.