Kishan Panaganti Badrinath

Research Overview

Design data-efficient, scalable, robust learning algorithms for real-world sequential decision-making.

Most success stories of learning algorithms for sequential decision-making are limited to highly structured or simulated versions of real-world dynamical systems. A key obstacle to adopting these algorithms in complex, unstructured real-world dynamical systems is the simulation-to-reality gap: the gap between the data-accessible environment (e.g., a simulator or historical data) and the dynamically changing real-world environment. The mathematical tools of stochastic and robust optimization are powerful for formulating and solving sequential decision-making problems with simulation-to-reality gaps. However, three main challenges remain active areas of research:

Scalable learning algorithms for robust sequential decision-making.
Application- or domain-specific (e.g., single- vs. multi-agent) robust sequential decision-making formulations.
Excessive computing needed to achieve robustness when it relies on access to many perturbed environments; for example, domain randomization assumes access to multi-faceted simulators.

My research addresses these challenges by developing methodologies that emphasize robustness, using principled approaches from stochastic and robust optimization (e.g., distributionally robust optimization, risk-averse and risk-sensitive analysis, inverse kinematics, and statistical parametric methods). Furthermore, I exploit the underlying structure of different frameworks to enable scalable sequential learning algorithms with theoretical guarantees on data efficiency. I also emphasize reducing the computing needs of these algorithms by equipping learning frameworks with mathematical optimization tools.

Through my research, I aim to deploy robust decision-making learning algorithms in real-world applications across autonomous systems, robotics, power systems, recommendation systems, healthcare, and safety-critical engineering systems in general. More specifically, to enable deployment on such application-specific problems, I have developed robust learning algorithms in frameworks such as reinforcement learning (RL), imitation learning (IL), contextual bandits (CB), multi-agent RL (MARL), and RL from human feedback (RLHF).

Research Taxonomy of My Works: I have deployed scalable learning algorithms in these frameworks (that is, data-driven approaches using general function architecture computations) on many application-specific problems (matched by color coding).
