Developing reinforcement learning methods that reach Nash equilibrium policies without solving PDEs, spanning both continuous and discrete scenarios and addressing safety constraints, collision avoidance, and density estimation.
Description: We propose a deep reinforcement learning (DRL) algorithm, inspired by Munchausen RL and Online Mirror Descent, that reaches a population-dependent Nash equilibrium from any initial distribution without averaging over or sampling from historical Q-values or policies.
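As a rough illustration of the Munchausen RL ingredient mentioned above, the following sketch computes the generic Munchausen regression target for a single tabular transition. It is not the proposed algorithm: the function names, the default hyperparameters (`gamma`, `tau`, `alpha`, `clip_min`), and the list-based Q-values are assumptions for illustration, and the dependence of the Q-function on the population distribution is omitted.

```python
import math


def softmax_log_policy(q_values, tau):
    """Return log pi(a|s) for pi = softmax(q / tau), computed stably."""
    scaled = [q / tau for q in q_values]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]


def munchausen_target(reward, q_s, action, q_next,
                      gamma=0.99, tau=1.0, alpha=0.9,
                      clip_min=-1.0, done=False):
    """Generic Munchausen target for one transition (illustrative only).

    target = r + alpha * clip(tau * log pi(a|s), clip_min, 0)
               + gamma * sum_a' pi(a'|s') * (q(s',a') - tau * log pi(a'|s'))
    """
    log_pi_s = softmax_log_policy(q_s, tau)
    # Scaled log-policy bonus, clipped to avoid unbounded negative values.
    bonus = alpha * max(clip_min, min(0.0, tau * log_pi_s[action]))
    if done:
        return reward + bonus
    log_pi_next = softmax_log_policy(q_next, tau)
    # Entropy-regularized (soft) value of the next state.
    soft_value = sum(math.exp(lp) * (q - tau * lp)
                     for q, lp in zip(q_next, log_pi_next))
    return reward + bonus + gamma * soft_value
```

The log-policy bonus acts as an implicit KL regularization toward the previous policy, which is why this style of target can stand in for the explicit sum over past Q-functions that Online Mirror Descent would otherwise require.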
Description: Building on the previous work, we develop an algorithm that adapts to sources of common noise.
<Submitted to L4DC 2025>
Creating real-time estimation techniques for agents and environmental targets under unknown inputs, involving consensus algorithms, dynamic topologies, sensor fusion, and intermittent observations.
Description: We propose an efficient algorithm that achieves an unbiased and optimal solution comparable to filters with full information about other agents. This is accomplished through the use of information filter decomposition and the fusion of inputs via covariance intersection. Our algorithm also preserves agents' privacy by avoiding the sharing of explicit observations and system equations.
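For readers unfamiliar with covariance intersection, the sketch below fuses two estimates of the same 2-D state whose cross-correlation is unknown. It shows only the generic fusion rule, not the proposed algorithm or its information filter decomposition; the function names, the restriction to 2x2 covariances, and the grid search over the weight `w` (chosen to minimize the trace of the fused covariance) are assumptions for illustration.

```python
def inv2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def covariance_intersection(x1, p1, x2, p2, steps=100):
    """Fuse (x1, p1) and (x2, p2) without knowing their cross-correlation.

    P^-1 = w * P1^-1 + (1 - w) * P2^-1
    x    = P * (w * P1^-1 x1 + (1 - w) * P2^-1 x2)
    The weight w in [0, 1] is picked by grid search to minimize trace(P).
    """
    i1, i2 = inv2(p1), inv2(p2)
    best = None
    for k in range(steps + 1):
        w = k / steps
        info = [[w * i1[r][c] + (1 - w) * i2[r][c] for c in range(2)]
                for r in range(2)]
        p = inv2(info)
        trace = p[0][0] + p[1][1]
        if best is None or trace < best[0]:
            best = (trace, w, p)
    _, w, p = best
    # Weighted information vector, then map back to state space.
    y = [w * sum(i1[r][c] * x1[c] for c in range(2))
         + (1 - w) * sum(i2[r][c] * x2[c] for c in range(2))
         for r in range(2)]
    x = [sum(p[r][c] * y[c] for c in range(2)) for r in range(2)]
    return x, p, w
```

Because only means and covariances (or their information-form equivalents) enter the fusion, agents can exchange these summaries rather than raw observations or system models.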
Description: This paper addresses input and state estimation for autonomous systems, unifying the continuous and discrete cases through the Expectation-Maximization (EM) algorithm. By incorporating event priors as constraints, it formulates inequality-constrained optimization problems to compute gain matrices or dynamic weights, achieving optimal input estimation with reduced variance and improved decision-making.
Developed a localization and navigation system for robots in Tencent Internet Data Centers (IDC) that guides robots to transfer shelves, navigate, and align with the server cabinet. The system fuses LiDAR, IMU, and wheel odometry in SLAM and uses QR codes to assist the final docking process.
Developed a loosely coupled framework that fuses IMU, SLAM, GNSS, and other sensors separately, tolerating single-sensor failure during operation and achieving seamless, continuous positioning. A two-layer error-state Kalman filter (ESKF), with the layers sharing the same nominal state, optimizes over SLAM, IMU, and GPS position and Doppler measurements.
Developed a coupled GNSS-IMU positioning framework on Android smartphones that uses the pseudorange double-difference (PDD) model to eliminate common errors and decouples the pseudorange and velocity measurements under the short-baseline hypothesis. Follow-up work realized joint heading estimation using Pedestrian Dead Reckoning (PDR) and GNSS Doppler.
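To make the double-difference idea concrete, the sketch below forms pseudorange double differences between a rover and a base receiver. Differencing between receivers cancels each satellite's clock error (and, over a short baseline, most of the shared atmospheric delay), and differencing against a reference satellite cancels the receiver clock biases. The function name, the dictionary layout keyed by satellite ID, and the choice of reference satellite are assumptions for illustration, not the framework's actual interface.

```python
def double_difference(rover, base, ref_sat):
    """Pseudorange double differences relative to a reference satellite.

    rover, base: dicts mapping satellite ID -> pseudorange in meters.
    Returns a dict mapping each non-reference satellite seen by both
    receivers to (rover - base)[sat] - (rover - base)[ref_sat].
    """
    # Single difference between receivers: removes satellite clock error.
    single = {sat: rover[sat] - base[sat] for sat in rover if sat in base}
    if ref_sat not in single:
        raise ValueError("reference satellite must be seen by both receivers")
    # Difference between satellites: removes receiver clock biases.
    return {sat: single[sat] - single[ref_sat]
            for sat in single if sat != ref_sat}
```

In practice the reference is usually the highest-elevation satellite, since its measurement noise enters every double difference.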