Research

Our research goal is to provide scalable, interpretable, and data-eﬃcient learning and control approaches that enable agents in a cyber-physical system to reason over the learning processes, coordinate eﬀectively with other agents, acquire skills eﬃciently to complete complex tasks in the presence of competitors or adversaries, and swiftly adapt to novel tasks and environments. We further seek to advance the next generation of frontier foundation models—including large language models (LLMs), vision-language models (VLMs), vision-language-action models (VLAs), and multimodal large language models (MLLMs)—by endowing them with causal reasoning, logical inference, long-horizon planning, and principled decision-making capabilities. By integrating advances in machine learning, control theory, and embodied intelligence, we aim to develop trustworthy AI systems that can understand, reason about, and safely interact with complex cyber-physical environments.

Fundamental challenges: Despite the fact that artificial intelligence boosted with data-driven methods (e.g., deep neural networks) has surpassed human-level performance in various tasks, its application to cyber-physical systems still faces fundamental challenges.

Lack of interpretability: Few of the cyber-physical systems using data-driven methods can explain their behaviors and reason over the decision-making process in a way that humans can understand. In order to work seamlessly with humans, these systems need to communicate and explain their motivations, strategies and competence in performing various tasks to humans. Interpretability is especially needed for safety-critical tasks.
Intensive need for data: Most data-driven methods in cyber-physical systems are data-intensive and lack commonsense knowledge and reasoning that are natural to humans. For example, reinforcement learning tasks often require extensive exploration of the environment to achieve satisfactory performance. On the other hand, the available data from performing such tasks are often limited or costly to obtain.
Lack of logical and causal reasoning: Most current AI-based learning and control modules embedded in CPS are based on statistical correlation instead of causality for the decision-makingprocess.

Vision and research summary: I address these fundamental challenges by explaining, guiding and verifying cyber-physical systems, taking into account limited availability of simulated and real data, the expressivity of high-level knowledge representations and the uncertainties of the underlying model. To that end, my approaches weave together the theories and techniques in machine learning, formal methods and control theory.

Learning high-level knowledge from data: I develop computationally efficient methods to learn high-level knowledge representations from data. Such high-level knowledge representations need to be both understandable and informative to humans, and amenable to automated reasoning. For example, I have used temporal logic formulas, finite state machines, and causal diagrams to represent such high-level knowledge.
Guiding cyber-physical systems with high-level knowledge: High-level knowledge can provide contextual information for guiding the cyber-physical systems towards better task performance. I conceptualize and develop a framework that enables a reinforcement learning agent to reason over its exploration process and distill high-level knowledge for effectively guiding its future explorations. Such knowledge can also be transferred from an original well-studied task to a new task, if these two tasks share some logical similarities. The resulting performance shows that the high-level knowledge can improve the sampling efficiency of the learning agent by up to two orders of magnitude.
Learning and control of cyber-physical systems with high-level specifications: I provide safety and correctness guarantees for cyber-physical systems through formal verification and provably correct synthesis. While there has been research on verifying physical systems with neural networks in the decision and control loop, such verification has been limited to properties without a temporal evolution. I have developed verification and provably correct synthesis methods with high-level temporal logic specifications for non-linear hybrid systems, multi-agent systems with intermittent communication, etc. The developed methods have wide applications in robotic systems, power systems, smart buildings and biological systems.

Direction I: learning high-level knowledge representations from data

Learning Temporal Logic-Based Causal Diagram From Data:

Causal discovery in time-series datasets is critical for understanding complex systems, especially when the effectiveness of causal relationships depends on both the duration and magnitude of the cause. We introduce a novel framework for causal discovery based on Signal Temporal Logic (STL), enabling the extraction of interpretable causal diagrams (STL-CD) that explicitly capture these temporal dynamics.

Nasim Baharisangari, Yucheng Ruan, Chengcheng Zhao, Zhe Xu, A Logic-Based Approach to Causal Discovery: Signal Temporal Logic Perspective, Proc. International Joint Conferences on Artificial Intelligence (IJCAI), Montreal, Canada, August, 2025.

Temporal Logic Inference with A priori Information

Applied the temporal logic inference algorithm in the classification and identification of robot arm movements of Phantom Omni haptic devices. The motivation for this work is to enable robots to learn from human demonstrations and generate temporal logic specifications from the demonstration data. The inferred signal temporal logic (STL) formulae can classify different robot arm movements, predict sequential robot arm movements and identify different goals and obstacles during different time intervals. Some a prior information are included in the inference process, so that the STL formulae are composed of atomic predicates with a priori assigned meaning.

Zhe Xu, Calin Belta and Agung Julius, Temporal Logic Inference with Prior Information: An Application to Robot Arm Movements, IFAC Conference on Analysis and Design of Hybrid Systems (ADHS), October 2015.

Cyber-physical System Fault Detection and Privacy Preservation

Designed observation maps in the form of temporal logic formulae for fault detection and privacy preservation with temporal and spatial uncertainties in a provably correct fashion. Implemented the method on the simulation model of a smart building testbed for detecting the open window fault while preserving multiple privacy conditions of the room occupancy.

Zhe Xu, Sayan Saha and Agung Julius, Provably correct design of observations for fault detection with privacy preservation, IEEE Conference on Decision and Control (CDC), December 2017.

Temporal Logic Inference for Multi-agent Group Behavior Analysis

Defined a new type of Signal Temporal Logic specifically for multi-agent systems: census signal temporal logic (CensusSTL). The CensusSTL consists of an "inner logic" formula that characterizes a consistent, frequent and specific task and an "outer logic" formula that characterizes the pattern of the number of agents in certain subgroups whose behaviors satisfy the "inner logic" formula. Designed an inference method to discover subgroups of agents and their behavior patterns in the form of CensusSTL based on spatial–temporal data of all agents in the group.

Zhe Xu and Agung Julius, Census Signal Temporal Logic Inference for Multiagent Group Behavior Analysis, IEEE Transactions on Automation Science and Engineering, vol. 15(1), pp. 264-277, October 2016.

Model Discrimination for Biological Systems

Designed a method for discriminating among competing models for biological systems by learning temporal logic formulae from data obtained by simulating the models. Applied this method to find dynamic features of epidermal growth factor (EGF) - induced extracellular regulated kinase (ERK) activation that are strictly unique to positive vs. negative feedback model.

Zhe Xu, Marc Birtwistle, Calin Belta and Agung Julius, A Temporal Logic Inference Approach for Model Discrimination, IEEE Life Sciences Letters, vol. 2(3), pp. 19-22, December 2016.

Zhe Xu, Marc Birtwistle, Calin Belta and Agung Julius, Temporal logic inference for model discrimination: an application on the ERK pathway, in Proc. Foundations of Systems Biology in Engineering (FOSBE), Boston, MA, 2015.

Graph Temporal Logic Inference from Spatial-Temporal Data

Dened graph temporal logic which can express spatial-temporal properties on a graph. Developed a method to infer graph temporal logic formulas from data for classication and identication. Implemented the proposed approach to classify the graph patterns of tensile specimens built from selective laser sintering process with varying strengths, and to identify informative spatial-temporal patterns from experimental data of the selective laser sintering cooldown process and simulation data of a swarm of robots.

Zhe Xu, Alexander Nettekoven, A. Agung Julius and Ufuk Topcu, Graph Temporal Logic Inference for Classification and Identification, to appear in IEEE Conference on Decision and Control (CDC), December 2019.

Direction II: reinforcement learning and transfer learning utilizing high-level knowledge

The sampling efficiency and performance of reinforcement learning can be improved if some high-level knowledge (e.g., temporal logic formulas, finite-state machines) can be incorporated in the learning process.

Learning High-level Knowledge for Improving Reinforcement Learning

I am interested in joint inference of high-level knowledge and policies for reinforcement learning. I infer high-level knowledge (temporal logic formulas, finite-state machines, etc.) from a set of labeled trajectories collected from the early stage of reinforcement learning for a non-Markovian task, and utilizing the inferred high-level knowledge to improve the performance of reinforcement learning at later stages. We compare our JIRP (Joint Inference of Reward Machines and Policies) approach with three baselines: QAS (q-learning in augmented state space), HRL (hierarchical reinforcement learning), and DDQN (deep reinforcement learning with double qlearning). The experiments show that the proposed JIRP approach can improve the sampling efficiency for at least 20 times.

Zhe Xu, Ivan Gavran, Yousef Ahmad, Rupak Majumdar, Daniel Neider, Ufuk Topcu and Bo Wu, ``Joint Inference of Reward Machines and Policies for Reinforcement Learning", accepted to International Conference on Automated Planning and Scheduling (ICAPS), 2020.

Learning High-level Knowledge for Transfer Learning

The inferred temporal logic formulas can be also transferred from a source task to a target task if these tasks are logically similar. While the transfer of logical knowledge without a temporal context has been studied in reinforcement learning, i.e., learning for sequential decision-making, transfer between tasks temporal context has attracted significantly less attention. I concretized similarity between temporal tasks through a notion of “logical transferability”, and developed a transfer learning approach between different yet similar temporal tasks. The experimental results show, depending on how similar the source task and the target task are, that the sampling efficiency for the target task can be improved by up to one order of magnitude by performing reinforcement learning in the extended state space, and further improved by up to another order of magnitude using the transferred extended reward and policy information.

Zhe Xu and Ufuk Topcu, Transfer of Temporal Logic Formulas in Reinforcement Learning, International Joint Conferences on Artificial Intelligence (IJCAI), August 2019.

Direction III: system verification with respect to temporal logic specifications

Robust Testing (Verification) for Power System Cascading Failure Mitigation

Complex systems such as power systems can be very vulnerable to disturbances or intended attacks. How can we make sure that these systems are stable or behave as they are designed to behave? If the desired behavior is specified in the form of temporal logic formulae such as ``line currents should never exceed the threshold value for more than 0.1 seconds'', then I can perform robust testing of temporal logic specifications for the nonlinear hybrid system model of power systems. I first proposed the algorithm of computing the bounded disturbance local discrepancy function for a general nonlinear system and then extended the method to hybrid systems. I performed robust testing on a three-machine power system model for the Italian blackout in 2003 and the IEEE 39-bus benchmark system.

Zhe Xu, Agung Julius and Joe H. Chow, Robust Testing of Cascading Failure Mitigations Based on Power Dispatch and Quick-Start Storage, IEEE Systems Journal, April 2017.

Direction IV: controller synthesis with respect to temporal logic specifications

Energy Storage Controller Synthesis for Power Systems with Temporal Logic Specifications

Designed a controller synthesis framework to control the energy storage systems with metric temporal logic (MTL) specifications. Designed both the feedforward and the feedback controllers for the nonlinear DAE (differential algebraic equations) model of power systems. Applied the controller synthesis methods to regulating grid frequencies with energy storage systems, so that the system trajectories satisfy the MTL specifications about the grid frequency deviations and the wind turbine generator rotor speed deviations.

Zhe Xu, Agung Julius and Joe H. Chow, Energy Storage Controller Synthesis for Power Systems With Temporal Logic Specifications, IEEE Systems Journal, October 2017.

Coordinated Control of Wind Turbine Generator and Energy Storage System for Frequency Regulation

Proposed the stochastic control bisimulation function, which bounds the divergence of the trajectories of the stochastic control system and the diffusionless deterministic control system in a probabilistic fashion. Dsigned a feedforward controller by solving an optimization problem for the nominal trajectory of the deterministic control system with robustness against initial state variations and stochastic uncertainties. Then a feedback control law is learned from the state and input data of the feedforward controller. Applied the proposed approach in controlling a wind farm and an energy storage system for frequency regulation with provable probabilistic safety guarantees in the stochastic environment of wind power generation.

Zhe Xu, Agung Julius and Joe H. Chow, Coordinated Control of Wind Turbine Generator and Energy Storage System for Frequency Regulation under Temporal Logic Specifications, IEEE American Control Conference (ACC), June 2018.

Advisory Temporal Logic Inference and Controller Design in Human-Robot Collaboration

Designed a method to learn (infer) and refine a set of advices from the trajectories generated in the successful and failed attempts in a task or game, in the form of advisory signal temporal logic (STL) formulae. Each advice consists of an advisory motion STL formula that characterizes the spatial temporal pattern of the motion as a feature of success and an advisory selection STL formula as a criterion for the environment to select the advice. Designed an advisory controller that can drive the robots to satisfy an advisory motion STL formula when the past environment trajectory satisfies the corresponding advisory selection STL formula. The advisory controller can advise or guide the human operators or the robots for better performance with the shared autonomy between the human operator and the controller.

Zhe Xu, Sayan Saha, Botao Hu, Sandipan Mishra and Agung Julius, Advisory Temporal Logic Inference and Controller Design in Human-Robot Collaboration, IEEE Transactions on Automation Science and Engineering, June, 2018.

Synthesize Controller with Temporal Logic Specifications with Intermittent Communication

Developed a controller synthesis approach for a multi-agent system with intermittent communication and metric temporal logic specications. Adopted a leader-follower scheme, where a mobile leader with absolute position sensors switches among a set of followers without absolute position sensors to provide each follower with intermittent state information. The followers are to asymptotically reach a predetermined consensus state. The approach iteratively computes the optimal control inputs such that the leader satisfies the MTL specifications, while guaranteeing stability and consensus of the followers.

Zhe Xu, Federico Zegers, Bo Wu, Warren Dixon and Ufuk Topcu, Controller Synthesis For Multi-Agent Systems with Intermittent Communication: A Metric Temporal Logic Approach, to appear in 57th Annual Allerton Conference on Communication, Control, and Computing, 2019.

Differentially Private Controller Synthesis with Temporal Logic Specifications

Proposed a dierentially private controller synthesis approach for multi-agent systems subject to high-level specications expressed in metric temporal logic. While guaranteeing dierential privacy of each agent, the controller is also synthesized to ensure a probabilistic guarantee for satisfying the metric temporal logic specication.

Zhe Xu, Kasra Yazdani, Matthew T. Hale and Ufuk Topcu, Differentially Private Controller Synthesis With Metric Temporal Logic Specifications, to appear in IEEE American Control Conference (ACC), 2020.

Google Sites

Report abuse