Weapon-Target Assignment (WTA) is a well-known optimization problem that assigns weapons to targets to maximize overall strike effectiveness. In defense systems, WTA is challenging because of its nonlinear problem setting and practical constraints such as engagement time windows. Under these constraints, decision-making must weigh the trade-off between urgent targets and high-threat targets, and the decision must be completed within the allowable execution time. These characteristics severely limit conventional approaches based on mixed-integer linear programming (MILP) and heuristics in defense scenarios. In contrast, AI-based methodologies are emerging as promising solutions.
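To make the nonlinearity and the time-window constraint concrete, the following is a minimal sketch of the textbook static WTA objective (expected surviving target value) with a simple greedy baseline. It is not the formulation or the method used in this work. The names `values`, `kill_prob`, and `feasible` are assumptions introduced for illustration; `feasible[w][t]` stands in for whether weapon `w` can engage target `t` inside its engagement time window.

```python
def expected_surviving_value(assignment, values, kill_prob):
    """Expected total value of targets that survive the assignment.

    assignment[w] is the target index for weapon w, or None (hold fire).
    The survival probability of a target is a product over the weapons
    assigned to it, which is what makes the objective nonlinear.
    """
    survive = [1.0] * len(values)
    for weapon, target in enumerate(assignment):
        if target is not None:
            survive[target] *= 1.0 - kill_prob[weapon][target]
    return sum(v * s for v, s in zip(values, survive))


def greedy_assign(values, kill_prob, feasible):
    """Greedy baseline: each weapon in turn takes the feasible target
    with the largest marginal reduction in expected surviving value.
    A weapon with no feasible target of positive gain holds fire."""
    survive = [1.0] * len(values)
    assignment = [None] * len(kill_prob)
    for w in range(len(kill_prob)):
        best_target, best_gain = None, 0.0
        for t in range(len(values)):
            if not feasible[w][t]:  # outside the time window (assumed mask)
                continue
            gain = values[t] * survive[t] * kill_prob[w][t]
            if gain > best_gain:
                best_target, best_gain = t, gain
        if best_target is not None:
            assignment[w] = best_target
            survive[best_target] *= 1.0 - kill_prob[w][best_target]
    return assignment


if __name__ == "__main__":
    values = [10.0, 4.0]                      # hypothetical threat values
    kill_prob = [[0.8, 0.6], [0.5, 0.9]]      # hypothetical kill probabilities
    feasible = [[True, True], [False, True]]  # weapon 1 cannot reach target 0
    plan = greedy_assign(values, kill_prob, feasible)
    print(plan, expected_surviving_value(plan, values, kill_prob))
```

A greedy pass like this is fast but myopic: it commits weapons one at a time and cannot hold a weapon back for a later, more valuable window, which is the kind of limitation that motivates learned policies.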
Integration with Engagement Simulator
In this example, WTA via reinforcement learning (RL) generates a more resource-efficient assignment plan than the distance-based heuristic, because its diverse assignments yield a "shoot-look-shoot" strategy.
WTA via Heuristic (Distance-based)
WTA via RL
Dynamic targeting (DT) has recently emerged as a critical paradigm for Earth-observing satellite operations, enabling better utilization of limited observational opportunities. DT for multi-satellite operations, such as formation-flying satellites, is an emerging concept that has not been fully explored. In this problem setting, each satellite chooses where to point its primary sensor based on information from its look-ahead sensor. When multiple satellites are considered, however, the agents must select targets that are both diverse and high-reward. For instance, if all agents focus on the same target within their observation tracks, the collected information becomes largely redundant. Learning cooperative strategies that enable satellites to select diverse yet high-value targets is a challenging problem, and multi-agent reinforcement learning (MARL) is a promising approach to solving it.
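The redundancy issue described above can be illustrated with a small sketch. It assumes, as a simplification, that a target observed by several satellites contributes its reward only once; the source says only that such observations are "largely redundant". The functions and the `visible`/`rewards` inputs are hypothetical, and neither selection rule is the MARL method itself. They contrast purely greedy selection with a simple coordinated rule.

```python
def team_reward(choices, rewards):
    """Sum rewards over distinct chosen targets.
    Assumption: a repeated observation of the same target adds nothing."""
    distinct = {c for c in choices if c is not None}
    return sum(rewards[t] for t in distinct)


def independent_greedy(visible, rewards):
    """Each satellite picks its own highest-reward visible target,
    ignoring what the other satellites choose."""
    return [max(v, key=lambda t: rewards[t]) if v else None for v in visible]


def sequential_greedy(visible, rewards):
    """Satellites pick in order; each takes the best visible target
    that no earlier satellite has already claimed."""
    taken, choices = set(), []
    for v in visible:
        candidates = [t for t in v if t not in taken]
        pick = max(candidates, key=lambda t: rewards[t]) if candidates else None
        if pick is not None:
            taken.add(pick)
        choices.append(pick)
    return choices


if __name__ == "__main__":
    rewards = {"A": 5.0, "B": 3.0, "C": 2.0}  # hypothetical target rewards
    # Targets each satellite's look-ahead sensor reports as observable.
    visible = [["A", "B"], ["A", "C"], ["A", "B", "C"]]
    for rule in (independent_greedy, sequential_greedy):
        picks = rule(visible, rewards)
        print(rule.__name__, picks, team_reward(picks, rewards))
```

In this toy case every independently greedy satellite chooses target A, so the team collects only one target's reward, while the coordinated rule spreads the satellites across all three targets. A fixed picking order is itself a crude heuristic, and learning when to defer to another agent is the part a cooperative policy would need to capture.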
Operation Examples
Satellite agents need to select targets that are both diverse and high-reward (trade-off between diverse and greedy selections)
Earth Observation via Dynamic Targeting Heuristic (nSat = 5)
Earth Observation via Multi-agent Reinforcement Learning (nSat = 5)