Manufacturing Data Science (MDS) aims to leverage statistical methods, machine learning, and optimization to develop predictive and decision-making models that enhance production efficiency, process yield, and equipment health.
Explainable Artificial Intelligence (XAI) research aims to uncover the decision logic of black-box models, enabling AI systems to be more transparent, interpretable, and trustworthy in high-risk and critical decision-making scenarios.
Our research covers a broad range of key topics in manufacturing, spanning multiple levels—including resource, machine, and process levels—and focuses on prediction and decision-making problems at each level. The main research areas include:
Production Scheduling and Decision Optimization (Resource Level):
Developing efficient, flexible, and robust production and resource management strategies under complex resource constraints and dynamic demand conditions.
Machine Fault Prediction and Health Management (Machine Level):
Leveraging long-term sensor data to build prognostic models, health indicators, and maintenance decision strategies to reduce unplanned downtime and maintenance costs.
Mechanical System Modeling, Parameter Estimation, and Monitoring (Machine Level):
Constructing physics-informed dynamic models for mechanical and electromechanical systems, together with key parameter estimation and online monitoring, to support degradation assessment, anomaly detection, and performance compensation.
Process Diagnostics, Root Cause Analysis, and Parameter Optimization (Process Level):
Applying data science methods to rapidly identify the root causes of process abnormalities and quality issues, and further tuning critical process parameters to improve process stability, production efficiency, and yield.
Quality Monitoring and Defect Prediction (Process Level):
Integrating statistical methods and deep learning techniques to monitor product quality and identify defects, ensuring consistent and reliable product quality.
From a methodological perspective, we employ a wide range of advanced data science and artificial intelligence techniques, including:
Statistical Machine Learning Methods:
Supervised, Unsupervised, Semi-Supervised, and Self-Supervised Learning;
Bayesian, Physics-Informed, and Causal Machine Learning;
Explainable Artificial Intelligence (XAI) and Uncertainty Quantification (UQ).
Decision-Making and Optimization Methods:
Reinforcement Learning, Metaheuristic Algorithms, Bayesian Optimization, and Mathematical Programming.
These research topics and methodologies have been successfully applied to real-world industrial settings involving highly complex, reliability- and safety-critical engineering systems, including semiconductor manufacturing and packaging, TFT-LCD manufacturing, servo motor control, chemical engineering processes, and renewable energy systems.
Our goal is to assist enterprises operating in highly dynamic and complex manufacturing environments in developing reliable, deployable, and decision-supportive predictive and decision-making systems, thereby enhancing production yield, process stability, and overall operational performance.
Microgrids integrate diverse energy sources, including solar power, wind energy, battery energy storage systems, and diesel generators, and must perform real-time and effective energy dispatch under significant uncertainty. By employing load and renewable energy forecasting models, such as time-series analysis and deep learning techniques, and combining them with optimization or reinforcement learning–based decision strategies, microgrids can determine optimal charging and discharging schedules for energy storage, on/off decisions for diesel generators, and overall energy allocation policies. These data-driven and decision-oriented approaches enable microgrids to reduce fuel consumption and carbon emissions, increase energy self-sufficiency, and maintain stable and reliable operation under highly volatile and uncertain operating conditions.
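As a minimal, hypothetical sketch of the decision layer (a greedy rule standing in for the optimization- or reinforcement learning–based strategies described above), battery charging/discharging and diesel backup can be scheduled from hourly load and solar forecasts:

```python
# Hypothetical greedy dispatch sketch: given forecasted hourly load and
# solar output, charge the battery on surplus, discharge on deficit, and
# let the diesel generator cover any remaining demand.

def dispatch(load, solar, capacity=10.0, soc=5.0):
    """Return per-hour (battery_power, diesel_power); positive battery
    power means discharge. All quantities are kWh over one-hour steps."""
    plan = []
    for l, s in zip(load, solar):
        net = l - s                      # residual demand after solar
        if net <= 0:                     # surplus: charge the battery
            charge = min(-net, capacity - soc)
            soc += charge
            plan.append((-charge, 0.0))
        else:                            # deficit: battery first, then diesel
            discharge = min(net, soc)
            soc -= discharge
            plan.append((discharge, net - discharge))
    return plan
```

A real controller would additionally respect ramp limits, battery efficiency, and look-ahead over the forecast horizon; this greedy rule only illustrates the structure of the dispatch decision.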
Representative Publication:
Y.-C. Hsu, Y.-H. Hung, and C.-Y. Lee, “Robust ensemble forecasting and deep reinforcement learning for energy management on islanded microgrids,” International Journal of Electrical Power and Energy Systems, vol. 173, p. 111405, 2025. [PDF]
In highly customized manufacturing environments with significant demand volatility, fully automated scheduling models often fail to meet the flexibility required on the shop floor. As a result, semi-automated production scheduling systems have emerged as a more practical and effective solution. By integrating optimization algorithms, heuristic rules, and human–machine collaborative interfaces, such systems can rapidly generate feasible schedules while supporting subsequent manual refinement.
Based on order characteristics, machine capabilities, and due-date constraints, the system automatically produces multiple candidate scheduling solutions and provides critical decision-support information, including bottleneck machine utilization, estimated completion times, and potential scheduling conflicts. This enables planners to adjust schedules according to their expertise and real-time operational conditions, such as rush orders, machine maintenance, or other unforeseen disruptions.
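The candidate-generation step can be illustrated with a toy single-machine example (hypothetical job data; dispatching rules stand in for the optimization algorithms above), where each rule yields one candidate schedule together with decision-support metrics:

```python
# Hypothetical sketch: generate candidate schedules with two dispatching
# rules and report decision-support info (makespan, number of late jobs).

def schedule(jobs, rule):
    """jobs: list of (name, processing_time, due_date) tuples."""
    key = {"SPT": lambda j: j[1],        # shortest processing time
           "EDD": lambda j: j[2]}[rule]  # earliest due date
    order, t, late = sorted(jobs, key=key), 0, 0
    for name, p, due in order:
        t += p
        late += t > due                  # count tardy jobs
    return [j[0] for j in order], t, late

jobs = [("A", 4, 5), ("B", 2, 3), ("C", 3, 9)]
candidates = {rule: schedule(jobs, rule) for rule in ("SPT", "EDD")}
```

A planner would then inspect the candidates (job order, makespan, tardy count) and refine the preferred one manually, mirroring the human–machine collaboration described above.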
Representative Publication:
C.-Y. Lee, C.-Y. Ho, Y.-H. Hung, and Y.-W. Deng, “Multi-objective genetic algorithm embedded with reinforcement learning for petrochemical melt-flow-index production scheduling,” Applied Soft Computing, vol. 159, p. 111630, 2024. [PDF]
Predictive maintenance aims to detect equipment abnormalities and anticipate potential failure times in advance by leveraging sensor data and behavioral models, thereby reducing unplanned downtime and maintenance costs. By integrating multi-source data—such as vibration signals, current waveforms, temperature measurements, and production line status—health indicators and degradation models of equipment can be constructed. These models are further combined with time-series analysis, anomaly detection, and deep learning techniques to predict equipment degradation trends.
When abnormal patterns or significant deterioration in health indicators are detected, the system can promptly alert on-site engineers. Moreover, decision-making methods, such as reinforcement learning, can be employed to recommend optimal maintenance timing and repair prioritization. These approaches effectively mitigate equipment failure risks, extend equipment lifespan, and enhance overall production line stability, making predictive maintenance a core enabling technology for building reliable production systems in smart manufacturing.
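A minimal sketch of the health-indicator idea (assuming windowed vibration samples; the RMS-plus-EWMA construction here is illustrative, not the models in the publication below):

```python
import math

def health_indicator(windows, alpha=0.3):
    """EWMA-smoothed RMS per sensor window as a simple health indicator."""
    hi, ewma = [], None
    for w in windows:
        rms = math.sqrt(sum(x * x for x in w) / len(w))
        ewma = rms if ewma is None else alpha * rms + (1 - alpha) * ewma
        hi.append(ewma)
    return hi

def alert(hi, threshold):
    """Indices of windows where the indicator exceeds a chosen threshold."""
    return [i for i, v in enumerate(hi) if v > threshold]
```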
Representative Publication:
Y.-H. Hung, H.-Y. Shen, and C.-Y. Lee, “Deep reinforcement learning-based preventive maintenance for repairable machines with deterioration in a flow line system,” Annals of Operations Research, pp. 1–21, 2024. [PDF]
Mechanical parameter estimation and monitoring aim to continuously track variations in critical mechanical parameters using sensor data and system identification techniques, enabling accurate assessment of machine conditions and early detection of abnormalities. By integrating multiple sensing modalities—such as vibration, torque, displacement, and electrical current signals—this approach employs parameter estimation methods, state-space modeling, and statistical machine learning techniques to dynamically estimate key mechanical parameters, including friction coefficients, stiffness, damping, backlash, and external loads.
Long-term monitoring of the temporal evolution of these parameters allows for effective identification of incipient wear and degradation phenomena. The estimated parameters further serve as essential inputs for controller retuning, performance optimization, and predictive maintenance decision-making. Overall, this methodology enhances operational stability, improves control performance, and significantly reduces the risk of unexpected equipment downtime.
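As a toy illustration of parameter estimation (a simple steady-state viscous-plus-Coulomb friction model, far simpler than the LuGre model in the publication below), the friction parameters can be recovered by least squares from velocity/torque pairs:

```python
# Hypothetical sketch: fit torque = b_visc * vel + tau_coulomb by ordinary
# least squares (steady state, positive velocities assumed).

def fit_friction(vel, torque):
    n = len(vel)
    mx, my = sum(vel) / n, sum(torque) / n
    b_visc = sum((x - mx) * (y - my) for x, y in zip(vel, torque)) / \
             sum((x - mx) ** 2 for x in vel)
    tau_coulomb = my - b_visc * mx
    return b_visc, tau_coulomb
```

Tracking these estimates over time, rather than computing them once, is what turns parameter estimation into a monitoring tool.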
Representative Publication:
Y.-H. Hung, C.-Y. Lee, C.-H. Tsai, and Y.-M. Lu, “Constrained particle swarm optimization for health maintenance in three-mass resonant servo control system with LuGre friction model,” Annals of Operations Research, vol. 311, pp. 131–150, 2022. [PDF]
Mechanical system modeling aims to describe the dynamic behavior of equipment and mechanical systems by integrating physics-based models with data-driven approaches, thereby supporting downstream tasks such as control, prediction, and health management. By combining sensor measurements, operating conditions, and motion characteristics, this approach formulates dynamic equations, friction models, or state-space representations of the system. These models are further enhanced using identification techniques—including parameter estimation, system identification, and deep learning methods—to capture nonlinearities and uncertainties inherent in real-world mechanical systems.
Through this modeling process, system trajectories can be reconstructed, performance degradation can be predicted, and sources of modeling errors can be systematically analyzed. The resulting models provide essential foundations for controller tuning, performance optimization, and predictive maintenance, and play a critical role in enabling digital twins. Mechanical system modeling therefore serves as a cornerstone of intelligent manufacturing, facilitating deeper understanding of equipment behavior and improving operational reliability.
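A minimal example of the physics-based side of such models: simulating a mass–spring–damper system from its dynamic equation with explicit Euler integration (values and step size are illustrative; real pipelines would use stiffer-stable integrators and identified parameters):

```python
# Hypothetical sketch: trajectory of m*x'' + c*x' + k*x = force,
# integrated with the explicit Euler method.

def simulate(m, c, k, x0, v0, dt, steps, force=0.0):
    x, v, traj = x0, v0, [x0]
    for _ in range(steps):
        a = (force - c * v - k * x) / m   # Newton's second law
        v += a * dt
        x += v * dt
        traj.append(x)
    return traj
```

Data-driven components would then correct the residual between such a simulated trajectory and measured behavior, which is the role of the domain-adaptive physics-informed networks in the working paper below.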
Representative Publication:
R.-Q. Hong, Y.-H. Hung, and C.-Y. Lee, “Correcting model misspecification by domain-adaptive physics-informed neural networks with interpretable auto-differentiation-based correction,” working paper.
Process parameter optimization aims to identify optimal operating conditions that simultaneously improve yield, efficiency, and process stability by leveraging statistical and machine learning methods. By integrating historical process records, sensor data, and quality measurement results, process behavior models can be constructed to analyze the influence of key parameters on product quality and system performance. These models are then combined with optimization techniques or design of experiments (DoE) to identify efficient and feasible parameter configurations.
This data-driven optimization approach significantly reduces trial-and-error costs, shortens process tuning cycles, and enhances process consistency and product quality. As such, process parameter optimization plays a crucial role in enabling high-yield and stable production in intelligent manufacturing systems.
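The search step can be sketched as a grid search over a fitted process model (the quadratic surrogate below is entirely made up for illustration; in practice it would be learned from historical records and DoE runs):

```python
# Hypothetical sketch: evaluate a (made-up) yield surrogate over a grid of
# temperature/pressure settings and pick the best predicted configuration.

def surrogate_yield(temp, pressure):
    # Stand-in for a learned process behavior model.
    return 95 - 0.02 * (temp - 210) ** 2 - 0.5 * (pressure - 3) ** 2

def grid_search(temps, pressures):
    best = max((surrogate_yield(t, p), t, p)
               for t in temps for p in pressures)
    return best  # (predicted_yield, temp, pressure)
```

Bayesian optimization or formal DoE would replace the exhaustive grid when experiments are expensive, but the model-then-optimize structure is the same.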
Representative Publication:
B.-R. Chen, Y.-H. Hung, and C.-Y. Lee, “Causal inference of policy learning for manufacturing process parameters,” INFORMS Journal on Data Science, under review.
In-line process parameter prediction and monitoring aims to continuously track critical operating conditions during production and provide early warnings before deviations occur. By integrating sensor signals, historical process records, and quality measurement data, adaptive predictive models can be developed to dynamically estimate key process parameters such as temperature, pressure, flow rate, and other operational variables in real time.
By combining time-series analysis, deep learning, and anomaly detection techniques, the system is able to identify abnormal trends, monitor process stability, and determine whether operating conditions are drifting away from their optimal ranges. This approach enables timely adjustments of process settings, helping to prevent quality degradation and equipment risks while improving production efficiency and process consistency. As a result, in-line predictive monitoring serves as a foundational capability for real-time quality assurance in intelligent manufacturing systems.
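The drift-detection idea can be illustrated with a classical EWMA control chart (a textbook monitoring scheme, used here only as a stand-in for the adaptive predictive models above):

```python
# Hypothetical sketch: flag time indices where the EWMA statistic of a
# process variable leaves its steady-state control limits.

def ewma_monitor(series, target, sigma, lam=0.2, L=3.0):
    z, flags = target, []
    # steady-state control limit half-width for the EWMA chart
    limit = L * sigma * (lam / (2 - lam)) ** 0.5
    for i, x in enumerate(series):
        z = lam * x + (1 - lam) * z
        if abs(z - target) > limit:
            flags.append(i)
    return flags
```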
Representative Publication:
C.-Y. Lee, C.-S. Wu, and Y.-H. Hung, “In-line predictive monitoring framework,” IEEE Transactions on Automation Science and Engineering, vol. 18, no. 4, pp. 1669–1678, 2020. [PDF]
Defect detection focuses on the automatic identification of product or process defects using imaging, sensor data, and machine learning techniques, with the goals of improving quality consistency and reducing reliance on manual inspection. By integrating computer vision, deep learning models, and statistical feature analysis, defect detection systems are capable of identifying surface scratches, voids, cracks, foreign objects, and process-induced anomalies.
These systems operate in-line and in real time, enabling rapid localization of suspicious regions, accurate defect classification, and seamless linkage to process parameters to support downstream root cause analysis. As a result, defect detection techniques significantly improve yield, shorten inspection cycles, and strengthen process quality control, forming a critical foundation for quality assurance in intelligent manufacturing systems.
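As a toy stand-in for a learned detector, defect-candidate extraction can be sketched as thresholding a grayscale image and counting 4-connected bright regions (the image and threshold below are hypothetical):

```python
# Hypothetical sketch: count 4-connected regions of pixels brighter than a
# threshold in a 2-D grayscale image, via flood fill.

def count_defects(image, threshold):
    rows, cols = len(image), len(image[0])
    seen, count = set(), 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] > threshold and (r, c) not in seen:
                count += 1
                stack = [(r, c)]          # flood-fill one region
                while stack:
                    i, j = stack.pop()
                    if not (0 <= i < rows and 0 <= j < cols):
                        continue
                    if (i, j) in seen or image[i][j] <= threshold:
                        continue
                    seen.add((i, j))
                    stack += [(i + 1, j), (i - 1, j),
                              (i, j + 1), (i, j - 1)]
    return count
```

Deep models replace the fixed threshold with learned features, but the downstream steps (localization, classification, linkage to process parameters) consume region-level outputs of exactly this kind.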
Representative Publication:
T.-T. Hsieh, C.-Y. Lee, Y.-H. Hung, P.-C. Shen, and T. Yang, “Unpaired image denoising and fusion with adaptive multi-branch task UNet for semiconductor packaging defect recognition,” IEEE Transactions on Automation Science and Engineering, 2025.
Our research focuses on developing methods that enhance the interpretability and trustworthiness of AI models, encompassing the following core directions:
Local and Global Explanation Methods:
We advance explanation techniques such as Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) by constructing stable local surrogate models and adopting diverse perturbation strategies (e.g., nonparametric estimation and sampling). These approaches reveal classification boundaries or regression characteristics and feature contributions across different regions of the input space, thereby improving the consistency, transparency, and reliability of model explanations.
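The core LIME mechanism can be sketched in one dimension (perturb around the instance, weight samples by proximity, fit a weighted linear surrogate whose slope is the local attribution; the kernel and scale below are illustrative choices):

```python
import math
import random

def local_explain(black_box, x0, n=200, scale=0.5, seed=0):
    """LIME-style 1-D sketch: the slope of a locally weighted least-squares
    line approximates the black-box model's local feature attribution."""
    rng = random.Random(seed)
    xs = [x0 + rng.gauss(0, scale) for _ in range(n)]
    ys = [black_box(x) for x in xs]
    # Gaussian proximity kernel centered at the instance being explained
    ws = [math.exp(-((x - x0) ** 2) / (2 * scale ** 2)) for x in xs]
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    num = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    den = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    return num / den
```

Our research directions above replace pieces of this pipeline: the linear surrogate (with nonlinear local models), the perturbation distribution (with in-distribution sampling), and the single-point output (with uncertainty-aware attributions).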
Integration of Uncertainty Quantification, Counterfactual Explanations, and Machine Unlearning:
We employ uncertainty quantification to identify high-risk samples and model vulnerabilities, and use counterfactual explanations to analyze the sensitivity of decision boundaries and feasible directions for adjustment. These risk signals are further utilized in machine unlearning to identify sources of bias or data points that should be removed, enabling models to maintain robustness and transparency in dynamic data environments.
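The counterfactual idea admits a very small sketch (a greedy 1-D search against a hypothetical classifier; real methods search in many dimensions under proximity and plausibility constraints):

```python
# Hypothetical sketch: nudge a feature value until the classifier's
# prediction flips to the target class; the returned value is the
# counterfactual, and (x_cf - x) is the feasible adjustment direction.

def counterfactual(predict, x, target=1, step=0.1, max_iter=200):
    for _ in range(max_iter):
        if predict(x) == target:
            return x
        x += step
    return None  # no counterfactual found within the search budget
```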
Explanations for Temporal, Spatial, and Sequential Decision-Making Systems:
For time-series data, spatially structured data, and dynamic decision-making systems such as reinforcement learning, we develop explanation methods that capture temporal dependencies, spatial interactions, and policy evolution. The emphasis is on interpreting behavioral changes across decision stages, sensitivity to critical states, and the interpretability of long-term strategies in high-dimensional environments, supporting transparent decision-making in complex systems.
Through the development of these methods, our goal is to build trustworthy AI systems suitable for high-risk and mission-critical application domains. These systems aim to achieve higher levels of safety, interpretability, and practical value, while enabling enterprises and organizations to adopt AI technologies more effectively in real-world settings.
Many existing Explainable Artificial Intelligence (XAI) methods assume that a model exhibits local linear decision boundaries or regression behavior. However, in real-world data, local nonlinearity is often pronounced, causing traditional explanation methods to produce substantial approximation errors when modeling local classification boundaries or regression functions. In addition, perturbation-based explanation techniques typically yield single-point feature contribution estimates, failing to reflect the inherent uncertainty in explanations.
To address these limitations, we propose BMB-LIME, an explanation method designed to capture nonlinear local decision boundaries or regression structures while explicitly quantifying uncertainty in feature contributions. BMB-LIME constructs local surrogate models using weighted Multivariate Adaptive Regression Splines (MARS) and integrates bootstrap aggregating with Bayesian uncertainty estimation. This design enables a more accurate and robust characterization of local decision behavior, providing both high-fidelity explanations and uncertainty-aware feature attributions.
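The bagging ingredient can be illustrated in isolation (a plain linear surrogate on bootstrap resamples, not the weighted MARS models or Bayesian estimation of BMB-LIME): refitting on resamples turns a single-point attribution into a distribution with a mean and a spread.

```python
import random
import statistics

def bootstrap_slope(xs, ys, n_boot=200, seed=0):
    """Bagging sketch for attribution uncertainty: refit a local linear
    surrogate on bootstrap resamples and summarize the slope distribution."""
    rng = random.Random(seed)

    def slope(px, py):
        mx, my = sum(px) / len(px), sum(py) / len(py)
        return sum((a - mx) * (b - my) for a, b in zip(px, py)) / \
               sum((a - mx) ** 2 for a in px)

    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(len(xs)) for _ in range(len(xs))]
        slopes.append(slope([xs[i] for i in idx], [ys[i] for i in idx]))
    return statistics.mean(slopes), statistics.stdev(slopes)
```

On noiseless data the spread collapses to zero; on noisy data it reports how much the attribution depends on the particular sample, which is exactly the uncertainty a single LIME fit hides.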
Representative Publication:
Y.-H. Hung and C.-Y. Lee, “BMB-LIME: LIME with modeling local nonlinearity and uncertainty in explainability,” Knowledge-Based Systems, vol. 294, p. 111732, 2024. [PDF]
As complex black-box machine learning models are increasingly deployed in high-risk domains such as healthcare and criminal justice, the reliability and stability of model explanations have become critically important. While Local Interpretable Model-Agnostic Explanations (LIME) is a widely adopted local explanation method, it often suffers from instability due to randomly generated out-of-distribution perturbations. These perturbations can lead to poor local fidelity, inconsistent explanations, and vulnerability to adversarial manipulation.
To address these limitations, we propose KDLIME, an improved local explanation method that extends LIME through K-nearest neighbor (KNN)–based kernel density estimation. Instead of generating perturbations arbitrarily, KDLIME samples perturbation points from the in-distribution neighborhood of the target instance. This distribution-aware perturbation strategy enables the local surrogate model to fit explanations in a more representative and less biased region of the data space.
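The distribution-aware sampling step can be sketched as follows (1-D for brevity; sampling near nearest neighbors is a simplified stand-in for KDLIME's KNN-kernel density estimation):

```python
import random

def knn_perturb(data, x0, k=3, n=5, scale=0.1, seed=0):
    """KDLIME-style sketch: draw perturbations near the k nearest
    neighbors of x0 rather than at arbitrary points, so that the local
    surrogate is fit on in-distribution samples."""
    rng = random.Random(seed)
    neighbors = sorted(data, key=lambda x: abs(x - x0))[:k]
    return [rng.choice(neighbors) + rng.gauss(0, scale) for _ in range(n)]
```

With training data concentrated near the instance, every perturbation lands in a populated region of the data space, avoiding the out-of-distribution queries that destabilize vanilla LIME.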
Representative Publication:
Y.-H. Hung and C.-Y. Lee, “KDLIME: KNN-kernel density-based perturbation for local interpretability,” in Proceedings of the Workshops of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2024): XKDD 2024, Vilnius, Lithuania, Sep. 9–13, 2024.
Reinforcement Learning (RL) is an artificial intelligence paradigm that learns optimal decision-making policies through interactions with an environment. Our research focuses on designing decision models that can autonomously learn effective strategies in uncertain and high-dimensional environments, with core objectives centered on improving sample efficiency, collaboration, interpretability, and trustworthiness. Our work spans the following research directions:
Multi-Agent Reinforcement Learning (MARL) and Communication Mechanisms
We investigate how multiple agents collaborate, compete, and coordinate in shared or partially observable environments. In particular, we study learnable and adaptive communication strategies that enable agents to exchange critical information, establish coordinated behaviors, and improve overall system performance. Representative application domains include energy dispatch, traffic control, and multi-machine coordination in manufacturing systems.
Representative References:
- J. N. Foerster, Y. M. Assael, N. de Freitas, and S. Whiteson, “Learning to communicate with deep multi-agent reinforcement learning,” in Proceedings of the 30th International Conference on Neural Information Processing Systems (NeurIPS 2016), Barcelona, Spain, 2016, pp. 2145–2153. [PDF]
- C. Zhu, M. Dastani, and S. Wang, “A survey of multi-agent deep reinforcement learning with communication,” Autonomous Agents and Multi-Agent Systems, vol. 38, no. 1, Art. no. 4, 2024. [PDF]
Offline-to-Online Reinforcement Learning
We develop RL methods that initialize policies from offline datasets under limited interaction budgets, and subsequently refine them through safe and efficient online learning. Key challenges addressed include distribution shift, policy collapse prevention, and ensuring stability and convergence during real-world deployment. This direction is particularly relevant to manufacturing systems, energy systems, and other safety-critical dynamic environments.
Representative References:
- X. Wen, X. Yu, R. Yang, H. Chen, C. Bai, and Z. Wang, “Towards robust offline-to-online reinforcement learning via uncertainty and smoothness,” Journal of Artificial Intelligence Research, vol. 81, pp. 481–509, 2024. [PDF]
- S. Guo, L. Zou, H. Chen, B. Qu, H. Chi, P. S. Yu, and Y. Chang, “Sample efficient offline-to-online reinforcement learning,” IEEE Transactions on Knowledge and Data Engineering, vol. 36, no. 3, pp. 1299–1310, 2024. [PDF]
Explainable Reinforcement Learning (XRL)
We study methods to reveal the decision logic underlying learned policies, including behavior attribution, value function analysis, counterfactual explanations, and policy sensitivity analysis. The goal is to ensure that RL models are not only performant but also interpretable, enabling transparent decision-making, effective human–AI collaboration, and improved safety and controllability in critical applications.
Representative Reference:
- S. Milani, N. Topin, M. Veloso, and F. Fang, “Explainable reinforcement learning: A survey and comparative review,” ACM Computing Surveys, vol. 56, no. 7, Art. no. 168, pp. 1–36, 2024. [PDF]
Trustworthy Reinforcement Learning
This research direction focuses on the reliability and deployability of reinforcement learning models in real-world decision-making scenarios. It encompasses three core aspects:
1. Robustness against perturbations and uncertainties,
2. Safety-aware reinforcement learning with explicit constraints and risk considerations, and
3. Generalization to both in-domain and out-of-domain environments.
The objective is to ensure that learned policies exhibit stable, predictable, and verifiable behaviors even in uncertain, dynamic, and high-risk environments.
Representative Reference:
- M. Xu, Z. Liu, P. Huang, W. Ding, Z. Cen, B. Li, and D. Zhao, “Trustworthy reinforcement learning against intrinsic vulnerabilities: Robustness, safety, and generalizability,” arXiv preprint arXiv:2209.08025, 2022. [PDF]
Overall, these methodological advances aim to establish reinforcement learning models that are more robust, sample-efficient, interpretable, and trustworthy, thereby advancing the theoretical foundations and methodological depth of autonomous decision-making systems.
Uncertainty Quantification (UQ) aims to estimate the sources of uncertainty inherent in model predictions and to assess the reliability and risk of models under different data conditions. Our research focuses on developing reliable uncertainty estimation methods for data-scarce settings, distributional shifts, and high-risk decision-making environments, with the goal of improving model robustness, interpretability, and decision safety. Our work covers the following research directions:
Uncertainty Disentanglement
We investigate how to distinguish between epistemic uncertainty, arising from model structure, parameter uncertainty, or out-of-distribution (OOD) inputs, and aleatoric uncertainty, originating from inherent noise and randomness in the data. By disentangling these uncertainty sources, we aim to more accurately identify and assess the origins of model risk.
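A standard ensemble-based decomposition makes this concrete (a common textbook construction, not our specific method): given each ensemble member's predictive mean and variance for a test point, the total predictive variance splits into an aleatoric and an epistemic part.

```python
import statistics

def disentangle(member_means, member_vars):
    """Ensemble decomposition: total predictive variance =
    mean of member variances   (aleatoric, irreducible data noise)
    + variance of member means (epistemic, model disagreement)."""
    aleatoric = statistics.mean(member_vars)
    mu = statistics.mean(member_means)
    epistemic = statistics.mean((m - mu) ** 2 for m in member_means)
    return aleatoric, epistemic
```

Large epistemic terms flag inputs where the members disagree (e.g., out-of-distribution samples), while large aleatoric terms flag inherently noisy regions where more data will not help.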
Representative References:
- J. Gawlikowski, C. R. N. Tassi, M. Ali, J. Lee, M. Humt, J. Feng, A. Kruspe, R. Triebel, P. Jung, R. Roscher, et al., “A survey of uncertainty in deep neural networks,” Artificial Intelligence Review, vol. 56, no. Suppl. 1, pp. 1513–1589, 2023. [PDF]
- B. Mucsányi, M. Kirchhof, and S. J. Oh, “Benchmarking uncertainty disentanglement: Specialized uncertainties for specialized tasks,” in Advances in Neural Information Processing Systems, vol. 37, Curran Associates, Inc., 2024, pp. 50972–51038. [PDF]
Uncertainty under Distribution Shift and High-Risk Sample Identification
We study the behavior of predictive uncertainty under distribution shift, including cross-domain and cross-process scenarios. Uncertainty indicators are leveraged to identify high-risk samples, failure regions, and model vulnerabilities, supporting continuous model monitoring, early drift detection, and adaptive model updates.
Representative References:
- Y. Ovadia, E. Fertig, J. Ren, Z. Nado, D. Sculley, S. Nowozin, J. Dillon, B. Lakshminarayanan, and J. Snoek, “Can you trust your model’s uncertainty? Evaluating predictive uncertainty under dataset shift,” in Advances in Neural Information Processing Systems, vol. 32, Curran Associates, Inc., 2019. [PDF]
- P. Lu, J. Lu, A. Liu, and G. Zhang, “Early concept drift detection via prediction uncertainty,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 18, pp. 19124–19132, Apr. 2025. [PDF]
Uncertainty-Aware Decision Making and Model Adaptation
We integrate uncertainty information into decision-making and model update mechanisms to support risk-aware decisions, sample selection, model correction, and data management. This enables learning systems to maintain robustness and trustworthiness in dynamic and safety-critical environments.
Representative References:
- O. Lockwood and M. Si, “A review of uncertainty for deep reinforcement learning,” in Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, vol. 18, no. 1, pp. 155–162, Oct. 2022. [PDF]
- X. Tang, K. Yang, H. Wang, J. Wu, Y. Qin, W. Yu, and D. Cao, “Prediction-uncertainty-aware decision-making for autonomous vehicles,” IEEE Transactions on Intelligent Vehicles, vol. 7, no. 4, pp. 849–862, 2022. [PDF]
Overall, these methodological advances aim to establish learning models capable of providing reliable confidence estimation and risk identification, forming a critical foundation for trustworthy artificial intelligence in high-risk and mission-critical decision-making scenarios.