This paper introduces the nested heteroscedastic Gaussian process (NHGP) approach to tackle simulation metamodeling with large-scale heteroscedastic datasets. NHGP achieves scalability by aggregating sub-stochastic kriging (SK) models built on disjoint subsets of a large-scale dataset, making it straightforward for existing SK users to adopt. We show that the NHGP predictor possesses desirable statistical properties: it is the best linear unbiased predictor among those formed by aggregating sub-SK models, and it is consistent. Numerical experiments demonstrate the competitive performance of NHGP.
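The divide-and-aggregate idea can be sketched in a few lines. The snippet below is an illustrative sketch, not the paper's method: it fits an SK (GP with known heteroscedastic noise variances) predictor on each disjoint subset and combines the sub-predictors by simple inverse-variance (precision) weighting; the paper's exact BLUP aggregation weights are not reproduced here, and the RBF kernel, lengthscale, and weighting scheme are all assumptions for illustration.

```python
import numpy as np

def rbf(a, b, ls=0.2):
    # Squared-exponential kernel on 1-D inputs (unit prior variance).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def sk_predict(x_star, X, y, noise_var, ls=0.2):
    """SK predictive mean/variance: GP regression with known
    input-dependent noise variances on the diagonal."""
    K = rbf(X, X, ls) + np.diag(noise_var)
    k = rbf(np.atleast_1d(x_star), X, ls)
    mean = k @ np.linalg.solve(K, y)
    var = 1.0 - np.einsum('ij,ji->i', k, np.linalg.solve(K, k.T))
    return mean, np.maximum(var, 1e-12)

def aggregate_sk(x_star, subsets):
    """Combine sub-SK predictors built on disjoint subsets via
    precision weighting (illustrative stand-in for BLUP weights)."""
    means, precs = [], []
    for X, y, nv in subsets:
        m, v = sk_predict(x_star, X, y, nv)
        means.append(m)
        precs.append(1.0 / v)
    total_prec = np.sum(precs, axis=0)
    return np.sum([m * p for m, p in zip(means, precs)], axis=0) / total_prec
```

Each sub-model only factorizes its own (small) covariance matrix, which is where the scalability comes from; a sub-model extrapolating far from its subset receives little weight because its predictive variance is large.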
We provide convergence analyses of the prediction error of stochastic kriging (SK) under two scenarios. First, we examine the case where the kernel smoothness is potentially misspecified and establish the convergence rate of the mean squared error. We show that the optimal rate is achieved when the kernel is smoother than the true function, whereas the worst-case rate is driven primarily by the order of the noise variance. Second, we analyze the high-dimensional setting and derive a high-probability bound on the prediction error in the L∞ norm. Our analysis shows that the worst-case rate induced by the noise variance can be improved through experimental design, and that in high-dimensional cases the impact of dimensionality on the rate can be reduced.
Jin Zhao and Xi Chen, "Nested Heteroscedastic Gaussian Process for Simulation Metamodeling," Proceedings of the 2024 Winter Simulation Conference, 419-430.
Jin Zhao and Xi Chen, "Convergence Analysis of Stochastic Kriging Predictor Under Possible Kernel Misspecification," Manuscript in preparation.
Accurate estimation of the underlying mean functions from data subject to heteroscedastic noise poses significant challenges in statistical modeling and surrogate modeling applications. Traditional Gaussian process (GP) methods addressing heteroscedasticity, such as stochastic kriging, often rely on stringent assumptions and restrict flexibility in hyperparameter tuning. Motivated by these limitations, this paper proposes a novel heteroscedastic precision-weighted kernel ridge regression (HPWR) framework, explicitly integrating input-dependent noise variances through precision-based weighting. By establishing theoretical equivalence between GP regression and kernel ridge regression (KRR), rigorous convergence rates of the HPWR estimator under various kernel eigenvalue decay conditions are derived. Optimal convergence rates, contingent on appropriate choices of the regularization parameter, are explicitly characterized. Extensive numerical experiments validate the theoretical findings, demonstrating improved predictive accuracy and robustness of HPWR compared to existing approaches. These results highlight the critical importance of adaptively selecting regularization parameters based on data-driven insights into noise heterogeneity and kernel properties.
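The precision-weighted estimator described above admits a short closed form. The sketch below is a minimal illustration under assumed choices (RBF kernel, 1-D inputs): weighting each squared residual by the precision 1/σᵢ² in the KRR objective yields coefficients α = (K + λ·diag(σ₁²,…,σₙ²))⁻¹ y, which follows from setting the gradient of the weighted objective to zero.

```python
import numpy as np

def rbf(a, b, ls=0.2):
    # Squared-exponential kernel on 1-D inputs.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def hpwr_fit_predict(X, y, noise_var, x_star, lam=1e-2, ls=0.2):
    """Precision-weighted KRR sketch: minimize
        sum_i (y_i - f(x_i))^2 / sigma_i^2 + lam * ||f||_H^2,
    whose minimizer has coefficients
        alpha = (K + lam * diag(sigma_1^2, ..., sigma_n^2))^{-1} y."""
    K = rbf(X, X, ls)
    alpha = np.linalg.solve(K + lam * np.diag(noise_var), y)
    return rbf(np.atleast_1d(x_star), X, ls) @ alpha
```

Note that setting lam = 1 recovers the GP posterior mean with heteroscedastic noise (i.e., the SK predictor), which is the GP–KRR equivalence the abstract refers to; the rate results concern how other choices of the regularization parameter trade off against the kernel eigenvalue decay.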
This work investigates the use of conditional kernel mean embeddings (CKME) to estimate conditional cumulative distribution functions, the use of proper scoring rules, such as the continuous ranked probability score (CRPS), for learning and hyperparameter tuning, and the properties of the resulting CKME-based conformal prediction intervals for random outputs from stochastic simulation models.
Jin Zhao and Xi Chen, "Heteroscedastic Weighted Kernel Ridge Regression for Simulation Metamodeling," Manuscript in preparation.
Jin Zhao and Xi Chen, "Conditional Kernel Mean Embedding-based Conditional CDF Estimation and Conformal Prediction," Manuscript in preparation.