Data Analysis

These tools help researchers answer questions such as "How can I fill in the missing values of my experimental data sets with statistical rigor?" or "How can I shrink a target population described by many features to only a few representatives (indices)?" Open-source code and the related papers are available at the links below.

Missing Data Curing

[Ultra Data-Oriented Parallel Fractional Hot Deck Imputation (UP-FHDI) ver 1.0]

Cures ultra data (concurrently big-n and big-p) through parallel imputation and variance estimation, based on the parallel FHDI (P-FHDI) algorithms.

Yicheng Yang, Yonghyun Kwon, Jae-Kwang Kim, and In Ho Cho, 2021. IEEE TKDE (under review).

[Parallel Fractional Hot Deck Imputation (P-FHDI) ver 1.0]

Performs high-performance computing (HPC)-based parallel imputation and variance estimation, based on the FHDI algorithms.

Yicheng Yang, Jae-Kwang Kim, and In Ho Cho, 2020. IEEE TKDE (Accepted; Open-Access available soon).

[R Package FHDI]

Cures multivariate missing data sets using either the fully efficient fractional hot deck imputation (FEFI) method or the fractional hot deck imputation (FHDI) method.

Based on the work of

Im, J., Cho, I., and Kim, J., 2018. FHDI: An R Package for Fractional Hot-Deck Imputation, The R Journal, Vol. 10(1), 140-154.
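To give a feel for what fractional hot deck imputation does, here is a minimal Python sketch. It is a heavily simplified toy, not the algorithm implemented in the FHDI R package or in P-FHDI/UP-FHDI: it handles a single variable, assumes the imputation cells are already given and fully observed, and assigns equal fractional weights to the donors. The function names (`fractional_hot_deck`, `weighted_mean`) and the parameters `m` and `seed` are hypothetical and chosen only for illustration. With `m=None` every donor in the cell is used, which loosely mirrors the "fully efficient" idea; with a finite `m` only a random subset of donors is used.

```python
import random
from collections import defaultdict


def fractional_hot_deck(cells, values, m=None, seed=0):
    """Toy fractional hot deck imputation for one variable.

    cells  : imputation-cell label per unit (assumed fully observed)
    values : observed value per unit, with None marking a missing value
    m      : donors per recipient; None means use every donor in the cell
    seed   : seed for the random donor selection (only used when m is set)

    Returns rows of (unit_index, value, fractional_weight). Observed units
    get one row with weight 1; each missing unit gets one row per donor,
    and its fractional weights sum to 1.
    """
    rng = random.Random(seed)

    # Collect the observed values (donors) available in each cell.
    donors = defaultdict(list)
    for cell, v in zip(cells, values):
        if v is not None:
            donors[cell].append(v)

    rows = []
    for i, (cell, v) in enumerate(zip(cells, values)):
        if v is not None:
            rows.append((i, v, 1.0))
            continue
        pool = donors[cell]
        if not pool:
            raise ValueError(f"no donors available in cell {cell!r}")
        chosen = pool if m is None else rng.sample(pool, min(m, len(pool)))
        weight = 1.0 / len(chosen)  # equal fractional weights (an assumption)
        rows.extend((i, d, weight) for d in chosen)
    return rows


def weighted_mean(rows):
    """Mean of the imputed data set, using the fractional weights."""
    total = sum(w for _, _, w in rows)
    return sum(v * w for _, v, w in rows) / total


if __name__ == "__main__":
    cells = ["a", "a", "a", "b", "b"]
    values = [1.0, 3.0, None, 10.0, None]
    rows = fractional_hot_deck(cells, values)
    for row in rows:
        print(row)
    print("weighted mean:", weighted_mean(rows))
```

Keeping several donors per missing unit, each with a fractional weight, is what distinguishes this from single imputation: the filled-in data set retains some of the within-cell variability instead of collapsing each missing value to one guess.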

Engineering Population Data Squashing

[Numerical Moment Matching Coupled with Generalized Genetic Algorithm]

Windows 64-bit program for GA-NMM

Based on the work of

In Ho Cho, Ikkyun Song, and Ya Lu Teng, 2018, Numerical Moment Matching Stabilized by a Genetic Algorithm for Engineering Data Squashing and Fast Uncertainty Quantification, Computers and Structures, 204, 31-47.

[Numerical Moment Matching v1]

MATLAB code for matching the moments of a real-world data column

Based on the work of

In Ho Cho and Keith Porter, 2016. Modeling Building Classes using Moment Matching, Earthquake Spectra, 32(1), 285-301. [doi:]
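The idea of squashing a data column into a few weighted representatives that share its moments can be illustrated with a small closed-form case. The Python sketch below is not the MATLAB code or the GA-NMM program listed above, which solve the matching problem numerically for more points and moments. It only shows the simplest instance: two weighted points that reproduce the mean, variance, and skewness of a data column. The function names `two_point_match` and `raw_moments` are hypothetical.

```python
import math


def raw_moments(xs, ws, orders=(1, 2, 3)):
    """Weighted raw moments sum(w * x**k) for each order k."""
    return [sum(w * x ** k for x, w in zip(xs, ws)) for k in orders]


def two_point_match(data):
    """Two weighted points matching the mean, variance, and skewness of data.

    Uses population (divide-by-n) moments. Returns (points, weights) with
    the lower point first; the weights sum to 1.
    """
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n
    if var == 0.0:
        raise ValueError("data has zero variance; one point is enough")
    sd = math.sqrt(var)
    skew = sum((x - mean) ** 3 for x in data) / n / sd ** 3

    # Weight of the lower point; always strictly between 0 and 1.
    p = 0.5 * (1.0 + skew / math.sqrt(4.0 + skew ** 2))
    low = mean - sd * math.sqrt((1.0 - p) / p)
    high = mean + sd * math.sqrt(p / (1.0 - p))
    return [low, high], [p, 1.0 - p]


if __name__ == "__main__":
    data = [1.0, 2.0, 2.0, 3.0, 10.0]
    points, weights = two_point_match(data)
    equal = [1.0 / len(data)] * len(data)
    print("points :", points)
    print("weights:", weights)
    print("data moments    :", raw_moments(data, equal))
    print("squashed moments:", raw_moments(points, weights))
```

Matching more moments, or several correlated columns at once, generally has no closed form. That is where a numerical solver, and a stabilizer such as a genetic algorithm, becomes necessary.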