My Lab

Statistical Genetics and Bioinformatics Lab

Statistical analysis of DNA methylation data

Statistical analysis of high-dimensional DNA methylation data from Illumina 450K or 850K arrays is very challenging because methylation values at CpG sites in the same gene or genetic region are highly correlated with each other, and the number of CpG sites is much greater than the sample size. I have applied regularization methods to identify CpG sites, and their corresponding genes and genetic regions, that are associated with various phenotype outcomes. I have also developed R packages for the analysis of high-dimensional DNA methylation data.
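As an illustration of why regularization helps when correlated features outnumber samples, here is a minimal sketch in Python. It is not the method implemented in the lab's R packages. The choice of an elastic net penalty, the function names, the simulated data, and the tuning values lam and alpha are all assumptions made for this example.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t, setting small values exactly to zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def elastic_net(X, y, lam=0.5, alpha=0.5, n_iter=200):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + lam * (alpha * ||b||_1 + (1 - alpha)/2 * ||b||_2^2).
    Hypothetical helper written for this example only."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of feature j with the partial residual (feature j added back).
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            if new != beta[j]:
                resid -= X[:, j] * (new - beta[j])
                beta[j] = new
    return beta

# Simulated stand-in for methylation data: 50 samples, 200 "CpG sites"
# arranged in 20 correlated blocks of 10 (an assumption, not real data).
rng = np.random.default_rng(0)
n, p, block = 50, 200, 10
latent = np.repeat(rng.standard_normal((n, p // block)), block, axis=1)
X = latent + 0.5 * rng.standard_normal((n, p))
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Only the first three sites in the first block affect the outcome.
y = X[:, :3].sum(axis=1) + rng.standard_normal(n)
y = y - y.mean()

beta = elastic_net(X, y, lam=0.5, alpha=0.5)
selected = np.flatnonzero(beta)
print("selected sites:", selected)
```

The L1 part of the penalty sets many coefficients exactly to zero, while the L2 part tends to keep correlated sites from the same block together, which is one reason penalties of this kind suit region-level methylation signals.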

Statistical analysis of genetic network data 

In biological network data, functional genes are linked with each other so that they form a network structure such as a gene regulatory pathway, a metabolic pathway, or a protein-protein interaction network. Since biological network information has accumulated over the years, I have developed a statistical method that incorporates genetic network information into genetic association studies. I have also developed a statistical method based on a Gaussian graphical model to estimate genetic links and to construct an entire genetic network.
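The idea behind a Gaussian graphical model can be sketched briefly: two genes are linked when their partial correlation, read off the inverse covariance (precision) matrix, is nonzero. The Python example below is only a toy version under stated assumptions: it simulates a five-gene chain network, inverts a lightly ridge-stabilized sample covariance, and applies an arbitrary cutoff of 0.2. High-dimensional analyses would normally replace the plain inverse with a penalized estimator, and nothing here reproduces the lab's own method.

```python
import numpy as np

def partial_correlations(X, ridge=1e-3):
    """Partial correlations from the inverse of the sample covariance.
    The small ridge term is an assumption added for numerical stability."""
    S = np.cov(X, rowvar=False)
    theta = np.linalg.inv(S + ridge * np.eye(S.shape[0]))
    d = np.sqrt(np.diag(theta))
    pcor = -theta / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# True network (assumed for the simulation): a chain 0-1-2-3-4,
# encoded as a tridiagonal precision matrix.
p = 5
theta_true = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))

rng = np.random.default_rng(0)
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(theta_true), size=2000)

pcor = partial_correlations(X)
# Declare an edge when the absolute partial correlation exceeds the cutoff.
edges = [(i, j) for i in range(p) for j in range(i + 1, p) if abs(pcor[i, j]) > 0.2]
print(edges)
```

Genes that are not adjacent in the chain are still marginally correlated, but their partial correlations are close to zero, which is why the precision matrix rather than the covariance matrix defines the network.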

Statistical analysis of gene expression data

In the analysis of microarray gene expression data, gene set analysis aims to identify gene sets whose genes are either differentially expressed or differentially co-expressed between case and control groups. I have applied a covariance thresholding method using regularization to gene set analysis, where I identified some genetic pathway groups that include many differentially co-expressed genes. I have also developed statistical methods for genetic association studies with high-dimensional gene expression data.
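To make the notion of thresholded co-expression concrete, the following Python sketch soft-thresholds the sample correlation matrix in each group and reports gene pairs whose thresholded correlation differs between cases and controls. The simulated data, the threshold of 0.3, and the function name are assumptions for illustration. The exact thresholding rule used in the lab's analyses is not described here, so this should be read as a generic example of the technique.

```python
import numpy as np

def soft_threshold_corr(X, t):
    """Soft-threshold the off-diagonal entries of the sample correlation matrix.
    Hypothetical helper; t is an illustrative tuning value."""
    R = np.corrcoef(X, rowvar=False)
    R_thr = np.sign(R) * np.maximum(np.abs(R) - t, 0.0)
    np.fill_diagonal(R_thr, 1.0)
    return R_thr

# Simulated gene set of 6 genes, 500 samples per group (not real data).
rng = np.random.default_rng(1)
n, p = 500, 6
controls = rng.standard_normal((n, p))
cases = rng.standard_normal((n, p))
# In cases only, genes 0-2 share a common factor and become co-expressed.
cases[:, :3] += rng.standard_normal((n, 1))

diff = soft_threshold_corr(cases, 0.3) - soft_threshold_corr(controls, 0.3)
# Pairs whose thresholded correlation differs between the two groups.
diff_pairs = [(i, j) for i in range(p) for j in range(i + 1, p) if diff[i, j] != 0.0]
print(diff_pairs)
```

Thresholding removes the small sampling-noise correlations in both groups, so only the pairs among genes 0, 1, and 2 remain as candidates for differential co-expression.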