Software

R code for the MDSeq, a comprehensive solution set for the analysis of gene expression means and variability in large-scale RNA-seq studies. The MDSeq accounts for technical excess zeros, allows for efficient detection of outliers, and provides formal tests of differential gene expression beyond biologically interesting levels.

Di Ran and Z. John Daye (2017). Gene expression variability and the analysis of large-scale RNA-seq studies with the MDSeq. Nucleic Acids Research. 45 (13): e127. (Link)

AFNC (Download)

R code for adaptive false negative control (AFNC). It provides a means for informative inference of rare variant associations in large-scale studies at the single locus level by identifying a modest number of potentially causal variants while avoiding a deluge of noncausal ones. (This package requires a Fortran 90 compiler.)

X. Jessie Jeng, Z. John Daye, Wenbin Lu, and Jung-Ying Tzeng (2016). Rare Variants Association Analysis in Large-Scale Sequencing Studies at the Single Locus Level. PLoS Computational Biology. 12 (6): e1004993. (Link)

ridle (Download)

Fortran code with R-wrapper for the ridge-lasso hybrid estimator (ridle), a principled sparse regression method for the selection of optional variables while incorporating mandatory ones.

Jing Zhai, Chiu-Hsieh Hsu, and Z. John Daye (2017). Ridle for sparse regression with mandatory covariates with application to the genetic assessment of histologic grades of breast cancer. BMC Medical Research Methodology. 17 (1): 12. (Link)
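
The idea behind the ridle entry above, selecting optional variables while always keeping mandatory ones, can be illustrated with a small sketch. This is not the distributed Fortran/R code, and the exact penalty used by ridle may differ. The sketch assumes a least-squares loss with a ridge (squared) penalty on the mandatory coefficients and a lasso (absolute value) penalty on the optional ones, fitted by coordinate descent. The names ridle_sketch, soft_threshold, lam_lasso, and lam_ridge are hypothetical and chosen only for this illustration.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ridle_sketch(X, y, mandatory, lam_lasso, lam_ridge, n_iter=200):
    """Coordinate descent for an ASSUMED objective (not the authors' code):

        (1 / 2n) * ||y - X b||^2
        + lam_lasso * sum(|b_j| for optional j)
        + (lam_ridge / 2) * sum(b_j^2 for mandatory j)

    X is a list of rows, y a list of responses, and mandatory a set of
    column indices that are never dropped. No intercept is fitted, and
    every column is assumed to contain at least one nonzero value.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    col_scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that excludes column j.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n
            rho += col_scale[j] * beta[j]
            if j in mandatory:
                # Ridge update: shrinks the coefficient but never zeroes it.
                new_beta = rho / (col_scale[j] + lam_ridge)
            else:
                # Lasso update: sets small coefficients exactly to zero.
                new_beta = soft_threshold(rho, lam_lasso) / col_scale[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Simulated data: columns 0 and 1 are mandatory; only column 2 among the
# optional columns has a true nonzero effect.
rng = random.Random(0)
true_beta = [1.0, -0.5, 2.0, 0.0, 0.0, 0.0]
X = [[rng.gauss(0, 1) for _ in true_beta] for _ in range(300)]
y = [sum(x * b for x, b in zip(row, true_beta)) + rng.gauss(0, 0.1) for row in X]

beta_hat = ridle_sketch(X, y, mandatory={0, 1}, lam_lasso=0.2, lam_ridge=0.05)
print([round(b, 3) for b in beta_hat])
```

Under these assumptions, the mandatory coefficients are shrunk but always retained, while optional coefficients with weak signal are set exactly to zero.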

qMSAT (Download)

R code for the quality-weighted multivariate score association test (qMSAT). It allows integration of missing genotypes without the need for imputation and provides conjoined analysis of sequencing data having different qualities and read depths.

Z. John Daye, Hongzhe Li, and Zhi Wei (2012). A Powerful Test for Multiple Rare Variants Association Studies that Incorporates Sequencing Qualities. Nucleic Acids Research. 40 (8): e60. (Link, PDF)

HHR (Download)

Fortran code with R-wrapper for high-dimensional heteroscedastic regression (HHR), which performs estimation and variable selection under non-constant error variances. It allows the incorporation of heteroscedasticity arising from predictors explanatory of variability, outliers, and data from varying sources.

Z. John Daye, Jinbo Chen, and Hongzhe Li (2012). High-Dimensional Heteroscedastic Regression with Applications in eQTL Data Analysis. Biometrics. 68 (1): 316-326. (Link, PDF)

sVC (Download)

Fortran code with R-wrapper for the sparse structured shrinkage estimator for estimation and variable selection under nonparametric varying-coefficient (VC) models.

Z. John Daye, Jichen Xie, and Hongzhe Li (2012). A Sparse Structured Shrinkage Estimator for Nonparametric Varying-Coefficient Model with an Application in Genomics. Journal of Computational and Graphical Statistics. 21 (1): 110-133. (Link, PDF)