
Gregory R. Warnes, Ph.D.

greg@warnes.net | 585-678-6661

Computational Biostatistician – Statistical Analyst – Statistical Methods Developer – Computational Scientist


Summary of Qualifications

Skilled biostatistical analyst, statistical methods developer, software developer, and expert in the R statistical analysis system with a broad scientific background, deep technical understanding of advanced biomedical technologies, and solid experience in pre-clinical pharmaceutical and academic research environments. A fascination with biomedical research problems results in collaborative relationships with scientists that extend beyond statistical support to focus on solving the underlying scientific research problems. Excels at communication across scientific, technical, business, and cultural communities, with a demonstrated record of improving efficiency and scientific impact by developing and improving tools and processes. Experienced in leadership, organizational development, and management of diverse technical and scientific teams.

Education

Ph.D. Biostatistics

University of Washington, Seattle, WA 

12/2000


Preceptor:

Adrian E. Raftery, Ph.D.

Algorithms, Diagnostics, & Software for Parallel MCMC

 

Biology Advisor:

Brian J. Reid, M.D. Ph.D.

Genetic & Epigenetic changes during progression from Barrett’s Esophagus to Esophageal Adenocarcinoma

M.Sc. Biostatistics

University of Washington, Seattle, WA

12/1997

B.Sc. Statistics

Brigham Young University, Provo, UT 

04/1995

A.Sc. Computer Science

Weber State University, Ogden, UT 

06/1992

Employment History

President

Gregory R. Warnes Consulting

02/2013 – Present

Provide modeling expertise, data analysis, and scientific software development for clients in diverse industries, including pharma/biotech, agriculture, and finance. Recent engagements include: Statistical support and guidance for a bioinformatics team in the research organization of a major pharmaceutical company; Development of software for statistically rigorous market prediction by non-quantitative marketing executives for a major pharmaceutical company; Analytic tool development for agricultural crop irrigation planning and management. 

Owner & Chief Scientist

W-Cubed Labs 

02/2013 – Present

Research and development centered on a new high performance multiplexing technique allowing dramatically higher data throughput for wireless, wired, and optical communications systems. Responsibilities include technology development and prototyping, organizational development, and marketing. 

Senior Expert Modeler

Modeling and Simulation, Novartis Institutes for Biomedical Research

07/2011 – 02/2013

Developed an interactive and statistically rigorous web-based tool for assigning clinical study patients and sites to countries, optimizing for time to completion and cost while accounting for country, therapeutic area, indication, study phase, and study length effects on costs, delays, and recruitment rates under resource and other constraints. Clinical trial simulation using PK, PD, patient distribution, and trial feature information for ophthalmology.

Founding Director

Center for Integrated Research Computing, University of Rochester

07/2008 – 07/2011

Founded and managed a comprehensive program supporting computing as a research tool at the University of Rochester. Managed funding, purchase, installation, maintenance, and support for high-performance computing technology. Managed IT and computational science staff. Established center policy and procedures. Developed training and outreach programs for university researchers. Obtained over $10 Million in contributions, gifts, and grant funding. Over a 2 year period, grew the user population from 0 to more than 400 researchers, representing more than $300M in research grants.

Associate Professor

Department of Biostatistics and Computational Biology,

University of Rochester

05/2006 – 07/2011

Co-directed the biocomputing core of the University of Rochester Center for Biodefense Immune Modeling (CBIM). Provided statistical support for immunological research on Influenza and HIV. Developed DEDiscover, a high-quality software application for modeling biological processes using systems of differential equations. Taught courses in applied Bayesian Modeling using Markov Chain Monte Carlo, and Statistical Computing. Performed statistical consulting, statistical research, and computational research in collaboration with faculty from Immunology, Medicine, Biochemistry and Biophysics, Biostatistics, and Computer Science.

Statistical Consultant

Gregory R. Warnes Statistical Consulting

01/2010 – 07/2011

Provided statistical, data analysis, and modeling expertise to clients in medical, legal, and financial organizations.

Owner & Chief Scientist

Random Technologies LLC

12/2006 - 12/2009

Established and operated a software business providing enterprise-class packaging, enhancements, services, and training for the R statistical software system. Performed statistical consulting and software development for scientific, financial, and marketing organizations.

Associate Director

Nonclinical Statistics, Biometrics and Reporting,

Pfizer Global Research and Development

11/2000 - 05/2006

Statistical analysis, methods development, and software implementation in support of projects utilizing "-omic" technologies for pre-clinical, safety, and basic science projects spanning disease areas (Atherosclerosis to Obesity), experimental species (Yeast to Human Subjects), sample sizes (4 to 40,000), and measurement technologies (e.g. small scale DNA genotyping, Affymetrix mRNA microarrays, 2-d gel proteomics, whole-genome SNP genotyping, and NMR metabonomics). Designed and implemented MIDAS, a web-based system for processing mRNA microarray data from machine data to analytic results, quadrupling the number of mRNA studies supported by each statistician from 10 to 40 per year while improving the reliability, reproducibility, and scientific impact of study results. Developed, published, and maintained more than 20 R, Python, and Zope extension packages.

Associate Research Scientist

Department of Computer Science, Yale University

01/2003 - 05/2006

Developed parallel and distributed algorithms, methods, and software for the analysis of genetic, genomic, proteomic, and metabonomic data, including development of statistical genetics packages for R. Developed, published, and maintained more than 20 R, Python, and Zope extension packages.

Summer Intern

Statistics and Data Mining Research, Bell Labs, Lucent Technologies

06/1999 - 09/1999

Under the direction of John M. Chambers, Ph.D., developed parallel modules for pseudo-random number generation, bootstrapping, and Markov Chain Monte Carlo (MCMC) for the OmegaHat next-generation statistical computing system.

Research Assistant

Division of Public Health Sciences,

Fred Hutchinson Cancer Research Center

12/1997 - 10/2000

Developed and validated algorithms and software for Markov Chain Monte Carlo on parallel and cluster computers. Contributed computational methods descriptions to grant applications, including a successful grant proposing to utilize a Beowulf parallel computing cluster. Constructed, maintained, and supported a 16-node Beowulf parallel computing cluster. Provided scientific programming and Markov Chain Monte Carlo (MCMC) expertise to other project members. Developed the Hydra MCMC library, the “MCGibbsit” MCMC diagnostic, and the mcgibbsit R package.

Computer Skills

Scientific and Mathematical Software

R, S-Plus, SAS, MATLAB, Mathematica, NumPy, PipelinePilot, etc.

Computer Languages

Scripting: Python, Perl, etc. - Compiled: Java, C/C++, Fortran, Pascal, etc.

Formatting/Markup Languages

LaTeX, HTML/HTTP, XML, etc.

Communications APIs

RMI: SOAP, CORBA, Java RMI - Parallel Computing: MPI, OpenMP, etc.

Operating Systems

Mac OS X, Linux/Unix, MS-Windows, etc.

Spoken Languages

Native English, Fluent French, Survival German

Selected Publications

Hulin Wu; Hongyu Miao; Warnes, G.R.; Canglin Wu; LeBlanc, A.; Dykes, C.; Demeter, L.M. "DEDiscover: A Computation and Simulation Tool for HIV Viral Fitness Research," BioMedical Engineering and Informatics, 2008. BMEI 2008. International Conference on , vol.1, no., pp.687-694, 27-30 May 2008

Kooner JS, Chambers JC, Aguilar-Salinas CA, et al. “Genome-wide scan identifies variation in MLXIPL associated with plasma triglycerides in man,” Nature Genetics, 40, 149 - 151 (2008).

Burrows RB, Warnes GR, Hanumara RC. “Statistical Modeling of Biochemical Pathways,” IET Systems Biology, IET Syst. Biol. 1, 353 (2007)

Mank-Seymour AR, Richmond JL, Wood LS, Reynolds JM, Fan Y, Warnes GR, Milos MP, Thompson JF. "Association of torsades de pointes with novel and known single nucleotide polymorphisms in long QT syndrome genes," American Heart Journal, Volume 152, Issue 6, Pages 1116-1122, August 2006

Warnes GR. “Sample Size Estimation for Microarray Experiments using the SSIZE package,” R News, Volume 6, Issue 5, pp. 64-68, December 2006.

Caba E, Dickinson DA, Warnes GR, Aubrecht J. “Differentiating mechanisms of toxicity using global gene expression analysis in Saccharomyces cerevisiae,” Special Issue on EEMS 2004, Mutation Research, Volume 575, Issues 1-2, Aug. 2005, Pages 34-46.

Selected Software

DEDiscover: High-quality software application that enables biologists and physicians to model biological processes using systems of ordinary and delay differential equations. - https://cbim.urmc.rochester.edu/software/dediscover

MiDAS: Web-based system based on the scientific workflow that allows scientists, lab staff, and statisticians to store, manage, process, and analyze mRNA microarray data using best-available statistical, graphical, and computational technologies based on R, Zope, and RStatServer

RStatServer: System for quickly developing and deploying web applications that integrate statistical computations and graphics using R. Includes Python, R & Zope packages RSessionDA, Rpy, RSOAP, SOAPpy, and fpconst.

R Packages: General tools: gregmisc, gtools, gdata, gmodels, gplots, namespace, SASxport, session - Statistical genetics & genomics: genetics, GeneticsBase, GeneticsDesign, GeneticsPed, GeneticsQC, fbat, ssize, qvalue - Parallel computing: fork, Rlsf - Audiology: SII - MCMC Diagnostics: mcgibbsit.


Publications

Hulin Wu; Hongyu Miao; Warnes, G.R.; Canglin Wu; LeBlanc, A.; Dykes, C.; Demeter, L.M. "DEDiscover: A Computation and Simulation Tool for HIV Viral Fitness Research," BioMedical Engineering and Informatics, 2008. BMEI 2008. International Conference on , vol.1, no., pp.687-694, 27-30 May 2008 http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4548758&isnumber=4548615

Kooner JS, Chambers JC, Aguilar-Salinas CA, et al. “Genome-wide scan identifies variation in MLXIPL associated with plasma triglycerides in man”, Nature Genetics, 40, 149 - 151 (2008). http://www.nature.com/ng/journal/v40/n2/full/ng.2007.61.html

Burrows RB, Warnes GR, Hanumara RC. “Statistical Modeling of Biochemical Pathways”, IET Systems Biology, IET Syst. Biol. 1, 353 (2007) http://www.sciencedirect.com/science/article/B6T2C-4C4C52R-1/2/b405d51f9c53f02c012b72296796927a

Mank-Seymour AR, Richmond JL, Wood LS, Reynolds JM, Fan Y, Warnes GR, Milos MP, Thompson JF. “Association of torsades de pointes with novel and known single nucleotide polymorphisms in long QT syndrome genes.” American Heart Journal, Volume 152, Issue 6, Pages 1116-1122, August 2006 http://www.mdconsult.com/das/article/body/214469302-2/jorg=journal&source=&sp=16680575&sid=0/N/561766/s0002870306007605.pdf?issn=0002-8703

Warnes GR. “Sample Size Estimation for Microarray Experiments using the SSIZE package”, R News, Volume 6, Issue 5, pp. 64-68, December 2006. http://cran.r-project.org/doc/Rnews/Rnews_2006-5.pdf

Warnes GR, Jain N. “Balloonplot: A graphical tool for displaying tabular data”, R News, Volume 6, Issue 2, May 2006. http://cran.r-project.org/doc/Rnews/Rnews_2006-2.pdf

Warnes GR, Liu P. “Sample Size Estimation for Microarray Experiments”, Technical report 06/06, Department of Biostatistics and Computational Biology, University of Rochester, 2006. (also submitted to Bioinformatics) http://www.urmc.rochester.edu/biostat/people/faculty/documents/0606_warnes_liu.pdf

Caba E, Dickinson DA, Warnes GR, Aubrecht J. “Differentiating mechanisms of toxicity using global gene expression analysis in Saccharomyces cerevisiae,” Special Issue on EEMS 2004, Mutation Research, Volume 575, Issues 1-2, Aug. 2005, Pages 34-46. http://www.sciencedirect.com/science/article/B6T2C-4G3CX6K-1/2/b86f0d9d41a37bd2a6e805c423b4f32f

Warnes GR. “RSOAP - Using “R” with Python,” PyZine, Volume 11, Issue 05, Apr. 2004. http://www.pyzine.com/Issue005/Section_Articles/article_RSOAP.html

Dickinson DA, Warnes GR, Quievryn G, Messer J, Zhitkovich A, Rubitski E, and Aubrecht J. “Differentiation of DNA-reactive and non-reactive genotoxic mechanisms using gene expression profile analysis”, Mutation Research, Volume 549, Issues 1-2, May 2004, Pages 29-41. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.78.4690&rep=rep1&type=pdf#page=35

Warnes GR. “The Genetics Package,” R News, Volume 3, Issue 1, Jun. 2003. http://cran.r-project.org/doc/Rnews/Rnews_2003-1.pdf

Warnes GR. “The Gregmisc Package: Something for Everyone” submitted to R News. http://www.warnes.net/Research/Publications/gregmisc.pdf

Warnes GR. “HYDRA: A Java library for Markov Chain Monte Carlo,” Journal of Statistical Software, Volume 7, Issue 4, Mar. 2002. http://www.jstatsoft.org/v07/i04/

Warnes GR. “The Normal Kernel Coupler: An adaptive Markov Chain Monte Carlo method for efficiently sampling from multi-modal distributions,” submitted to Journal of the American Statistical Association, currently in revision.

Yanez ND, Warnes GR, and Kronmal RA. “A Univariate Measurement Error Model for Longitudinal Change,” Communications in Statistics, Volume 30, Issue 2, 2001. http://dx.doi.org/10.1081/STA-100002031

Warnes GR. “The Normal Kernel Coupler: An adaptive Markov Chain Monte Carlo method for efficiently sampling from multi-modal distributions,” Technical Report no. 395, Department of Statistics, University of Washington, Apr. 2001. http://www.stat.washington.edu/research/reports/2001/tr395.pdf

Warnes GR. “HYDRA: A Java library for Markov Chain Monte Carlo,” Technical Report no. 394, Department of Statistics, University of Washington, Apr. 2001. http://www.stat.washington.edu/research/reports/2001/tr394.pdf

Warnes GR. “The Normal Kernel Coupler: An adaptive Markov Chain Monte Carlo method for efficiently sampling from multi-modal distributions,” Ph.D. thesis, Department of Biostatistics, University of Washington, Oct. 2000. http://www.warnes.net/Research/Publications/Warnes_Thesis.


Published Software

Warnes GR. “SII”: This R package calculates ANSI S3.5-1997 Speech Intelligibility Index (SII), a standard method for computing the intelligibility of speech from acoustical measurements of speech, noise, and hearing thresholds. http://cran.r-project.org/package=SII, 2010-

Warnes GR. “SASxport”: This R package provides functions for reading, listing the contents of, and writing SAS xport format files, fully supporting translation between R “factor” objects and SAS FORMAT labels. http://cran.r-project.org/package=SASxport, 2007-

Warnes GR, Miao H, Wu C, LeBlanc A, Wu H. “DEDiscover”, a cross-platform software tool for differential equation model simulation and estimation, designed with special attention to the features necessary for modeling the interaction between the human immune system and viruses. https://cbim.urmc.rochester.edu/software/dediscover, 2007-

Qiu W, Lazarus R, Warnes GR, Jain N. “GeneticsQC”, a package of classes and functions for the open-source statistical package R that checks the quality of a genetics data set (as a geneSet object), such as reporting the counts of missing genotypes for markers or for subjects, checking Mendelian errors, testing Hardy-Weinberg Equilibrium (HWE), etc. This package also provides functions to filter out low-quality markers, subjects, and/or families. http://r-genetics.org, 2007-

Warnes GR, Duffy D, Man M, Qiu W, Lazarus R. “GeneticsDesign”, a package for the open-source statistical package R that provides classes and functions for designing genetics studies, including power and sample-size calculations. http://bioconductor.org/packages/2.6/bioc/html/GeneticsDesign.html, 2007-

Qiu W, Lazarus R, Warnes GR, Jain N. “fbat”, a package for the open-source statistical software package R that implements a broad class of Family Based Association Tests for genetics data, with adjustments for population admixture using the code from the 'FBAT' software program. http://bioconductor.org/packages/2.6/bioc/html/fbat.html, 2006-

Warnes GR, Lazarus R, Chasalow SD, Montana G, O'Connel M, Cheng J, Jain N. “GeneticsBase”, a package for the open-source statistical package R that provides classes and functions for handling and analyzing large scale genetic data (up to 1e6 markers). http://bioconductor.org/packages/2.6/bioc/html/GeneticsBase.html, 2005

Warnes GR. “WardListing”, a tool for downloading LDS Ward Membership information from the official LDS web site, and translating it into formats appropriate for importing into address book software. http://www.warnes.net/Software/WardListing, 2005-

Smith CR and Warnes GR. “Rlsf”, a package of functions for the open-source statistical package R that provides functions for using R with the LSF cluster/grid queuing system. http://cran.r-project.org/package=Rlsf, 2005-

Warnes GR and Li F. “ssize”, a package of functions for the open-source statistical package R that provides functions for computing and displaying sample size information for gene expression arrays. http://bioconductor.org/packages/2.6/bioc/html/ssize.html, 2004-

Warnes GR. “gmodels”, a package of functions for the open-source statistical package R that provides various R programming tools for model fitting. http://cran.r-project.org/package=gmodels, 2005

Warnes GR. “gdata”, a package of functions for the open-source statistical package R that provides various R programming tools for data manipulation. http://cran.r-project.org/package=gdata, 2005-

Warnes GR. “gtools”, a package of functions for the open-source statistical package R that provides various general purpose programming tools. http://cran.r-project.org/package=gtools, 2005

Warnes GR. “gplots”, a package of functions for the open-source statistical package R that provides various R programming tools for plotting data. http://cran.r-project.org/package=gplots, 2005-

Moreira W, Warnes GR. “rpy”, a robust Python interface to the R Programming Language. http://rpy.sf.net, 2004-

Warnes GR. “fork”, a package of functions for the open-source statistical package R that provide simple wrappers around the Unix process management API calls: fork, wait, waitpid, kill, and _exit. This enables construction of R programs that utilize multiple concurrent processes. http://cran.r-project.org/package=fork, 2003-

Warnes GR. “fpconst”, a Python library providing constants and functions for creating and detecting the IEEE 754 floating point special values. http://research.warnes.net/projects/RStatServer/fpconst, 2003-

Warnes GR. “CSVFile”, objects for the open-source web application development system Zope which automatically detect and translate Microsoft Excel files into comma-delimited text files when uploaded. http://research.warnes.net/projects/RStatServer/csvfile/, 2003-

Warnes GR. “RSessionDA”, a Python module to allow Zope access to the features of the open-source statistical package/language R. http://research.warnes.net/projects/RStatServer/rsessionda/, 2003-

Warnes GR. “session”, a package of functions for the open-source statistical package R that permit the state of the R session to be saved and restored, as well as functions for capturing the result of evaluating strings containing R commands and capturing the output. http://cran.r-project.org/package=session, 2002-

Warnes GR. “RSOAP”, a server providing access to the features of the open-source statistical package R via the SOAP protocol. http://research.warnes.net/projects/RStatServer/rsoap/, 2002-

Ullman C, Matthews B, Warnes GR, Blunk C. “SOAPpy”, a SOAP implementation for Python. http://pywebsvcs.sourceforge.net/, 2002-2004

Warnes GR and Leisch F. “genetics”, a package for handling marker-based genetic data within the open-source statistical package R. The package includes functions to compute allele frequencies, use genetic markers in statistical models, estimate disequilibrium, and test for departure from Hardy-Weinberg equilibrium. http://cran.r-project.org/package=genetics, 2002-

Warnes GR et al. “gregmisc”, a package of useful utility functions for the open-source statistical package R. Most functions in the gregmisc library fall into five general areas: permutations and combinations, tools for linear models, plots, data manipulation, and fixed or extended versions of existing functions. http://cran.r-project.org/package=gregmisc, 2001-

Warnes GR. “DistLib”, a Java library containing classes for computing features of and generating random numbers from a variety of statistical distribution functions. http://statdistlib.sourceforge.net/, 2000-

Warnes GR. “mcgibbsit”, a package for the open-source statistical package R implementing the MCGIBBSIT diagnostic software for multiple (potentially interrelated) MCMC samplers. http://cran.r-project.org/package=mcgibbsit, 2000-

Warnes GR. “HYDRA”, an open-source, platform-neutral library for performing Markov Chain Monte Carlo. It implements the logic of standard MCMC samplers within a framework designed to be easy to use and to extend while allowing integration with other software tools. http://research.warnes.net/projects/mcmc/hydra, 2000-

Warnes GR. “ClusterNFS”, an NFS server that allows diskless NFS clients to share a common file system by providing for host, user, and group-specific files within the same directory structure via interpreted file name extensions. http://ClusterNFS.sourceforge.net, 1999-