1. Genomic data analysis and statistical model construction
My work at Yale University focused on the analysis of genome-wide, exon-level transcriptome data from human brains. The dataset was generated from 1,340 tissue samples collected from one or both hemispheres of 57 postmortem human brains, spanning the embryonic stage to late adulthood and representing males and females of multiple ethnicities. Sixteen brain regions were investigated: the cerebellar cortex, mediodorsal nucleus of the thalamus, striatum, amygdala, hippocampus, and 11 areas of the neocortex. Genome-wide genotyping of 2.5 million SNPs and assessment of copy number variations were also performed for all donors. Using statistical analysis, we found that 90% of expressed genes were differentially regulated at the whole-transcript or exon level across regions and/or time. The majority of these spatiotemporal differences occurred before birth, followed by an increase in the similarity among regional transcriptomes during the postnatal lifespan. We identified sex differences in gene expression and exon usage. We also demonstrated how these results can be used to profile trajectories of genes associated with neurodevelopmental processes, cell types, neurotransmitter systems, autism, and schizophrenia, as well as to discover associations between SNPs and spatiotemporal gene expression. This study provides a comprehensive, publicly available dataset on the spatiotemporal human brain transcriptome and new insights into the transcriptional foundations of human neurodevelopment. This co-first-author paper has been accepted by Nature (Kang, Kawasawa, Cheng et al. and Sestan, Nature, 2011).
My dissertation study at the University of Virginia focused on microarray data analysis and the construction of statistical prediction models. The purpose was to identify reliable genetic biomarkers for predicting drug efficacy or toxicity and for disease diagnosis. There were three clinical foci:
(1) Lung cancer chemotherapeutic drug efficacy prediction. Our group at the University of Virginia has devised an algorithm called COXEN (www.coxen.org), which can predict the effectiveness of a chemotherapeutic drug for a specific cancer. I applied the COXEN method to predict lung cancer patients' responses to chemotherapy based on their microarray data. In addition, I applied this method to anti-cancer drug discovery. These projects, considered to be pioneering work in personalized therapy, were supported by two NIH R01 grants.
(2) Drug hepatotoxicity prediction. We combined gene microarray data with a traditional clinical biomarker, alanine aminotransferase (ALT), to identify genes associated with liver injury. We also constructed a multivariate statistical model based on these genes to predict the liver toxicity of chemical compounds. This work has been submitted to the Journal of Theoretical Biology. The prediction model has been evaluated and marketed by the University of Virginia Patent Foundation.
(3) Diabetes and atherosclerosis diagnosis. In studies of microarray data from patients, we identified genes whose expression correlates strongly with the levels of risk factors for these two diseases. Based on these disease-related genes, we developed a computational model. Its predictions may help assess a patient's risk of developing diabetes and atherosclerosis and can alert people to take preventive measures before organ dysfunction or damage occurs.
2. Computational medicinal chemistry
My research in medicinal chemistry focused on the discovery of novel therapeutic chemical compounds by using computational methods including molecular docking, 3D-QSAR, pharmacophore mapping, quantum chemistry and molecular dynamics.
At the Chinese Academy of Sciences (CAS), I focused on two important drug targets, matrix metalloproteinase (MMP) and PPARγ, which are relevant to the treatment of cancer and type II diabetes, respectively. I successfully applied quantum chemistry methods to calculate the binding affinities of MMP and its inhibitors (Cheng & Jiang et al. 2002 J. Phys. Chem. B). This led to the identification of four new potent inhibitors with binding activity at nanomolar concentrations. For PPARγ, I performed large-scale virtual screening of more than 2.5 million compounds using a parallel molecular docking program on a 396-CPU supercomputer. Based on the results, we purchased 150 compounds. Bioassays showed that more than 70 of the compounds were potent PPARγ agonists, and that three of them were more active than the marketed drug troglitazone.
Moreover, I investigated drug mechanisms by simulating the interactions between a drug and its receptor. At CAS, I studied the mechanism of artemisinin, a well-known anti-malarial drug used in traditional Chinese medicine, by examining the interactions between this drug and its receptor, hemin (Cheng & Jiang et al. 2002 Bioorg. Med. Chem.). Additionally, I simulated the conformational change of gelsolin using steered molecular dynamics and gained new insights into the activation mechanism of this important cell apoptosis protein (Cheng & Jiang et al. 2002 Biophys. J.).
During my postdoctoral training at the University of Illinois at Urbana-Champaign, I investigated the docking of a variety of inhibitors and substrates to isoprene biosynthesis pathway enzymes such as FPPS, IPPI and DXR. The results showed for the first time that the geometries of a broad variety of phosphorus-containing inhibitors can be well predicted by computational methods. This finding may facilitate the design of novel inhibitors of these enzymes (Cheng & Oldfield, 2004 J. Med. Chem.).
At Rice University, I performed a molecular surface analysis and docking study to characterize the molecular interactions between human fatty acid synthase (hFAS) and its various ligands. hFAS is an attractive drug target for the treatment of obesity and cancer. Docking of palmitate, the main biological product of hFAS, into its binding pocket revealed the catalytic mechanism of this enzyme. Docking of two known hFAS inhibitors revealed the pharmacophore of these drugs. These results provided useful clues for structure-based drug design against this important target (Cheng & Ma et al. 2008 Proteins).
3. Structural biology
At the University of Illinois at Urbana-Champaign, I used NMR spectroscopy and X-ray crystallography in conjunction with quantum chemistry and other molecular modeling methods to identify the relationship between 13C NMR chemical shifts and the structures of histidine residues in proteins. We also investigated the nature of the hydrogen-bonding interactions of these residues using atoms-in-molecules (AIM) theory. These results opened up a new way to analyze the tautomeric states and hydrogen bond properties of histidine residues in proteins (Cheng & Oldfield, 2005 J. Am. Chem. Soc.).
At Rice University, I solved the crystal structure of hemagglutinin from influenza B/Hong Kong/8/73 (B/HK) virus at 2.8 Å resolution. This structure provides a framework for a detailed understanding of the antigenic variation of influenza B virus. Moreover, it reveals the molecular basis for the pH dependence and ionic-strength sensitivity of influenza B hemagglutinin. This information might be helpful for the design of novel anti-influenza drugs, especially for bird flu (Wang and Cheng et al., 2008 J. Virology).