Semi-automated genome annotation and an expanded epigenetic alphabet - Dr. Michael Hoffman
Post date: Jan 21, 2016 8:08:37 PM
Where: Ryerson University, George Vari Engineering and Computing Centre, Room LG04 (please check before the seminar)
When: 1-2pm, February 11, 2016
Contact: IEEE Signals and Computational Intelligence Joint Chapter. Dr. Lorenzo Livi (llivi@scs.ryerson.ca)
Speaker: Dr. Michael Hoffman (http://hoffmanlab.org/)
Title: Semi-automated genome annotation and an expanded epigenetic alphabet
Abstract: First, we will discuss Segway, an integrative method to identify patterns from multiple functional genomics experiments, discovering joint patterns across different assay types. We apply Segway to ENCODE ChIP-seq and DNase-seq data and identify patterns associated with transcription start sites, gene ends, enhancers, CTCF elements, and repressed regions. Segway yields a model which elucidates the relationship between assay observations and functional elements in the genome.
Second, we will discuss a new method to discover transcription factor motifs and identify transcription factor binding sites in DNA with covalent modifications such as methylation. Just as transcription factors distinguish one standard nucleobase from another, they also distinguish unmodified and modified bases. To represent the modified bases in a sequence, we replace cytosine (C) with symbols for 5-methylcytosine (5mC), 5-hydroxymethylcytosine (5hmC), and 5-formylcytosine (5fC). Similarly, we adapted the well-established position weight matrix model of transcription factor binding affinity to an expanded alphabet. We created an expanded-alphabet genome sequence using genome-wide maps of 5mC, 5hmC, and 5fC in mouse embryonic stem cells. Using this sequence and expanded-alphabet position weight matrices, we reproduced various known methylation binding preferences, including the preference of ZFP57 and C/EBPβ for methylated motifs and the preference of c-Myc for unmethylated motifs. Using these known binding preferences to tune model parameters enables discovery of novel modified motifs.
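For readers unfamiliar with position weight matrices, the expanded-alphabet idea in the abstract can be illustrated with a minimal Python sketch. It is not the speaker's implementation. The single-letter symbols (m, h, f), the uniform background, the pseudocount, the function names, and the toy counts are all assumptions made up for illustration.

```python
import math

# Hypothetical expanded alphabet: the four standard bases plus
# m = 5mC, h = 5hmC, f = 5fC (this symbol choice is an assumption).
ALPHABET = "ACGTmhf"


def log_odds_pwm(counts, pseudocount=1.0):
    """Turn per-position symbol counts into log2-odds scores.

    Assumes a uniform background over the expanded alphabet, which is a
    simplification chosen only to keep the sketch short.
    """
    background = 1.0 / len(ALPHABET)
    pwm = []
    for column in counts:
        total = sum(column.get(s, 0) for s in ALPHABET)
        total += pseudocount * len(ALPHABET)
        pwm.append({
            s: math.log2(((column.get(s, 0) + pseudocount) / total) / background)
            for s in ALPHABET
        })
    return pwm


def score(pwm, site):
    """Sum the per-position log-odds scores for a site as wide as the PWM."""
    return sum(column[symbol] for column, symbol in zip(pwm, site))


def best_hit(pwm, sequence):
    """Return (score, start) of the best-scoring window.

    The sequence must be at least as long as the PWM.
    """
    width = len(pwm)
    return max(
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Toy counts for an invented 4-position motif whose third position
# prefers methylated cytosine (m) over unmodified cytosine (C).
toy_counts = [{"T": 8}, {"G": 8}, {"m": 6, "C": 2}, {"G": 8}]
toy_pwm = log_odds_pwm(toy_counts)

methylated = score(toy_pwm, "TGmG")
unmethylated = score(toy_pwm, "TGCG")
best_score, best_start = best_hit(toy_pwm, "AATGmGAA")
```

In this toy example the methylated site scores higher than the unmethylated one only because the invented counts favor m at the third position. A real analysis would derive both the counts and the background frequencies from data.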
Bio: Michael Hoffman is a principal investigator at the Princess Margaret Cancer Centre and Assistant Professor in the Departments of Medical Biophysics and Computer Science, University of Toronto. He researches the application of machine learning techniques to epigenomic data. He previously led the National Institutes of Health ENCODE Project's large-scale integration task group while at the University of Washington. He has a PhD from the University of Cambridge, where he conducted computational genomics studies at the European Bioinformatics Institute. He also has a B.S. in Biochemistry and a B.A. in the Plan II Honors Program at The University of Texas at Austin. He was named a Genome Technology Young Investigator and has received several awards for his academic work, including an NIH K99/R00 Pathway to Independence Award.