Yuzhe Ni:

Reimplementation of DeepCpG

Contact Info:

email: yni012@ucr.edu

phone number: 732-668-1748

Research Interests: Algorithms, Deep Learning, Data Mining, Data Visualization

Week 1-2: I have read the paper "DeepCpG: accurate prediction of single-cell DNA methylation states using deep learning" and now have a solid understanding of DNA methylation states. I will try to formulate the problem precisely in weeks 3-4, and I will also read other related papers cited in its references.

Week 3-4: I have read the paper "Single-cell methylome landscapes of mouse embryonic stem cells and early embryos analyzed using reduced representation bisulfite sequencing" by Guo, Zhu, et al. I now have a good understanding of methylation and the other terms related to my project.

Before we define what methylation is, we need to talk about epigenomes. Identical twins share the same genes, yet some of their traits can differ. Why is this the case? Genes are not the only factor that determines traits: the epigenome interacts with the molecules in each twin's body and influences how their genes are used. Different interactions lead to different habits or traits.

Methylation relates to whether a gene is turned on or off: adding or removing a methyl group can switch a gene off or on. The most significant sequence pattern related to methylation is the CpG site (5'-C-phosphate-G-3'), a cytosine immediately followed by a guanine on the same strand. By inspecting these sites, we can understand how and when a gene is turned off.
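To make the idea of a CpG site concrete, here is a minimal sketch in Python. The function names (find_cpg_sites, methylation_level) and the example sequence are my own illustrations and do not come from DeepCpG or from the papers above. I assume a methylation call is stored as 1 (methylated), 0 (unmethylated), or None (site not covered).

```python
def find_cpg_sites(sequence):
    """Return the 0-based position of every C that is immediately followed by a G."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i] == "C" and seq[i + 1] == "G"]


def methylation_level(states):
    """Fraction of covered CpG sites that are methylated.

    Assumption: 1 = methylated, 0 = unmethylated, None = not covered.
    Returns None when no site is covered.
    """
    observed = [s for s in states if s is not None]
    if not observed:
        return None
    return sum(observed) / len(observed)


# Hypothetical toy sequence, read 5' to 3'.
example = "ATCGGCGTACGCCA"
sites = find_cpg_sites(example)

# Hypothetical calls for the sites found above, in the same order.
calls = [1, 0, None]
level = methylation_level(calls)
```

For this toy sequence, sites is [2, 5, 9] and level is 0.5, because only two of the three sites are covered and one of them is methylated.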

The paper by Guo et al. introduces a method called scRRBS (single-cell Reduced Representation Bisulfite Sequencing). Its novelty is that it is the first method able to perform DNA methylation analysis at single-cell resolution. The paper also introduces computational ways of verifying the analysis, using a confusion matrix and hierarchical clustering to examine the level of DNA methylation within a single cell.
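As a rough illustration of how hierarchical clustering can group cells by their methylation profiles, here is a small sketch in plain Python. This is not the pipeline from the paper: the distance measure (fraction of disagreeing calls over the CpG sites covered in both cells), the average-linkage rule, and the cell names and values are all my own assumptions for the example.

```python
def distance(a, b):
    """Fraction of CpG sites, among those covered in both cells, whose calls disagree."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return 1.0  # assumption: cells with no shared sites count as maximally distant
    return sum(x != y for x, y in shared) / len(shared)


def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the cells of two clusters."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(distance(profiles[i], profiles[j]) for i, j in pairs) / len(pairs)


def hierarchical_clustering(profiles):
    """Agglomerative clustering; returns the merges as (cluster_a, cluster_b, distance)."""
    clusters = [(name,) for name in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of clusters (ties are broken by cluster names).
        d, a, b = min(
            (average_linkage(a, b, profiles), a, b)
            for idx, a in enumerate(clusters)
            for b in clusters[idx + 1:]
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))
    return merges


# Hypothetical cells: 1 = methylated, 0 = unmethylated, None = site not covered.
profiles = {
    "cell_A": [1, 1, 0, None, 1],
    "cell_B": [1, 1, 0, 0, 1],
    "cell_C": [0, 0, 1, 1, None],
    "cell_D": [0, 0, 1, 1, 0],
}
merges = hierarchical_clustering(profiles)
```

On this toy data, cell_A merges with cell_B and cell_C with cell_D at distance 0.0, and the two resulting groups merge last at distance 1.0. Ignoring uncovered sites in the distance matters for single-cell data, where each cell covers only part of the CpG sites.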


Week 5-6: I am implementing different baselines for testing and comparing against DeepCpG. Specifically, I will use a hidden Markov model (HMM) and a support vector machine (SVM) to model the sequencing data.
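To show what an HMM baseline could look like, here is a minimal sketch of Viterbi decoding over a sequence of CpG calls. It covers only the HMM half of the plan, not the SVM. The two hidden states ("U" for an unmethylated region, "M" for a methylated region) and all the probabilities are values I made up for illustration; they are not fitted to any data and are not taken from DeepCpG.

```python
import math

# Assumed two-state model; every number below is illustrative only.
STATES = ("U", "M")
START = {"U": 0.5, "M": 0.5}
TRANS = {"U": {"U": 0.9, "M": 0.1}, "M": {"U": 0.1, "M": 0.9}}
EMIT = {"U": {0: 0.9, 1: 0.1}, "M": {0: 0.1, 1: 0.9}}


def viterbi(observations):
    """Return the most likely hidden-state path for a sequence of 0/1 CpG calls."""
    if not observations:
        return []
    # score[s] = best log-probability of any path that ends in state s
    score = {s: math.log(START[s]) + math.log(EMIT[s][observations[0]]) for s in STATES}
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for obs in observations[1:]:
        prev, score, pointers = score, {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            pointers[s] = best_prev
            score[s] = (prev[best_prev] + math.log(TRANS[best_prev][s])
                        + math.log(EMIT[s][obs]))
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical noisy calls along neighboring CpG sites.
calls = [1, 1, 0, 1, 1, 0, 0, 0]
path = viterbi(calls)
```

With these assumed parameters, path is M, M, M, M, M, U, U, U: the single 0 in the middle is treated as noise inside a methylated region, while the run of three 0s at the end is decoded as a switch to an unmethylated region. The decoded state at a site could then serve as the baseline prediction for a missing call there.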