Introduction

Bioinformatics CH391L Spring 2013 Final Project

HaeWon Chung


Whenever we analyze mRNA expression data, we set a threshold to identify differentially expressed genes. However, there is no consensus threshold shared across the community. When we present qPCR data, we can claim any difference between samples as meaningful as long as it is statistically significant. Biologists who are not familiar with quantitative analysis sometimes ask whether there is a biologically meaningful difference that everyone can agree on. Since I could not answer that question, I will try to find a biologically meaningful and unbiased threshold for identifying differentially expressed genes.

In this study, I will try to cluster genes based on their expression levels. By integrating several different data sets from the same cell line, I may be able to build robust clusters in which most genes are assigned to the same cluster across data sets. Then, by comparing these clusters with real experiments such as drug treatment or knockdown studies, I will try to identify genes that fall into a different cluster in the experiment than in the control, and designate them as differentially expressed genes.
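The idea in the paragraph above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the project: the clustering step is replaced by simple binning on fixed expression cut-offs, and the function names, the `edges` cut-offs, the `min_agreement` cut-off, and the toy expression values are all hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def assign_clusters(expr, edges):
    """Map each gene to an expression-level bin.

    Fixed bin edges are a stand-in (assumption) for a real clustering method.
    """
    return {gene: bisect_right(edges, value) for gene, value in expr.items()}


def consensus_clusters(control_sets, edges, min_agreement=0.75):
    """Keep genes whose cluster agrees in at least min_agreement of the control sets."""
    per_set = [assign_clusters(d, edges) for d in control_sets]
    shared = set.intersection(*(set(d) for d in control_sets))
    robust = {}
    for gene in shared:
        label, count = Counter(a[gene] for a in per_set).most_common(1)[0]
        if count / len(per_set) >= min_agreement:
            robust[gene] = label
    return robust


def differential_genes(robust, treated, edges):
    """Return robust genes whose cluster in the treated sample differs from control."""
    treated_clusters = assign_clusters(treated, edges)
    return sorted(
        gene for gene, label in robust.items()
        if gene in treated_clusters and treated_clusters[gene] != label
    )


# Toy data (made up for illustration): three control data sets and one treatment.
edges = [3.5, 7.5]  # hypothetical cut-offs separating low / medium / high expression
controls = [
    {"A": 1.0, "B": 2.0, "C": 5.0, "D": 6.0, "E": 9.0, "F": 10.0, "G": 3.0},
    {"A": 1.2, "B": 1.9, "C": 5.5, "D": 6.1, "E": 9.3, "F": 11.0, "G": 4.0},
    {"A": 0.8, "B": 2.2, "C": 4.9, "D": 6.4, "E": 8.8, "F": 10.5, "G": 3.2},
]
treated = {"A": 1.1, "B": 9.5, "C": 5.2, "D": 6.0, "E": 9.0, "F": 10.2, "G": 4.1}

robust = consensus_clusters(controls, edges)
changed = differential_genes(robust, treated, edges)
print(changed)  # ['B']
```

In this toy example, gene G sits near a bin boundary and changes bins between control data sets, so it is excluded from the robust set and cannot be called differentially expressed. Gene B is stable across controls and moves from the low bin to the high bin after treatment, so it is the only gene reported.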

                                                          
                                            RT-qPCR                                     Microarray