limma is a Bioconductor package for analyzing microarray and RNA-seq data. In this project, limma is used to compute differential gene expression in microarray data (downloaded from GEO) and to compare the number of differentially expressed genes across the time course.
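The counting step of the time-course comparison can be sketched as follows. limma itself is an R package, so this Python snippet only illustrates how per-time-point counts might be derived once a results table exists. The function name `count_de_genes`, the row layout, the cutoffs, and the toy values are all hypothetical and are not taken from the project.

```python
def count_de_genes(rows, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Count differentially expressed genes per time point.

    rows: iterable of (time_point, gene, log_fold_change, adjusted_p_value).
    The cutoffs are illustrative assumptions, not the project's settings.
    """
    counts = {}
    for time_point, _gene, log_fc, adj_p in rows:
        # Register every time point so that zero counts are reported too.
        counts.setdefault(time_point, 0)
        if abs(log_fc) >= lfc_cutoff and adj_p < padj_cutoff:
            counts[time_point] += 1
    return counts


# Hypothetical toy results table, for illustration only.
toy_rows = [
    ("6h", "G1", 1.8, 0.01),
    ("6h", "G2", 0.3, 0.20),
    ("24h", "G1", 2.4, 0.001),
    ("24h", "G2", -1.5, 0.03),
    ("24h", "G3", 1.2, 0.30),
]
de_counts = count_de_genes(toy_rows)
```

Both up- and down-regulated genes are counted here because the absolute log fold change is compared with the cutoff.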
On the official GSEA website (https://www.gsea-msigdb.org/gsea/index.jsp), GSEA is defined as "a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states (e.g. phenotypes)." Diseases and phenotypes often involve changes in multiple genes. By considering the enrichment of all genes in a specific pathway together, GSEA avoids the problem of changes in individual genes being too small to detect. In this project, GSEA is the main tool for identifying highly perturbed pathways between control and treated groups of cells.
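The idea of scoring a whole gene set rather than single genes can be illustrated with a minimal sketch of a weighted running-sum enrichment score. This is a simplified illustration, not the GSEA software's implementation: it omits permutation testing and normalization, and the function name `enrichment_score`, the weighting exponent `p`, and the toy genes are assumptions made for the example.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Return the running-sum value with the largest absolute deviation from 0.

    ranked_genes: gene names sorted from most up- to most down-regulated.
    scores: ranking metric for each gene, in the same order.
    gene_set: set of gene names forming the pathway of interest.
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(in_set)
    hit_weights = [abs(s) ** p if hit else 0.0 for s, hit in zip(scores, in_set)]
    total_hit = sum(hit_weights)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    running, best = 0.0, 0.0
    for weight, hit in zip(hit_weights, in_set):
        # Step up at set members (weighted by score), step down otherwise.
        running += weight / total_hit if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical toy ranking: six genes with made-up ranking scores.
genes = ["A", "B", "C", "D", "E", "F"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es_top = enrichment_score(genes, metric, {"A", "B"})
es_bottom = enrichment_score(genes, metric, {"E", "F"})
```

A set concentrated at the top of the ranking yields a positive score and a set concentrated at the bottom yields a negative one, even when no single member would pass a per-gene significance cutoff.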
pathview is a Bioconductor package and one of the “major tool sets for intuitive pathway-based data integration and visualization”. It follows human-readable pathway definitions and layouts, such as those of KEGG, and can be combined with gene set (enrichment) analysis tools to map and render user data onto relevant pathway graphs.
Cytoscape is a software platform for analyzing and visualizing large, complex datasets. Its functions include building molecular interaction networks and integrating gene expression data. In this project, Cytoscape is used to visualize the relationships among the biological pathways identified by GSEA and to reveal the higher-level biological processes involved.
Gene Expression Omnibus (GEO) is a public repository that stores and distributes microarray, next-generation sequencing, and other high-throughput genomics data. In this project, we found two independent studies that contain microarray experiments on cells treated with platinum-based drugs. We downloaded the raw data from GEO and processed it for our downstream analyses.
KEGG (Kyoto Encyclopedia of Genes and Genomes) is an integrated database that has been developed by Kanehisa Laboratories at Kyoto University since 1995 under the Japanese Human Genome Program. KEGG is considered a "computer representation" of the biological system. It interprets and integrates large-scale molecular data with established knowledge of genomes, biological pathways, drugs, diseases, and chemical substances. Each database entry is called a KEGG object. This project mainly uses the KEGG PATHWAY database.
The KEGG PATHWAY database (https://www.genome.jp/kegg/kegg3a.html) is a collection of manually drawn KEGG pathway maps representing experimental knowledge on metabolism, gene regulation, signal transduction, and other processes. KEGG pathway mapping analysis uses the networks of molecular interactions and reactions provided by KEGG PATHWAY to compare the gene content of the genome and thereby identify pathways of interest.
A Gene Ontology Biological Process (GOBP) is a series of molecular events. The database stores both high-level biological processes and their corresponding lower-level biological pathways. In this project, GOBP is used to analyze enriched biological pathways in GSEA and to identify higher-level biological processes in Cytoscape.