A schematic diagram showing the workflow for RNA sequencing. Adapted from Liang and Pardee, 2003 (created with BioRender.com).
Differential Gene Expression
Differential gene expression (DGE) analysis compares gene expression levels between two or more conditions or experimental groups. RNA sequencing (RNA-seq) has transformed this process by enabling comprehensive transcriptome profiling. In RNA-seq, the RNA molecules in a sample are sequenced, yielding quantitative data on gene expression. Statistical methods are then applied to the RNA-seq data to identify genes that are significantly upregulated or downregulated between experimental conditions, which helps to elucidate the molecular mechanisms underlying biological processes or diseases.
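The core comparison behind differential expression can be illustrated with a log2 fold-change calculation. The following is a minimal sketch only: the gene names, expression values, pseudocount, and threshold are hypothetical, and it deliberately omits the significance testing and multiple-testing correction that a real analysis requires (dedicated tools such as DESeq2, edgeR, or limma handle those steps).

```python
import math
from statistics import mean


def log2_fold_change(tumor, normal, pseudocount=1.0):
    """Log2 ratio of mean tumor to mean normal expression.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes with no detected expression.
    """
    return math.log2((mean(tumor) + pseudocount) / (mean(normal) + pseudocount))


def classify(lfc, threshold=1.0):
    """Label a gene by fold change alone; threshold=1.0 means a 2-fold change."""
    if lfc >= threshold:
        return "up"
    if lfc <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical normalized expression values for three made-up genes.
expression = {
    "GENE_A": {"tumor": [31.0, 29.0, 33.0], "normal": [7.0, 6.0, 8.0]},
    "GENE_B": {"tumor": [2.0, 3.0, 1.0], "normal": [14.0, 16.0, 15.0]},
    "GENE_C": {"tumor": [10.0, 11.0, 9.0], "normal": [10.0, 9.0, 11.0]},
}

calls = {
    gene: classify(log2_fold_change(values["tumor"], values["normal"]))
    for gene, values in expression.items()
}

for gene, call in calls.items():
    print(gene, call)
```

Fold change alone does not establish statistical significance; it only indicates the direction and magnitude of the difference between groups.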
Triple Negative Breast Cancer (TNBC)
Differences between a triple-negative breast cancer cell and an ordinary breast cancer cell. Adapted from Maryam et al., 2023.
Different processes involved in tumor progression. Genes involved in these processes are the most likely to be upregulated or downregulated in tumors. Adapted from Dennis Mazingi and Kokila Lakhoo, 2023.
Triple-negative breast cancer (TNBC) is a particularly aggressive subtype of breast cancer characterized by the absence of three receptors commonly found in breast cancer cells: estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2). As a result, TNBC does not respond to hormone-based therapies or to drugs targeting HER2, which are effective treatments for other types of breast cancer. TNBC accounts for 15-20% of all breast cancer cases worldwide and has few treatment options, which makes it an important subject of study. Differential gene expression analysis can help reveal the molecular features of TNBC and so inform the development of targeted therapies.
Project Overview
The objective of this project is to analyze RNA-seq data derived from TNBC patients, focusing on identifying genes that are differentially expressed between tumor and normal samples. Using differential gene expression analysis, we aim to pinpoint specific genes that are upregulated or downregulated in tumor samples compared to their normal counterparts. We then examine the functional implications of these expression changes, particularly in cellular metabolism, and explore their potential role in TNBC pathogenesis. This approach is intended to provide insight into the molecular mechanisms underlying TNBC development and progression.
Data Collection
For this study, RNA-seq data were obtained from the Gene Expression Omnibus (GEO) repository (accession number: GSE183947); the data were published in 2022 by Zhang et al. In that study, the authors collected 30 pairs of cancerous and normal tissue samples from breast cancer patients undergoing mastectomy. After written consent was obtained, tissue samples were verified by pathological examination and used for RNA sequencing. RNA-seq was conducted on an Illumina NovaSeq™ 6000 system, followed by data quality control and genome alignment. Finally, mRNA expression levels were quantified using RNA-Seq by Expectation Maximization (RSEM; Li and Dewey, 2011) and normalized to fragments per kilobase of transcript per million mapped reads (FPKM) for further analysis.
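To make the fragments per kilobase per million (FPKM) normalization concrete, the sketch below applies the textbook formula: fragment count scaled by gene length in kilobases and by sequencing depth in millions of mapped fragments. It is an illustration under simplifying assumptions, not the pipeline used for this dataset. RSEM works with estimated counts and effective transcript lengths, and the gene names, counts, lengths, and library size here are hypothetical.

```python
def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments / (length in kb * mapped fragments in millions)
         = fragments * 1e9 / (length in bp * total mapped fragments)
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


# Hypothetical inputs for one sample (not taken from GSE183947).
counts = {"GENE_A": 500, "GENE_B": 1500}
lengths_bp = {"GENE_A": 2000, "GENE_B": 3000}
total_mapped = 20_000_000

fpkm_values = {
    gene: fpkm(counts[gene], lengths_bp[gene], total_mapped) for gene in counts
}

for gene, value in fpkm_values.items():
    print(f"{gene}: {value:.2f} FPKM")
```

Dividing by gene length lets genes of different lengths be compared within a sample, and dividing by library size adjusts for differences in sequencing depth between samples.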