Figure 1: The figure on the right depicts a Density Plot illustrating the distribution of gene expression values across normal and tumor conditions. The plot provides a visual representation of the density of gene expression levels in relation to the log2 fold change. Positive log2 fold change values denote upregulated genes in tumor samples, while negative values indicate downregulated genes. The plot demonstrates the presence of both upregulated and downregulated genes in tumor samples (teal) compared to normal samples (yellow).
Table 1: The above table displays the total number of genes that are differentially expressed between normal and tumor conditions, and the number of genes that were upregulated and downregulated in tumor patients.
Figure 2: The figure here depicts a scatter plot representing the relationship between gene expression levels in normal and tumor conditions. Each point on the scatter plot corresponds to a specific gene, with its position determined by its expression level in the normal condition plotted against its expression level in the tumor condition. Genes with similar expression levels in both conditions cluster around the diagonal line (representing no change), while differentially expressed genes deviate from this line. Upregulated genes appear above the diagonal (KRT1, KRT2, KRT79, PM20D1, DES), while downregulated genes appear below it (EN1, CXCL10, FDCSP, MYBL2).
Figure 3: A volcano plot relates the statistical significance (P value) of the difference in gene expression to the magnitude of the fold change. Volcano plots are helpful for quickly identifying genes that are both significantly differentially expressed and have large fold changes between experimental conditions. In this volcano plot, the most upregulated genes found in our analysis are annotated towards the right (KRT1, KRT2, KRT79, PM20D1, DES), and the most downregulated genes are annotated towards the left (EN1, CXCL10, FDCSP, MYBL2).
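The two axes of a volcano plot can be made concrete with a short sketch. The Python example below computes the plotted coordinates (log2 fold change and -log10 of the P value) and labels each gene as up, down, or not significant. The cutoffs (`fc_cutoff`, `p_cutoff`) and the gene names and values are illustrative assumptions, not the thresholds or data used in this analysis.

```python
import math


def classify_gene(log2_fc, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Return volcano-plot coordinates and a label for one gene.

    fc_cutoff and p_cutoff are assumed example thresholds.
    """
    x = log2_fc
    y = -math.log10(p_value)  # larger y means stronger statistical evidence
    if p_value < p_cutoff and log2_fc >= fc_cutoff:
        label = "up"
    elif p_value < p_cutoff and log2_fc <= -fc_cutoff:
        label = "down"
    else:
        label = "not significant"
    return x, y, label


# Hypothetical (log2 fold change, P value) pairs, for illustration only
genes = {
    "geneA": (3.2, 1e-8),
    "geneB": (-2.5, 1e-5),
    "geneC": (0.3, 0.6),
    "geneD": (2.1, 0.2),
}

results = {name: classify_gene(fc, p) for name, (fc, p) in genes.items()}

# Tally the labels, analogous to the counts reported in a summary table
counts = {}
for _, _, label in results.values():
    counts[label] = counts.get(label, 0) + 1
```

A gene with a large fold change but a weak P value (geneD here) stays in the "not significant" group, which is why both axes are needed.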
In a typical heat map, each row represents a gene, and each column represents a sample. The color intensity of each square in the grid corresponds to the expression level of the gene in the corresponding sample according to the color gradient given, with brighter colors indicating higher expression and darker colors indicating lower expression. Heat maps are useful for identifying patterns and trends in gene expression data, such as clusters of genes with similar expression patterns or samples with similar gene expression profiles.
Figure 4a: Heat map representing the complete set of differentially expressed genes
Figure 4b: Heat map representing the differentially expressed genes that are specifically upregulated in the tumor cells (diseased state) as compared to the normal cells (healthy state)
Figure 4c: Heat map representing the differentially expressed genes that are specifically upregulated in the normal cells (healthy state) as compared to the tumor cells (diseased state)
Figure 4d: Heat map representing the top 4 upregulated and downregulated genes from the analysis.
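Heat maps of this kind are often drawn after scaling each gene's values across samples, so that the color gradient reflects relative rather than absolute expression. Whether that was done for Figure 4 is not stated; the Python sketch below assumes row-wise z-scoring and uses hypothetical gene names and values.

```python
import statistics


def row_zscores(values):
    """Scale one gene's expression values to mean 0 and unit variance.

    Uses the population standard deviation; a gene with no variation
    across samples is mapped to all zeros to avoid division by zero.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


# Hypothetical expression matrix: one row per gene, one column per sample
expression = {
    "geneA": [2.0, 4.0, 6.0, 8.0],
    "geneB": [5.0, 5.0, 5.0, 5.0],
}

scaled = {gene: row_zscores(values) for gene, values in expression.items()}
```

After scaling, each row is centered on zero, so a color at the high end of the gradient means "high for this gene" rather than "high overall".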
The bar plots below represent the expression levels of each particular gene across the normal and tumor conditions, with the height of the bar indicating the magnitude of expression.
Figure 5a: This bar plot depicts the genes found to be significantly upregulated in tumor cells.
Figure 5b: This bar plot depicts the genes found to be significantly downregulated in tumor cells.
Gene ontology (GO) enrichment analysis is a powerful tool for interpreting the biological significance of sets of genes derived from high-throughput experiments like RNA sequencing (RNA-seq). By identifying overrepresented GO terms within gene sets, it provides insights into gene function and biological processes.
Figure 6a: This figure highlights the top 10 Enriched GO Processes for Upregulated Genes (Based on adjusted p-value)
Figure 6b: This figure highlights the top 10 Enriched GO Processes for Downregulated Genes (Based on adjusted p-value)
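Overrepresentation of a GO term is commonly assessed with a one-sided hypergeometric test, followed by a multiple-testing correction to obtain adjusted p-values. The tool and correction method used for Figure 6 are not stated, so the Python sketch below is only an assumed illustration: it pairs the hypergeometric test with the Benjamini-Hochberg procedure, and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_term, n_selected, n_overlap):
    """One-sided hypergeometric P value, P(X >= n_overlap).

    n_background: genes in the background set
    n_term:       background genes annotated with the GO term
    n_selected:   genes in the list being tested (e.g. upregulated genes)
    n_overlap:    selected genes annotated with the GO term
    """
    total = comb(n_background, n_selected)
    upper = min(n_term, n_selected)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: 20 background genes, 5 carry the term,
# 4 genes were selected, and 3 of those carry the term.
p_term = enrichment_p_value(20, 5, 4, 3)

# Hypothetical raw p-values for four GO terms
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

Ranking terms by adjusted p-value, as in the figure titles, is then a matter of sorting the adjusted values.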
PCA plots provide a powerful way to explore and visualize complex gene expression data, aiding in the identification of patterns and differences between biological samples that can guide subsequent analyses, such as differential gene expression analysis. PCA does not discard any samples or characteristics (variables). Instead, it reduces the overwhelming number of dimensions by constructing principal components (PCs). PCs describe variation and account for the varied influences of the original characteristics. Such influences, or loadings, can be traced back from the PCA plot to find out what produces the differences among clusters. Samples that cluster together have more similar gene expression profiles, while samples that cluster apart have more divergent profiles.
Figure 7: The PCA plot visually depicts gene expression profiles of normal (N) and tumor (T) cells. Samples from each group form distinct clusters, indicating substantial differences in gene expression patterns associated with cellular state. This analysis highlights the biological divergence between normal and tumor cells, facilitating further investigation into the molecular mechanisms underlying tumorigenesis.
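The construction of a principal component described above can be sketched in a few lines. The Python example below centers each gene, builds the gene-by-gene covariance matrix, and finds the first PC by power iteration. This is a minimal illustration under stated assumptions, not the procedure used to produce Figure 7: the sample values are hypothetical, and a real analysis would normally rely on an established library.

```python
import math


def center_columns(matrix):
    """Subtract each column (gene) mean from every row (sample)."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[v - m for v, m in zip(row, means)] for row in matrix]


def first_principal_component(samples, iterations=200):
    """Return (loadings, scores, explained fraction) for PC1.

    samples: rows are samples, columns are genes.
    Assumes the data are not constant and that the start vector is not
    orthogonal to the leading eigenvector.
    """
    x = center_columns(samples)
    n, p = len(x), len(x[0])
    # Sample covariance matrix between genes (p x p)
    cov = [
        [sum(x[k][i] * x[k][j] for k in range(n)) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    # Power iteration: repeated multiplication converges to the top eigenvector
    v = [1.0 / (i + 1) for i in range(p)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    total_variance = sum(cov[i][i] for i in range(p))
    scores = [sum(a * b for a, b in zip(row, v)) for row in x]
    return v, scores, eigenvalue / total_variance


# Hypothetical expression values: two normal (N) and two tumor (T) samples
samples = [
    [1.0, 5.0, 3.0],  # N1
    [1.2, 5.1, 2.9],  # N2
    [6.0, 1.0, 3.1],  # T1
    [6.2, 0.9, 3.0],  # T2
]

loadings, scores, explained = first_principal_component(samples)
```

In this toy data the normal and tumor samples receive PC1 scores of opposite sign, and the loadings show which genes drive that separation.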
Genes that we found to be significantly upregulated in our study include KRT79, KRT1, and KRT2, along with PM20D1 and DES. Keratins (KRTs) are the intermediate filament-forming proteins of epithelial cells and are known to play a critical role in several aspects of cancer pathophysiology. Studies by Takan et al., Han et al., and Kim et al. have highlighted that alterations in keratin expression correlate with cancer invasion, metastasis, and epithelial-mesenchymal transition (EMT). All of these factors are known to be primarily responsible for the aggressiveness and metastasis of breast cancer. DES encodes desmin, which is involved in maintaining the structural integrity and function of muscle fibers, and is often differentially expressed in smooth muscle tumors and tumors of mesenchymal origin. PM20D1 (Peptidase M20 Domain-Containing 1) has received attention only in the last few years, and changes in its methylation and expression levels have been reported to be associated with several disease phenotypes. A direct link to TNBC is yet to be established.
Genes that were substantially downregulated in tumor cells compared with healthy cells include MYBL2, EN1, FDCSP, and CXCL10. FDCSP supports immune cell interactions, particularly with T follicular helper (TFH) cells and chemokine pathways critical for immune cell recruitment. In cancer, FDCSP downregulation can disrupt these interactions, leading to impaired immune surveillance and reduced infiltration of cytotoxic T cells into the tumor microenvironment; however, a direct link to its downregulation in TNBC has not been reported elsewhere. MYBL2 and EN1 are known to be central regulators in the progression of cancers such as bladder cancer and prostate cancer. The downregulation of such regulators is striking and may point to more complicated gene regulation in TNBC cells. We also found several HIST1 family genes to be downregulated, which may relate to the roles of genomic stability and histone post-translational modifications in tumor suppression. CXCL10, also known as interferon-inducible protein 10 (IP-10), plays several important roles in immune regulation. CXCL10 is a chemokine that attracts various immune cells, particularly T cells, natural killer (NK) cells, and monocytes/macrophages, to sites of inflammation or tissue injury. Previous studies have established that CXCL10 production is inversely correlated with tumor progression and results in a marked reduction in tumor-associated angiogenesis. Other studies highlight the role of CXCL10 in inhibiting the growth of cervical carcinoma through an increase in the apoptotic rate, which is consistent with its downregulation in tumor cells as observed in our work.
Apart from examining the identity of the DEGs directly, we also found a significant enrichment of GO terms related to cell motility, cell migration, and their regulation among the genes upregulated in tumor samples compared with normal samples. This enrichment suggests an active involvement of these genes in promoting tumor progression and metastasis. By elucidating the specific biological processes affected by dysregulated gene expression, these findings can contribute to a better understanding of the molecular mechanisms driving TNBC development and point to potential new drug targets.