Results
Gene Signatures
qG4-ChIP-seq data collected from the PDTX model were downloaded and processed with the computational tools described in the Methods section. Of the more than 20 breast cancer patients in the data set, three were analyzed; for each, we used the largest qG4-ChIP-seq file (over 300 Mb) among the many available fragments. We searched for a motif that is well known to form a G-quadruplex (G4) structure, namely the c-MYC motif [12].
We used UCSC Human Genome BLAT with hg19 as the reference genome to map our sequences to locations on the human chromosomes. If the data truly derive from the human genome rather than from fragments of the mouse genome, we would expect good matches to well-known G4 sequences that have been shown to be prominent in humans. The results for each patient are shown in Figures 1-6. For each patient, the chromosome and strand of the sequence matching each motif are indicated by red boxes.
After identifying the chromosome location and the wider genomic context of each motif-matching sequence per patient, we also performed a BLAST alignment to determine whether the sequence lies within a known cancer-related human gene. Twelve relevant genes were found, along with a few irrelevant genes or random sequences that happened to match our sequences. Such irrelevant matches are unsurprising given the size of the human genome. All sequences that our group submitted to BLAST are listed in the Genomic Analysis Summary table in the Supplementary Materials tab.
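As an illustration of what a motif match on a given strand means, the following minimal sketch scans a sequence and its reverse complement for a G4-like pattern. It is not the pipeline used in this study: the regular expression is the generic canonical G4 pattern (four runs of at least three guanines separated by loops of 1-7 nucleotides), assumed here for illustration in place of the exact c-MYC motif, and the function names and toy sequences are hypothetical.

```python
import re

# Generic canonical G4 pattern, assumed for illustration only:
# four G-runs (>= 3 G) separated by loops of 1-7 nucleotides.
G4_PATTERN = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper-cased)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_g4_motifs(seq):
    """Find non-overlapping G4-like matches on both strands.

    Returns (strand, start, end, matched_sequence) tuples. Coordinates are
    0-based, end-exclusive, and always given on the forward strand; the
    matched sequence is reported as read 5'->3' on its own strand.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    # Forward strand: coordinates come straight from the regex match.
    for m in G4_PATTERN.finditer(seq):
        hits.append(("+", m.start(), m.end(), m.group()))
    # Reverse strand: scan the reverse complement, then map back.
    for m in G4_PATTERN.finditer(reverse_complement(seq)):
        hits.append(("-", n - m.end(), n - m.start(), m.group()))
    return sorted(hits, key=lambda h: h[1])


if __name__ == "__main__":
    toy = "ATATGGGAGGGTGGGAGGGTTTT"  # hypothetical toy sequence
    for hit in find_g4_motifs(toy):
        print(hit)
```

A hit on the "-" strand appears as a C-rich run in the forward sequence, which is why both orientations are scanned.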
Patient 1
In Chromosome 2: Irrelevant. Did not align to any particular gene sequence related to cancer.
In Chromosome 9: KCNT1 gene with perfect alignment. KCNT1 is a well-known gene involved in epilepsy [14].
In Chromosome 4: GRB2 (Growth Factor Receptor-Bound Protein 2) gene, which encodes an adaptor protein that plays a role in signaling pathways for cellular proliferation. GRB2 is also involved in various cancers [15].
In Chromosome 6: LOC129389432 MPRA-validated peak5632 silencer, a specific transcriptional repressor that has been experimentally validated to reduce expression of the KCNT1 gene [16].
In Chromosome 8: Homo sapiens G00-120-208 (MYC) gene, exons 1 and 2. MYC is a major driver of cancer development when mutated or overexpressed. It is highly involved in the cell cycle [17].
In Chromosome 19: Homo sapiens H3K27ac hESC enhancer. H3K27ac is a histone mark of active enhancers, and studies suggest that cancer cells commonly exploit such enhancers so that cells divide in an uncontrolled fashion [18].
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Patient 2
In Chromosome 7: sidekick cell adhesion molecule 1 (SDK1) gene. SDK1 encodes a protein of the immunoglobulin superfamily, and increased expression has been found in pancreatic cancer patients [19].
In Chromosome 5: intron of GRM6 gene. GRM6 is involved in transmitting visual signals from the retina to the brain, and high mutation rates have been found in cutaneous melanoma [20].
In Chromosome 19: intron of KDM4B gene. Increased expression of KDM4B is associated with increased DNA damage and is seen in various cancer types [21].
In Chromosome 7: VIPR2 gene. This gene encodes a G-protein-coupled receptor for vasoactive intestinal peptide and plays a role in regulating the circadian rhythm, the immune system, and some other bodily functions. Mutations in VIPR2 have been associated with increased schizophrenia risk [22].
In Chromosome X: BMX and ACE2 genes. BMX encodes a non-receptor tyrosine kinase involved in several signal transduction pathways, and its expression has been found in various cancers such as glioblastoma, prostate cancer, and breast cancer [23]. ACE2 is part of the renin-angiotensin system and has been implicated in multiple diseases, including cancer and respiratory and neurodegenerative diseases [24].
In Chromosome 14: intron of JAG2 gene. JAG2 encodes a protein involved in the Notch signaling pathway and has been found to increase the chemoresistance of colorectal cancer cells [25].
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Patient 3
In Chromosome 6: OCT4/NANOG enhancer region. This region regulates stem cell-related genes and has been associated with cancer-related reactivation [26].
In Chromosome 7: SDK1 gene. SDK1 encodes a cell adhesion molecule and may be involved in disease-related processes [19].
In Chromosome 8: MYC proto-oncogene. MYC is a key regulator of cell proliferation and is frequently associated with various cancers [17].
In Chromosome 1: LOC129930557 region. This region is annotated as a regulatory/silent region identified by ATAC-STARR-seq and may be involved in gene regulation [27].
In Chromosome 20: ABHD12 gene. ABHD12 encodes a lysophospholipase involved in lipid metabolism and has been associated with neurological disorders [28].
In Chromosome 21: Down syndrome critical region. This genomic region contains genes associated with trisomy 21 and related developmental disorders [29].
g:Profiler
In addition to the analyses discussed above, we performed a g:Profiler analysis using the 12 relevant genes. g:Profiler performs an enrichment analysis on a gene list to identify biological processes, pathways, or cellular components that are over-represented among the genes, if any [13]. Recall that our sequences mapped to oncogene-related genes, enhancers or repressors, or genes involved in other illnesses such as epilepsy. If the gene list is cancer-related, the enrichment analysis should highlight cellular pathways involved in cancer. Figure 7 shows the Manhattan plot, in which clusters of dots along the x-axis represent different biological databases. For example, light blue bubbles are terms from WikiPathways, labeled WP in the figure. “Significant hits” are plotted as colored “bubbles”. The y-axis shows the negative log10 of the adjusted p-value, so the higher a bubble sits, the more statistically significant the result.
Six significant KEGG pathways (pink), two significant WikiPathways pathways (WP, light blue), and four significant protein complexes (CORUM, light green) were identified (Figure 8). The gene list therefore appears to be enriched for specific cancer-related pathways and complexes rather than reflecting random signatures.
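To make the enrichment statistic and the y-axis of the Manhattan plot concrete, the sketch below computes a one-sided hypergeometric over-representation p-value for a single term and converts an adjusted p-value to its negative log10. It is a simplified illustration, not g:Profiler's implementation: Bonferroni correction is used here as a simple stand-in for the tool's own multiple-testing correction, and all function names and the numbers in the usage example are hypothetical.

```python
from math import comb, log10


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided over-representation p-value, P(X >= k).

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the query list, k: query genes annotated to the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def neg_log10_adjusted(p, n_tests):
    """Bonferroni-adjust a p-value (stand-in correction) and return -log10.

    Larger values correspond to bubbles plotted higher on the y-axis.
    """
    adjusted = min(1.0, p * n_tests)
    if adjusted == 0.0:
        return float("inf")
    return -log10(adjusted)


if __name__ == "__main__":
    # Hypothetical numbers: 12 query genes, 4 of them in a 150-gene pathway,
    # against a 20,000-gene background, with 300 terms tested.
    p = hypergeom_enrichment_p(k=4, n=12, K=150, N=20000)
    print(p, neg_log10_adjusted(p, n_tests=300))
```

With a short query list such as 12 genes, even a few shared annotations can produce a small p-value, so the adjusted value is the one to interpret.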
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------