MODELLER could not produce a full structure for either protein because no homologous structures are available for these two genes. I also tried predicting the structure of each protein with RaptorX and I-TASSER. Both returned full-length ab initio models, but the models appear inaccurate because they contain many random coils. In addition, the C-scores of many of the I-TASSER models were between -2 and -3. C-scores range from -5 to 2, with higher values indicating greater confidence, so these scores further indicate that the I-TASSER predictions are unreliable.
HHpred predicted that CLPTM1 has 5 transmembrane helices, whereas CLPTM1L has 6. The closest homologous structure for the regions that could be detected was 5xpd, a glucose transport protein from Arabidopsis thaliana encoded by the SWEET13 gene (SWEET: Sugars Will Eventually be Exported Transporter). [11] Humans have a single SWEET homolog, RAG1AP1. [12] RAG1AP1 (SWEET1) has been identified as a candidate gene for glucose efflux in some human cells. Failure of this gene may cause hypoglycemia: glucose is reabsorbed in the kidneys, and if it cannot be transported out into the capillaries, it is excreted in the urine, lowering blood glucose. The gene may also be related to Fanconi syndrome, which leads to faulty ion reabsorption in the kidneys.
The SWEET and CLPTM1 gene products have been identified as PQ-loop proteins. [13] PQ-loop proteins are predicted to have 7 transmembrane helices and to function in protein trafficking and vesicle transport.
Predicted structure of PQ-Loop proteins [Figure 2][13]
Phylogenetic tree for CLPTM1, CLPTM1L and SWEET1 (From Phylogeny.fr) [14]
Some PQ-loop proteins and their functions, showing that PQ-loop proteins are implicated in many important areas of the body [13]. They have not yet been identified as interacting genes, but many are implicated in deleterious birth defects such as cleft lip and palate.
Table 1 [13]
In the ExAC browser, 151 variants in CLPTM1 and 148 variants in CLPTM1L were determined to be possible cleft lip/palate (CL/P) causing variants. A variant was considered plausible if its filtering allele frequency was less than the maximum credible allele frequency. The filtering allele frequency for each variant was taken from the population (ethnicity) with the highest number of variants, and was calculated with ExAC's built-in calculator. The maximum credible allele frequency was calculated by multiplying the disease prevalence by the genetic heterogeneity and dividing by the penetrance. The genetic heterogeneity value came from a paper [8] showing that a mutation in the promoter region of HYAL2 caused 4% of CL/P in a population. Further calculations are described in the methods section.
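The plausibility filter described above can be sketched in Python. This is only an illustration of the formula as stated in the text: the function names and the prevalence and penetrance values are hypothetical placeholders (the values actually used are in the methods section), and only the 4% heterogeneity figure comes from the text.

```python
def max_credible_af(prevalence, heterogeneity, penetrance):
    """Maximum credible allele frequency as described in the text:
    (prevalence * heterogeneity) / penetrance."""
    if penetrance <= 0:
        raise ValueError("penetrance must be positive")
    return prevalence * heterogeneity / penetrance


def is_plausible(filtering_af, max_af):
    """A variant stays a candidate if its filtering AF is below the threshold."""
    return filtering_af < max_af


# Hypothetical inputs: prevalence and penetrance are placeholders,
# not the values used in this study.
PREVALENCE = 0.001      # assumed: 1 in 1,000
HETEROGENEITY = 0.04    # from the text: one variant explained 4% of CL/P
PENETRANCE = 0.5        # assumed

threshold = max_credible_af(PREVALENCE, HETEROGENEITY, PENETRANCE)
print(threshold)
print(is_plausible(1e-5, threshold))  # rarer than the threshold
print(is_plausible(1e-3, threshold))  # more common than the threshold
```

A lower penetrance raises the threshold, so more variants pass the filter; a lower heterogeneity has the opposite effect.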
For all non-synonymous variants in ExAC, the amino acid position of each mutation was plotted on a histogram. For CLPTM1L, the mutations were spread out without an obvious pattern, suggesting that no single region has a higher propensity for mutation than others. CLPTM1 gave a different result: many of the frequent mutations occurred in the first 200 amino acids of the protein. This may imply that the N-terminal region does not have important functions.
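The binning behind such a histogram can be sketched as follows. This is a minimal illustration, not the script used for the figures: the function name, the 50-residue bin width, and the example positions are all hypothetical.

```python
from collections import Counter


def bin_positions(positions, bin_width=50):
    """Count mutations per fixed-width window of 1-based residue positions.

    Returns a dict mapping the first residue of each window to its count.
    """
    counts = Counter()
    for pos in positions:
        if pos < 1:
            raise ValueError("residue positions are 1-based")
        bin_start = ((pos - 1) // bin_width) * bin_width + 1
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


# Hypothetical positions, for illustration only (not ExAC data).
example_positions = [1, 49, 50, 120, 199, 450]
for start, count in bin_positions(example_positions).items():
    print(f"{start:>4}-{start + 49:<4} {'#' * count}")
```

A cluster of tall bars in the first few windows would correspond to the enrichment of mutations near the start of the protein described for CLPTM1.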
CLPTM1 and CLPTM1L differ in the length of their conserved regions. BLAST and multiple sequence alignment showed that CLPTM1L has shorter conserved regions than CLPTM1. Conserved regions generally imply functional importance. Since CLPTM1L has fewer and shorter conserved regions, it may be less prone than CLPTM1 to loss of function due to mutations.
Perhaps most striking is the number of mutations in conserved regions in cancer patients. High and moderate impact mutations in each gene were mapped and are shown in the results section. The high and moderate impact criteria are based on the Variant Effect Predictor (VEP). These mutations lead to non-synonymous amino acid changes that heavily affect the gene in cancer patients. For CLPTM1L, 40% of high/moderate impact mutations were in conserved regions, whereas for CLPTM1, 77% were. This is consistent with my hypothesis that mutations affecting CLPTM1 would result in more aberrant function because it has longer conserved regions.
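The percentages above amount to counting how many mutation positions fall inside conserved intervals. A minimal sketch of that count is shown here, assuming conserved regions are given as inclusive (start, end) residue ranges; the function name and all coordinates are hypothetical and are not the regions found in this study.

```python
def fraction_in_regions(positions, regions):
    """Fraction of mutation positions that fall inside any conserved region.

    `regions` is a list of inclusive (start, end) residue ranges.
    """
    if not positions:
        return 0.0
    hits = sum(
        1 for pos in positions
        if any(start <= pos <= end for start, end in regions)
    )
    return hits / len(positions)


# Hypothetical coordinates, for illustration only.
conserved = [(5, 25), (45, 60)]
mutations = [10, 20, 30, 40, 50]
print(f"{fraction_in_regions(mutations, conserved):.0%} in conserved regions")
```

Because a gene with longer conserved regions will capture more mutations by chance alone, comparing this fraction to the share of the protein that is conserved would help show whether the difference between the two genes is meaningful.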
Types of cancer affected by CLPTM1 (From GDC cancer portal)
Types of cancer affected by CLPTM1L (From GDC cancer portal)
According to data from the GDC cancer portal, CLPTM1 and CLPTM1L are mutated mostly in uterine and skin cancers. This may suggest that the CLPTM1 gene family contributes to the propagation of these cancers, and that the two genes may be linked in them. However, no literature supports an interaction between the two genes; it shows only that they share the same protein domain.
The GeneMANIA network shows that the two genes only share protein domains. They do not interact with each other.
From the STRING network, only two gene nodes were connected to CLPTM1: ICT1 and PTBP3.
ICT1 (immature colon carcinoma transcript-1) encodes a peptidyl-tRNA hydrolase that functions as a release factor for the mitochondrial ribosome. [15] The gene has also been implicated in colon cancer. ICT1 has been shown to have a functional link to CLPTM1, but little is known about the interaction.
CLPTM1 was also linked to PTBP3 (polypyrimidine tract binding protein 3), a gene that encodes a protein that mediates pre-mRNA splicing. However, no data supported an interaction between the two genes. The link in STRING came from text mining, which proved inconclusive. The lack of node links for CLPTM1 shows how little is known about this gene since its discovery 20 years ago, when a chromosomal translocation was found to lead to clefts in a family. A recently published paper determined that CLPTM1 also interacts with GABA signaling, which has been implicated in cleft palate in animals. [16]
For CLPTM1L, only one protein node in the STRING network is worth noting: TERT. The other nodes are not functionally or spatially related to the gene; they were found only by text mining or have little relation to it. According to STRING, TERT is co-expressed with CLPTM1L, which suggests that the two genes may be functionally linked.
TERT (telomerase reverse transcriptase) is very active in cancer cells. TERT keeps the telomeres on chromosomes from shortening, extending their lifetime. Overexpression of TERT in cancer cells has been linked to cancer proliferation. [17] The TERT-CLPTM1L locus on chromosome 5 has been linked to cancer susceptibility, which may be due to overactive telomerase and overexpressed CLPTM1L. CLPTM1L is also known as cisplatin resistance-related protein 9 (CRR9p). Overexpression of CLPTM1L may protect cancer cells from cisplatin-mediated apoptosis, a mechanism on which many cancer treatments rely. CLPTM1L has also been shown to promote the growth of pancreatic cancer cells in vitro and in vivo. [18]
Overall, little is still known about these two genes. It can most likely be concluded that each has a minimal effect on cleft lip/palate.