With a structure in hand for the first segment of scospondin, the stl300 mutation site was analyzed. The stl300 allele is a missense mutation that converts cysteine-878 to serine. This cysteine is highly conserved across species (Troutwine et al., 2020). Because other conserved cysteines are present in the region, C-878 was hypothesized to participate in a disulfide bond. Closer inspection of the structure surrounding the stl300 mutation site showed that C-878 lies close to cysteine-900: only 2.012 Å separate the terminal sulfur atoms of C-878 and C-900, which is below the 3 Å threshold distance for disulfide bond formation (Sun et al., 2017). However, this region was folded with a mid-level pLDDT confidence score, warranting further investigation.
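The sulfur-to-sulfur measurement described above can be reproduced outside a molecular viewer. The following is a minimal sketch, not the procedure used in this study: it assumes the model is available as standard fixed-column PDB text, that residue numbering in the file matches the numbering used here, and that the terminal sulfur of cysteine is the atom named SG. The function names are hypothetical.

```python
import math

# Threshold quoted in the text (Sun et al., 2017), in angstroms.
DISULFIDE_CUTOFF_ANGSTROM = 3.0


def sg_coordinates(pdb_lines, residue_number, chain=None):
    """Return (x, y, z) of the SG atom of one cysteine from PDB ATOM records."""
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        # Fixed PDB columns: atom name 13-16, residue name 18-20.
        if line[12:16].strip() != "SG" or line[17:20].strip() != "CYS":
            continue
        if chain is not None and line[21] != chain:
            continue
        if int(line[22:26]) == residue_number:
            return (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    raise ValueError(f"no CYS SG atom found for residue {residue_number}")


def sulfur_distance(pdb_lines, res_a, res_b, chain=None):
    """Euclidean distance in angstroms between the SG atoms of two cysteines."""
    return math.dist(
        sg_coordinates(pdb_lines, res_a, chain),
        sg_coordinates(pdb_lines, res_b, chain),
    )


def within_disulfide_range(distance, cutoff=DISULFIDE_CUTOFF_ANGSTROM):
    """True when the sulfur-sulfur distance is below the chosen cutoff."""
    return distance < cutoff
```

Calling `sulfur_distance(lines, 878, 900)` on the lines of the model file and passing the result to `within_disulfide_range` would apply the same 3 Å criterion used in the text.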
To search for structural homologs that might also contain disulfide bonds, the region surrounding the stl300 mutation site (C-876 to C-912) was extracted from the AlphaFold2 model and submitted to the DALI server, which compares 3D protein structures to find structural homologs. When searched against the complete PDB, DALI returned 184 potential hits, of which the top 25 were selected for further analysis. These hits had an average Z-score of 4.5 (range 4.1 to 5.2) and included 16 chains from fibronectins, 5 from mucins, 2 from coagulation factors, 1 from von Willebrand factor, and 1 from FN1. Using the DALI 3D superimposition tool, the carbon-alpha trace of the query sequence was mapped against these hits and colored according to sequence conservation. In this view, C-878 and C-900 are colored blue, indicating high sequence conservation at structurally homologous residues. This was further validated using the DALI Structural Alignments tool, which aligns protein sequences based on their structural homology. In all 25 of the top hits, cysteines were present at positions structurally homologous to C-878 and C-900. This conservation was also visualized using the DALI Stacked Logos diagram, in which height represents the degree of sequence conservation.
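The extraction step that precedes the DALI submission can be sketched in a few lines. This is an illustrative assumption about how such a fragment might be cut from a model, not a record of the tool actually used: it assumes a single-chain model in fixed-column PDB format, and the function name is hypothetical.

```python
def extract_residue_range(pdb_lines, first, last):
    """Keep ATOM records whose residue number lies in [first, last], inclusive."""
    kept = [
        line
        for line in pdb_lines
        # Residue sequence number occupies PDB columns 23-26.
        if line.startswith("ATOM") and first <= int(line[22:26]) <= last
    ]
    # Terminate the fragment with an END record so it reads as a complete file.
    return kept + ["END"]
```

For the region described above, `extract_residue_range(lines, 876, 912)` would return the fragment to write to a new file before uploading it as the query structure.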
Because chains of fibronectins and mucins were highly abundant among the top DALI hits, the ChimeraX-1.3 Matchmaker tool was used to align a representative protein chain from each family (fibronectin: 2rkz-D; mucin: 6tm2-A) to the region surrounding the stl300 mutation site. This revealed that both structures contain disulfide bonds at cysteines structurally analogous to C-878 and C-900. Altogether, this analysis strongly supports the hypothesis that the stl300 C-878-S mutation disrupts a disulfide bond.
To further explore the structural implications of the stl300 C-878-S mutation, the structure of the first third of the stl300 sequence was predicted using AlphaFold2. The top-ranked model had an average pLDDT score (77.2) similar to that of the wildtype scospondin prediction (77.5). The stl300 structure was aligned to the wildtype structure using the ChimeraX-1.3 Matchmaker tool. Overall, the stl300 model aligned well with the wildtype structure. Within the stl300 mutation region, however, there were large shifts in the location of the carbon chains, as evidenced by the 2.615 Å shift of C-900 between the two structures. Together, these predictions indicate that the stl300 C-878-S mutation has structural consequences.
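The two quantities compared above, average pLDDT and the shift of a residue between superposed models, can be computed directly from the model files. The sketch below rests on several assumptions: AlphaFold2 writes per-residue pLDDT into the B-factor column of its PDB output, the models are single-chain, the two structures have already been superposed (for example in ChimeraX) and saved in a common frame, and displacement is measured at the C-alpha atom. The text does not state which atom of C-900 was measured, so the atom choice and the function names are illustrative.

```python
import math


def ca_records(pdb_lines):
    """Map residue number -> ((x, y, z), b_factor) for every CA atom."""
    records = {}
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            # B-factor column (61-66) holds pLDDT in AlphaFold2 output.
            records[int(line[22:26])] = (xyz, float(line[60:66]))
    return records


def mean_plddt(pdb_lines):
    """Average pLDDT over residues, using one CA atom per residue."""
    records = ca_records(pdb_lines)
    return sum(b for _, b in records.values()) / len(records)


def ca_displacement(aligned_a, aligned_b, residue_number):
    """Distance in angstroms between the same residue's CA in two superposed models."""
    xyz_a, _ = ca_records(aligned_a)[residue_number]
    xyz_b, _ = ca_records(aligned_b)[residue_number]
    return math.dist(xyz_a, xyz_b)
```

Using C-alpha atoms keeps the comparison valid at position 878, where the wildtype cysteine and the mutant serine do not share the same side-chain atoms.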
Map of stl300 C-878-S mutation site
Top 25 Structural Homologs from DALI search
Hits against the C-876 to L-910 query sequence included chains from fibronectins (16), mucins (5), coagulation factors (2), von Willebrand factors (1), and FN1 proteins (1).
C-alpha trace of C-876 to L-910
C-878 and C-900 are both highly conserved across structural homologs. Residues are colored according to sequence conservation across the top 25 DALI structural homologs (low to high).
Multiple Structural Alignment of Top 25 Structural Homologs
C-878 (query C-3) and C-900 (query C-25) are both highly conserved across structural homologs.
Stacked DALI logos of C-876 to L-910
C-878 (query C-3) and C-900 (query C-25) are both highly conserved across structural homologs. The height of each residue represents the sequence conservation in the structural alignment.
Mucin-2 Structural Alignment
Structurally analogous cysteines form a disulfide bond. Structural alignment of Mucin-2 (6tm2-A, orange) to stl300 mutation region (C-876 to L-910, blue).
Fibronectin Structural Alignment
Structurally analogous cysteines form a disulfide bond. Structural alignment of Fibronectin (2rkz-D, pink) to stl300 mutation region (C-876 to L-910, blue).