AlphaFold2 (Jumper et al., 2021a, 2021b) was used to predict the structure of the Danio rerio scospondin protein. The primary amino acid sequence was downloaded from UniProt (B3LF39) as a .fasta file. Attempts to fold the full-length sequence (4,990 amino acids) with AlphaFold2 (v2.1.0) at the Texas Advanced Computing Center (TACC) resulted in memory errors. The sequence was therefore split into three approximately equal segments. Segment boundaries were chosen based on the reported domain architecture of scospondin (Accession: CAJ44080) so that no domain was divided between segments. The first segment comprised residues M-1 to F-1664, the middle segment residues T-1665 to D-3320, and the last segment residues W-3321 to G-4990. Domain architecture maps were created using MacVector (v18.2.5). The three segments were folded on a Maverick2 allocation at TACC using the reduced template setting to improve speed. The top-ranked .pdb files were opened in ChimeraX-1.3 (Pettersen et al., 2020) to visualize the overall protein structure as well as the region surrounding the stl300 mutation site (C-878). The structures were colored by reported domain and by AlphaFold2 confidence score (pLDDT).
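The segmentation step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function names (read_fasta_sequence, split_by_boundaries, to_fasta) and the placeholder sequence are hypothetical, and only the three residue ranges are taken from the text.

```python
from typing import List, Tuple

# Segment boundaries from the text, 1-based and inclusive.
SEGMENTS: List[Tuple[int, int]] = [(1, 1664), (1665, 3320), (3321, 4990)]


def read_fasta_sequence(text: str) -> str:
    """Return the sequence of a single-record FASTA string, header removed."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return "".join(ln for ln in lines if not ln.startswith(">"))


def split_by_boundaries(seq: str, segments: List[Tuple[int, int]]) -> List[str]:
    """Cut seq into pieces using 1-based inclusive (start, end) residue ranges."""
    pieces = []
    for start, end in segments:
        if not (1 <= start <= end <= len(seq)):
            raise ValueError(f"range {start}-{end} falls outside the sequence")
        # Convert the 1-based inclusive range to a 0-based half-open slice.
        pieces.append(seq[start - 1:end])
    return pieces


def to_fasta(name: str, seq: str, width: int = 60) -> str:
    """Format one FASTA record with wrapped sequence lines."""
    body = "\n".join(seq[i:i + width] for i in range(0, len(seq), width))
    return f">{name}\n{body}\n"


if __name__ == "__main__":
    # Placeholder sequence of the right length; NOT the real B3LF39 sequence.
    placeholder = "A" * 4990
    for (start, end), piece in zip(SEGMENTS, split_by_boundaries(placeholder, SEGMENTS)):
        # Hypothetical record names; each record could be written to its own file.
        record = to_fasta(f"segment_{start}_{end}", piece)
        print(record.splitlines()[0], len(piece))
```

With these boundaries the three segments contain 1,664, 1,656, and 1,670 residues, which together account for all 4,990 residues. Each segment would be written to its own .fasta file and submitted as a separate prediction.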
The stl300 mutation site (C-878) lies within the first segment (M-1 to F-1664) of scospondin. ChimeraX-1.3 was used to isolate the region surrounding the mutation site (C-876 to L-910) from the top-ranked AlphaFold2 prediction of this segment. The resulting .pdb file was uploaded to the DALI (distance matrix alignment) server (Holm et al., 2020) to search for structural homologs in the Protein Data Bank (PDB). The top 25 matches against the full PDB were selected for further inspection. The 3D Superimposition tool was used to color the query C-alpha trace by sequence conservation, and side chains at positions scoring above 5.51 bits were displayed. The Structural Alignments tool was used, without expanding gaps, to generate structural alignments and DALI logo diagrams of sequence conservation at each position. A small number of PDB chains were then examined more closely in ChimeraX-1.3: the Matchmaker tool was used to align the stl300 structural region to each selected structural homolog, and the Distance tool was used to measure interatomic distances.
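For readers who want to reproduce the region isolation and distance measurements outside ChimeraX, the following Python sketch shows one possible approach. It is not the workflow used here, which relied on the ChimeraX graphical tools. The function names (extract_region, atom_coords, distance) are hypothetical, the chain identifier "A" is an assumption about the prediction file, and the parser relies only on the fixed column layout of PDB ATOM records. Because the first segment begins at M-1, residue numbers in its predicted .pdb file are expected to match the full-length numbering; for the other two segments an offset would be needed.

```python
import math
from typing import List, Tuple

Coord = Tuple[float, float, float]


def extract_region(pdb_text: str, start: int, end: int, chain: str = "A") -> List[str]:
    """Keep ATOM records of one chain whose residue number lies in [start, end]."""
    kept = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        # Fixed PDB columns: chain ID in column 22, residue number in columns 23-26.
        if line[21] != chain:
            continue
        if start <= int(line[22:26]) <= end:
            kept.append(line)
    return kept


def atom_coords(records: List[str], resnum: int, atom_name: str) -> Coord:
    """Return (x, y, z) in angstroms for the named atom of one residue."""
    for line in records:
        # Atom name occupies columns 13-16; coordinates occupy columns 31-54.
        if int(line[22:26]) == resnum and line[12:16].strip() == atom_name:
            return float(line[30:38]), float(line[38:46]), float(line[46:54])
    raise KeyError(f"atom {atom_name} of residue {resnum} not found")


def distance(a: Coord, b: Coord) -> float:
    """Euclidean distance between two points, in the units of the input."""
    return math.dist(a, b)
```

As a usage example, extract_region(text, 876, 910) would return the ATOM records for C-876 to L-910, and distance(atom_coords(region, 878, "SG"), atom_coords(region, n, "SG")) would give the sulfur-to-sulfur distance between C-878 and a second cysteine at a residue number n chosen by the user.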