The four proteins that are the focus of this study, the DNA-binding protein ethylene-responsive transcription factor 3 (EREBP-3), leucine zipper protein, coronatine-insensitive 1 (Bunsupa 2012), and 13-hydroxylupanine O-tigloyltransferase (Okada 2005), were run through BLAST searches on UniProt, and the results (shown in Figure 3 below) were organized into taxonomic trees to explore the frequency of similar sequences.
Consistent with what I would expect from the literature of Bunsupa and Okada, the taxonomic trees show an overlap within the 50-kb inversion clade. This clade includes Arachis (peanut), Trifolieae, and Lupinus, a grain legume genus within the tribe Genisteae that is known to contain quinolizidine alkaloids (QAs). One of the most visible and consistent Lupinus overlaps, spanning all four proteins, is Lupinus angustifolius, one of the commonly used lupin crop species, in which QA content is known to be more consistently moderated. Another interesting distribution that the taxonomic trees make visible is the presence of similar proteins in the tribe Phaseoleae, specifically hits for all four proteins in Mucuna pruriens (velvet bean) and Phaseolus vulgaris (kidney bean and French bean), which could be an interesting branch to explore once more of the proteins involved in QA biosynthesis become known.
Figure 3. Taxonomic Trees of the proteins
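The species overlap described above can also be checked directly from the BLAST result files. The following is a minimal sketch, assuming the hits were downloaded from UniProt in FASTA format, where each header carries an entry name ending in a species mnemonic (for example LUPAN or SOYBN). The function names and the example header in the comment are hypothetical and are not part of the original analysis.

```python
import re
from collections import Counter

# UniProt FASTA headers have the general shape (accession below is made up):
#   >tr|X00001|X00001_LUPAN Some protein OS=Lupinus angustifolius OX=...
# The entry name after the second "|" ends in a species mnemonic.
HEADER_RE = re.compile(r"^>(?:sp|tr)\|[^|]+\|\S+_([A-Z0-9]+)")


def species_counts(fasta_text):
    """Count hits per species mnemonic in one FASTA result set."""
    counts = Counter()
    for line in fasta_text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def shared_species(result_sets):
    """Return the mnemonics that occur in every result set."""
    species = [set(species_counts(text)) for text in result_sets]
    return set.intersection(*species) if species else set()
```

Calling `shared_species` on the four FASTA texts (one per protein) would list the species that have at least one hit for all four proteins, which is the kind of overlap the taxonomic trees make visible.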
The FASTA files resulting from the BLAST searches were then aligned with the Clustal Omega tool (Madeira 2019) so that the similar proteins could be compared and visualized with Jalview (Waterhouse 2009). The written drop-down sections below analyze the results for each of the four proteins, as visualized with Jalview. Clicking the black buttons below each analysis opens a new window in which you can view the full results and the associated phylogenetic tree.
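As a rough illustration of what a consensus row summarizes, the sketch below reads an alignment saved in aligned FASTA format and reports, for each column, the most common residue and the fraction of sequences that share it. This is a simplified measure under stated assumptions, not the calculation Jalview uses for its conservation or quality annotations, and the function names are hypothetical.

```python
from collections import Counter


def read_aligned_fasta(text):
    """Parse aligned FASTA text into a dict of name -> gapped sequence."""
    seqs, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            name = line[1:].split()[0]
            seqs[name] = ""
        elif line and name is not None:
            seqs[name] += line
    return seqs


def column_consensus(seqs):
    """Return (consensus residue, fraction of sequences sharing it) per column.

    Gaps ("-") count toward the denominator but are never chosen as the
    consensus residue; an all-gap column yields ("-", 0.0).
    """
    rows = list(seqs.values())
    result = []
    for column in zip(*rows):
        residues = Counter(c for c in column if c != "-")
        if residues:
            residue, count = residues.most_common(1)[0]
            result.append((residue, count / len(column)))
        else:
            result.append(("-", 0.0))
    return result
```

Counting gaps in the denominator is a design choice: it keeps a column that is mostly gaps from looking strongly conserved just because its few residues agree.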
BLAST results from the ethylene-responsive transcription factor 3 (EREBP-3) protein search are displayed in the overview photo in the image carousel below. This overview illustrates that there is one major conserved region, spanning alignment columns 552 to 608. This region has enough quality to produce the consensus sequence shown at the bottom of the second photo. It stands out as the only strongly conserved area in this result, with only a handful of mildly conserved patches after it. The third photo shows the conserved area with percent identity scores indicated by light to dark blue, showing the extreme protein similarity in this area.
In the phylogenetic tree for this protein set, available with the button below, the evolutionary distance between the proteins can be approximated. The proteins associated with Lupinus (labeled LUPAN) appear several times, clustered in nodes at different points in the tree. As this happens at least five times over the length of the tree, this suggests that the proteins may have evolved multiple times. The closest proteins to these, on a different branch of the same node, include proteins found in soybean (SOYBN), indicating that they may share a closer evolutionary relationship.
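One simple way to approximate the distance between two aligned sequences is the p-distance, the fraction of aligned positions at which they differ. The sketch below is only an illustration of that idea and is not the method used to build the trees on this page; the function names are hypothetical, and the sequences in the usage lines are invented for demonstration.

```python
def p_distance(seq_a, seq_b):
    """Fraction of compared columns that differ between two aligned sequences.

    Columns with a gap ("-") in either sequence are skipped. If no columns
    can be compared, the maximum distance of 1.0 is returned.
    """
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 1.0
    return sum(a != b for a, b in pairs) / len(pairs)


def nearest_neighbours(seqs, query):
    """Return (distance, name) for every other sequence, closest first."""
    return sorted(
        (p_distance(seqs[query], seq), name)
        for name, seq in seqs.items()
        if name != query
    )


# Invented toy alignment, for illustration only.
toy = {"X1_LUPAN": "MKVLA", "X2_SOYBN": "MKVLG", "X3_ARAHY": "MRILG"}
closest = nearest_neighbours(toy, "X1_LUPAN")
```

A tree-building method such as neighbour joining starts from a matrix of pairwise distances like these, which is why sequences with small distances tend to end up on nearby branches.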
Results from the leucine zipper protein BLAST search are shown in the overview photo in the image carousel below. Similar to the last dataset, there is one major area that is highly conserved. This area lies between alignment columns 672 and 729 and has a high-quality consensus, shown in the second photo. This consensus shows a large area of positively charged amino acids, as evidenced by the thick stripe of red, which is the associated Clustal X color. The third photo, showing the percent identity for this area, indicates that the columns with high conservation range from high to low percent identity. This patchy result indicates that similarity across the amino acid sequences is inconsistent.
The phylogenetic tree associated with this dataset, available below, shows the protein associated with Lupinus on the same node as a protein associated with Arachis hypogaea (peanut, ARAHY), indicating that they may share a close evolutionary relationship.
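A column can be conserved in its chemical character while still having a low percent identity, which may explain part of the patchy pattern described above. The sketch below illustrates this with a deliberately simplified check, assuming that the positively charged residues colored red in the Clustal X scheme are lysine (K) and arginine (R). It is not Jalview's conservation calculation, and the function names are hypothetical.

```python
# Residues treated as positively charged in this sketch (assumption: K and R).
POSITIVE = set("KR")


def column_identity(column):
    """Percent of non-gap residues that match the most common residue."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    top = max(set(residues), key=residues.count)
    return 100.0 * residues.count(top) / len(residues)


def all_positive(column):
    """True if every non-gap residue in the column is positively charged."""
    residues = [c for c in column if c != "-"]
    return bool(residues) and all(c in POSITIVE for c in residues)


# A column that alternates K and R keeps its charge but has only 50% identity.
mixed_column = "KRKR"
identity = column_identity(mixed_column)
same_property = all_positive(mixed_column)
```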
Coronatine-insensitive 1 BLAST results are displayed in the overview photo below. As the yellow bars at the bottom of the photo show, there are several areas of sporadic conservation. The most densely clustered conservation is shown in the second photo in the image carousel, ranging from alignment column 1974 to 2066. This range contains three areas with moderate to high conservation and enough quality to produce a consensus for the highly conserved columns. The third photo, which shows percent identity scores from light to dark blue, shows that most of the highly conserved areas have a moderate to high score. The handful that do not have a high score mark columns where the aligned sequences are not consistently similar.
The phylogenetic tree associated with this dataset, available below, shows the proteins associated with Lupinus (designated as LUPAN in the tree) in two separate regions, indicating that they may have evolved multiple times. The first region has three Lupinus proteins on their own node, with the closest associated node containing soybean, Phaseolus vulgaris (kidney bean and French bean; PHAVU), and Mucuna pruriens (velvet bean; MUCPR). This indicates the possibility of a close evolutionary relationship between these proteins.
BLAST results of the 13-hydroxylupanine O-tigloyltransferase protein search are displayed in the overview photo in the image carousel below. As is visible from the yellow bars at the bottom of the image, there are several highly conserved regions. The first highly conserved area spans alignment columns 514 to 536, with enough quality to produce the consensus sequence shown in the second photo. Interestingly, there is then a large section of conserved clusters spanning from 583 to 1030, showing that these protein sequences share a large number of similarities.
One of the most densely conserved areas runs from 716 to 776, illustrated in the third and fourth photos. In the percent identity scores, shown in the fourth photo, the dark blue regions match the highly conserved area, confirming that the BLAST hits are highly similar to one another in this region.
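Ranges such as those reported above can also be located programmatically from a list of per-column scores. The following is a minimal sketch, assuming one score between 0 and 1 per alignment column; the threshold and minimum length are arbitrary illustrative choices, not values used in this analysis, and the function name is hypothetical.

```python
def conserved_regions(scores, threshold=0.8, min_length=5):
    """Return (start, end) column ranges, 1-based and inclusive, where every
    score is at least `threshold` for `min_length` or more columns in a row."""
    regions, start = [], None
    for i, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = i  # a new run of conserved columns begins here
        elif start is not None:
            if i - start >= min_length:
                regions.append((start, i - 1))
            start = None
    # Close a run that extends to the last column of the alignment.
    if start is not None and len(scores) - start + 1 >= min_length:
        regions.append((start, len(scores)))
    return regions
```

Raising `min_length` filters out short, isolated patches, which corresponds to separating a major conserved block from the mildly conserved patches around it.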
In the phylogenetic tree, available in the link below, the evolutionary distance between the proteins can be approximated. Since this is an unrooted tree, the distance is relative. One interesting result is that the protein associated with Lupinus (labeled LUPAN in the tree) is on its own branch, with the closest related proteins one node back on a different branch, making it fairly unique. Further laboratory investigation of how interchangeable the soybean proteins and the lupin protein really are might improve understanding of how important this protein is and how flexible it is.
The datasets for each of the four proteins show conserved areas that give clues to the consistent base of each protein. Meanwhile, the phylogenetic trees indicate where possible evolutionary relationships may lie with regard to the Lupinus protein (which is known to be associated with QA): EREBP-3 shows a possible relationship with soybean, the leucine zipper protein shows a closeness to peanut, and coronatine-insensitive 1 shares a relationship with soybean, Phaseolus vulgaris (kidney bean and French bean), and Mucuna pruriens (velvet bean). By contrast, 13-hydroxylupanine O-tigloyltransferase seems to be a loner, with none of the other BLAST proteins sharing a close node. Further laboratory investigation of how interchangeable the evolutionarily close proteins and the lupin protein really are might improve understanding of how important this protein is and how flexible it is in the quinolizidine alkaloid biosynthetic pathway.