As the AI model predicted the glycoprofile more accurately when PHA-E was deglycosylated, PHA-E should perhaps be deglycosylated before running glycan-lectin ELISAs. This also suggests that other lectins could show improved glycoprofile predictions when deglycosylated. However, replicate experiments, as well as experiments on other lectins, are necessary to confirm this.
Future studies on lectin structure and glycosylation sites could be useful to determine the role of lectin glycosylation. For instance, some lectins exhibited less of a mobility shift than expected after deglycosylation, and denaturing deglycosylation was required to achieve the full extent of deglycosylation shown by the gel mobility shifts. It is possible that these lectins have glycans on internal, inaccessible parts of their structure. In the future, it would be useful to determine whether steric hindrance caused the partial deglycosylation. Further, we need to test whether internal glycans are capable of glycan-lectin binding and whether they are crucial to maintaining lectin structure. If lectin glycosylation is not an important component of lectin structure or function, this may suggest that deglycosylation causes no harm and can be incorporated into glycan sequencing procedures.
Further investigation is also needed to determine why deglycosylating some lectins improved binding affinity while deglycosylating others reduced it. One way to investigate this is to identify conserved and non-conserved glycosylation sites on lectins. If a glycosylation site is conserved across multiple lectin protein families, then the glycan is likely necessary for protein structure or function. This would suggest that removing the glycan could hinder the lectin's binding function, resulting in reduced prediction accuracy. If a glycosylation site is not conserved, however, it is likely not vital to lectin function, and removing the glycan should prevent self-binding or binding by unwanted glycoproteins. There may also be other explanations for the performance changes. Additionally, it could be useful to use protein-folding simulation software to determine whether lectins fold the same way with and without glycosylation.
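As a rough illustration of the conservation check described above, the sketch below scans a set of homologous sequences for the N-linked glycosylation sequon (Asn, then any residue except Pro, then Ser or Thr) and reports how often each site occurs. It assumes the glycans of interest are N-linked and that the sequences are already aligned, ungapped, and equal in length. The sequences and function names are made up for illustration and are not real lectins.

```python
import re

# N-linked sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons be reported as well.
SEQUON = re.compile(r"(?=N[^P][ST])")

def sequon_positions(seq):
    """Return the 0-based start positions of N-X-S/T sequons in a sequence."""
    return {m.start() for m in SEQUON.finditer(seq)}

def sequon_conservation(aligned_seqs):
    """Fraction of sequences carrying a sequon at each position.

    Assumes the sequences are pre-aligned, ungapped, and equal in length.
    """
    counts = {}
    for seq in aligned_seqs:
        for pos in sequon_positions(seq):
            counts[pos] = counts.get(pos, 0) + 1
    return {pos: n / len(aligned_seqs) for pos, n in sorted(counts.items())}

# Toy, made-up sequences (not real lectins), for illustration only.
homologs = [
    "MKNATLLNGSA",
    "MKNVSLLQGSA",
    "MRNATLLNPSA",
]
conservation = sequon_conservation(homologs)
```

In this toy set the site at position 2 appears in every sequence, while the site at position 7 appears in only one of three, so the first would be the candidate for a structurally or functionally important glycan. A sequon only marks a potential site; whether it is actually glycosylated would still need experimental confirmation.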
The consistent inability to predict glycoprofiles for both glycosylated- and deglycosylated-lectin assays indicates issues with the AI model. Specifically, the lectin-glycan binding assumptions the model was trained on are likely incorrect or incomplete. Additionally, the current model was trained only on glycosylated lectins; future analysis using a model trained on deglycosylated-lectin binding data is necessary to fully test whether deglycosylating lectins is beneficial.
The AI model may also have performed better than we measured, since it outputs only a single predicted glycan and was scored on exact matches. Some glycans have very similar structures, and some of the model's errors could have been predictions of a glycan that was only slightly different from the true glycan. To determine whether this was the case, a future direction would be to investigate the branched structure of each predicted and expected glycan, create a metric to compare similarity between glycans, and score the model with an accuracy measure that gives partial credit for predicting a glycan similar to the true glycan.
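One possible form of such a partial-credit score is sketched below. It assumes each glycan is written as a nested `(residue, [children])` tree and compares glycans by the overlap of their parent-child residue pairs (a multiset Jaccard similarity). The representation, the function names, and the choice of metric are illustrative assumptions rather than the model's actual scoring, and the sketch ignores linkage positions and anomeric configuration.

```python
from collections import Counter

def edges(tree, parent="root"):
    """Yield (parent, child) residue pairs from a nested (residue, [children]) tree."""
    residue, children = tree
    yield (parent, residue)
    for child in children:
        yield from edges(child, residue)

def glycan_similarity(a, b):
    """Multiset Jaccard similarity of residue-pair counts, between 0 and 1."""
    ca, cb = Counter(edges(a)), Counter(edges(b))
    union = sum((ca | cb).values())
    return sum((ca & cb).values()) / union if union else 1.0

def partial_credit_accuracy(predicted, expected):
    """Mean similarity between each predicted glycan and its expected glycan."""
    scores = [glycan_similarity(p, e) for p, e in zip(predicted, expected)]
    return sum(scores) / len(scores)

# Toy example: an N-glycan core, and the same core with one added fucose.
core = ("GlcNAc", [("GlcNAc", [("Man", [("Man", []), ("Man", [])])])])
core_fuc = ("GlcNAc", [("Fuc", []),
                       ("GlcNAc", [("Man", [("Man", []), ("Man", [])])])])

similarity = glycan_similarity(core, core_fuc)
score = partial_credit_accuracy([core, core_fuc], [core, core])
```

Under exact-match scoring, predicting the fucosylated core when the plain core was expected counts as fully wrong; here it earns 5/6, because five of the six residue pairs are shared. A metric that also encodes linkage type would be stricter and likely more biologically meaningful.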
Additionally, lectin ELISA data is currently expensive to generate, and there is not enough of it to properly train a model. To compensate, the model was trained on simulated data generated from the literature rather than on direct lectin ELISA data. Next steps should involve finding a way to simulate lectin ELISA data more closely. Ideally, more lectin ELISA data would be found or generated to train the model on. Other options include training the model in a way that makes fuller use of limited data (such as nested k-fold or leave-one-out cross-validation, which allow more of the data to be used in training), or generating training data directly from experimental ELISA data rather than mass-spectrometry data.
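To make the leave-one-out idea concrete, the sketch below holds out each sample once and trains on all the others, so that every measurement contributes to training in all but one round. The 1-nearest-neighbor classifier is only a stand-in for the real model, and the binding signals and glycan labels are invented for illustration.

```python
def leave_one_out_splits(n_samples):
    """Yield (train_indices, test_index), holding out each sample exactly once."""
    for held_out in range(n_samples):
        train = [i for i in range(n_samples) if i != held_out]
        yield train, held_out

def nearest_neighbor_predict(train_x, train_y, query):
    """Placeholder model: return the label of the closest training sample."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    return train_y[dists.index(min(dists))]

def loo_accuracy(features, labels):
    """Fraction of held-out samples predicted correctly across all rounds."""
    correct = 0
    for train, test in leave_one_out_splits(len(features)):
        pred = nearest_neighbor_predict(
            [features[i] for i in train],
            [labels[i] for i in train],
            features[test],
        )
        correct += pred == labels[test]
    return correct / len(features)

# Made-up lectin binding signals (one row per sample) and glycan labels.
signals = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
glycans = ["A", "A", "B", "B"]
accuracy = loo_accuracy(signals, glycans)
```

With n samples this requires n training runs, which is affordable precisely when data is scarce. Nested k-fold cross-validation follows the same pattern but adds an inner loop for tuning model settings on the training portion only.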
Page Leader: Kyra Hulse