Software
Description:
Ultra-fast, fully automated, and accurate de novo modeling, prediction, and identification of large macromolecular complex structures from 3D electron microscopy maps. We introduce deep learning methods and optimization techniques to predict the locations of atoms and multi-level structural elements. We target the ab initio all-atom structure prediction problem for extremely large complexes, such as viruses, bacteria, and other molecular machines. The work has been featured in PNAS research highlights and Nature Computational Science research highlights.
Citation:
A. Nakamura, H. Meng, M. Zhao, F. Wang, J. Hou, R. Cao, D. Si*, “Fast and Automated Protein-DNA/RNA Macromolecular Complex Modeling from Cryo-EM Maps”, Briefings in Bioinformatics, 2023, bbac632, https://doi.org/10.1093/bib/bbac632
D. Si*, J. Chen, A. Nakamura, L. Chang, H. Guan, “Smart De Novo Macromolecular Structure Modeling from Cryo-EM Maps”, Journal of Molecular Biology, 2023, 167967, ISSN 0022-2836, https://doi.org/10.1016/j.jmb.2023.167967
J. Pfab, N. M. Phan, D. Si*, “DeepTracer for fast de novo cryo-EM protein structure modeling and special studies on CoV-related complexes”, Proceedings of the National Academy of Sciences, Jan 2021, 118 (2) e2017525118, https://doi.org/10.1073/pnas.2017525118
D. Si*, et al. (2022), "DeepTracer Web Service for Fast and Accurate de novo Protein Complex Structure Prediction from Cryo-EM", Algorithms and Methods in Structural Bioinformatics. Computational Biology. Springer, Cham. https://doi.org/10.1007/978-3-031-05914-8_6
DeepTracer Platform:
Media News:
iCare (with Dr. Yuwen Weichao and Dr. Bill Erdly)
Description:
The iCare project aims to provide support for people with mental health issues through two chatbots: the Carebot and Coachbot. The Carebot provides empathetic and therapeutic responses to people with mental health issues, while the Coachbot acts as a virtual coach to help train caregivers on how to interact with people with mental health issues. The project emphasizes the importance of empathetic interaction in therapy and providing training to caregivers to improve patient care.
Description:
iREACH is a project to build an artificially intelligent, human-like chatbot that uses research- and evidence-based Cognitive Behavioral Therapy (CBT) alongside cutting-edge AI and machine learning techniques to coach caregivers of patients with psychosis.
iREACH Website:
C-CNN Backbone Prediction
Description:
We introduce a deep learning model that uses a set of cascaded convolutional neural networks (CNNs) to predict Cα atoms along a protein’s backbone structure. The cascaded CNN (C-CNN) is a novel deep learning architecture composed of multiple CNNs, each predicting a specific aspect of a protein’s structure.
Citation:
D. Si*, S.A. Moritz, J. Pfab, et al. “Deep Learning to Predict Protein Backbone Structure from High-Resolution Cryo-EM Density Maps”. Sci Rep 10, 4282 (2020). https://www.nature.com/articles/s41598-020-60598-y
C. L Lawson*, …, J. Pfab, …, D. Si, et al., “Cryo-EM model validation recommendations based on outcomes of the 2019 EMDataResource challenge”. Nat Methods, 18, 156–164 (2021). https://www.nature.com/articles/s41592-020-01051-w
GitHub:
Description:
Where AI and Nursing Meet. COCO combines cutting-edge conversational AI technologies, evidence-based therapies, and insights from top care professionals to tailor your care plan.
Citation:
W. R. Kearns, N. Kaura, M. Divina, C. Vo, D. Si, T. Ward, and W. Yuwen. 2020. A Wizard-of-Oz Interface and Persona-based Methodology for Collecting Health Counseling Dialog. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–9. DOI: https://doi.org/10.1145/3334480.3382902
S. Zhai, S.C. Cheng, D. Si, T.M. Ward, W. Erdly, W. Yuwen, “Participatory Design of a Tailored Self-Management Program for Caregivers of Children”. Western Institute of Nursing Annual Communicating Nursing Research Conference, Portland, OR, April 15-18, 2020.
CocoBot Website:
Media News:
Psychosis NLP (with Dr. Sunny Cheng)
Description:
We aim to use machine learning to distinguish the speech of patients with psychosis-causing mental disorders from that of healthy individuals, in order to improve early detection of schizophrenia.
Citation:
P. Saltz, S.Y. Lin, S. C. Cheng, D. Si*, “Dementia Detection using Transformers-Based Deep Learning and Natural Language Processing Models”, HealthNLP 2021
S. Zhai, S.C. Cheng, D. Si, T.M. Ward, W. Erdly, W. Yuwen, “Participatory Design of a Tailored Self-Management Program for Caregivers of Children”. Western Institute of Nursing Annual Communicating Nursing Research Conference, Portland, OR, April 15-18, 2020.
D. Si*, S. C. Cheng, R. Xing, C. Liu and H. Y. Wu, "Scaling up Prediction of Psychosis by Natural Language Processing," 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), Portland, OR, USA, 2019, pp. 339-347, doi: 10.1109/ICTAI.2019.00055.
C. Liu, H. Y. Wu, R. Xing, T. Quang, S. C. Cheng, D. Si, “Computational Psychiatric Nursing Research: Scaling up the Prediction of Psychosis by Natural Language Processing”, Proceeding of the 11th International Conference on Early Intervention in Mental Health, Boston, Massachusetts, USA, 7th–10th October 2018.
GitHub:
Auto-Thresholding
Description:
Processing a cryo-EM map computationally requires an electron density threshold level, which defines a lower bound for density values. A threshold customized to each map has to be selected before modeling or structure prediction in order to reduce noise. Automating this selection makes predictions easier to run and removes the dependence of prediction accuracy on a person's ability to choose the right threshold value. We present a method that automates threshold selection for problems that require a density threshold level. The method uses two metrics, the surface-area-to-volume ratio and the ratio of voxels above the threshold level to non-zero voxels, to derive characteristics of suitable threshold levels from a training dataset.
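As a rough illustration of the two metrics named above, the sketch below computes them for a single candidate threshold on a 3D density array. It is not the published implementation: the function name `threshold_metrics` is hypothetical, and approximating the surface area by counting exposed voxel faces is an assumption made here for simplicity. How the metrics are mapped to a final threshold using the training dataset is not shown.

```python
import numpy as np

def threshold_metrics(density, threshold):
    """Return (surface_to_volume, above_to_nonzero) for one threshold level.

    Assumption: surface area is approximated by the number of voxel faces
    that separate a voxel above the threshold from one at or below it.
    """
    mask = density > threshold
    volume = int(mask.sum())                      # voxels above the threshold
    nonzero = int(np.count_nonzero(density))      # voxels with any density
    if volume == 0 or nonzero == 0:
        return 0.0, 0.0

    # Pad with "outside" voxels so faces on the map boundary are counted too.
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    steps = padded.astype(np.int8)
    faces = 0
    for axis in range(3):
        # A non-zero difference between neighbours marks one exposed face.
        faces += int(np.count_nonzero(np.diff(steps, axis=axis)))

    return faces / volume, volume / nonzero
```

Sweeping `threshold` over a range of values and recording both numbers gives the kind of per-threshold profile from which characteristics of suitable levels could be derived.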
Citation:
J. Pfab, D. Si*, “Automated Threshold Selection for Cryo-EM Density Maps”, BCB '19: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, September 2019, Pages 161–166, https://doi.org/10.1145/3307339.3342190
GitHub:
Resolution Validation and Estimation
Description:
We first propose supervised deep learning methods to extract representative 3D features at high, medium and low resolutions from simulated protein density maps and build classification models that objectively validate resolutions of experimental 3D cryo-EM maps.
Citation:
T. K. Avramov, D. Vyenielo, J. G. Blanco, S. Adinarayanan, J. Vargas*, D. Si*, “Deep Learning for Validating and Estimating Resolution of Cryo-Electron Microscopy Density Maps”, Molecules, 24(6), 1181, 2019. DOI: 10.3390/molecules24061181.
T. Avramov, D. Si*, “Deep Learning for Resolution Validation of Three Dimensional Cryo-Electron Microscopy Density Maps”, In Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics (BCB '18). ACM, New York, NY, USA, 669-674. DOI: https://doi.org/10.1145/3233547.3233712
GitHub:
EMNets - "QR code generator" for 3D EM
Description:
EMNets provides an effective and accurate way to represent protein surfaces and search for similar ones, and thus contributes to biomedical research. The method uses a Convolutional Autoencoder (CAE) neural network to learn the geometric information of three-dimensional (3D) density maps in a data-driven manner. It represents a 3D cryo-electron microscopy density map with a descriptor of only 256 numeric variables, called the EMNets descriptor. Based on the EMNets descriptor, we can retrieve similar protein surfaces in real time using the k-nearest-neighbor algorithm. Search results for protein surfaces represented with the EMNets descriptor show high agreement with the existing Combinatorial Extension (CE) algorithm for sequence and structure similarity search. Overall, EMNets is a powerful tool for comparing 3D protein structures obtained by cryo-electron microscopy.
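The retrieval step can be sketched as a brute-force k-nearest-neighbor search over 256-dimensional descriptors. This is only an illustration: the function name `knn_search`, the use of Euclidean distance, and the brute-force strategy are assumptions, and the autoencoder that would produce the descriptors is not included.

```python
import numpy as np

def knn_search(query, descriptors, k=5):
    """Return (indices, distances) of the k descriptors closest to `query`.

    `descriptors` has shape (n, 256) and `query` has shape (256,).
    Euclidean distance is an assumption made for this sketch.
    """
    dists = np.linalg.norm(descriptors - query, axis=1)
    k = min(k, len(descriptors))
    # argpartition finds the k smallest distances without a full sort.
    nearest = np.argpartition(dists, k - 1)[:k]
    # Order the k candidates from closest to farthest.
    nearest = nearest[np.argsort(dists[nearest])]
    return nearest, dists[nearest]
```

With a few thousand stored descriptors, a search of this kind is a single vectorized distance computation, which is consistent with the real-time retrieval described above.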
Citation:
J. Yang, R. Cao, D. Si*, “EMNets: A Convolutional Autoencoder for Protein Surface Retrieval Based on Cryo-Electron Microscopy Imaging”, In Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics (BCB '18). ACM, New York, NY, USA, 639-644. DOI: https://doi.org/10.1145/3233547.3233707
VolumeCut
Description:
VolumeCut and VolumeCompare are two tools developed in the lab by former undergraduate student Michael Nissenson for use by the larger cryo-EM community. Both are implemented as add-ons for UCSF Chimera, currently one of the most popular cryo-EM visualization platforms. The first tool, VolumeCut, lets users quickly generate experimental training data for machine learning algorithms, reducing the time it takes to create such data from hours to seconds. The second tool, VolumeCompare, provides a rough estimate of the quality of experimental cryo-EM data, allowing users to find out whether some data is missing from it.
Citation:
M. Nissenson, D. Si*, “Automated Protein Chain Isolation from 3D Cryo-EM Data and Volume Comparison Tool”, In Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB 2017). ACM, New York, NY, USA, 685-690. DOI: https://doi.org/10.1145/3107411.3107500
S. Mason, F. Jagodzinski, B. Chen, M. Nissenson, D. Si, A. Valliani, A. Soni, X. Fang, W. Qiao, and A. Shehu. 2017. ACM-SIGBIO undergraduate research highlight. ACM SIGBioinformatics Rec. 7, 2, Article 2 (October 2017), 3 pages. DOI: 10.1145/3148241.3148243.
GitHub:
StrandTwister
Description:
A secondary structure detection tool that detects beta-strands in protein electron cryo-microscopy (cryo-EM) density maps at medium to low resolutions (5-15 Å).
Citation:
D. Si, J. He, "Tracing Beta Strands Using StrandTwister from Cryo-EM Density Maps at Medium Resolutions", Structure - Cell Press, p1665-1676, Volume 22, Issue 11, 2014. DOI: 10.1016/j.str.2014.08.017.
D. Si, J. He, “Orientations of Beta-strand Traces and Near Maximum Twist”, Proceedings of the ACM International Conference on Bioinformatics, Computational Biology and Health Informatics (ACM-BCB), p690-694, Newport Beach CA, Sept 20-23, 2014.
D. Si, J. He, "Beta-sheet Detection and Representation from Medium Resolution Cryo-EM Density Maps", Proceedings of the ACM International Conference on Bioinformatics, Computational Biology and Biomedical Informatics (ACM-BCB), p764-770, Washington, D.C., September 22-25, 2013.
Website:
SSETracer and StrandTwister homepage (NSF funded project: DBI-1356621, PI: Jing He)
SSELearner
Description:
A machine learning approach for the detection of secondary structures from 3D protein Cryo-EM density maps at medium resolutions.
Citation:
D. Si, S. Ji, K. Al Nasr, J. He, "A Machine Learning Approach for the Identification of Protein Secondary Structure Elements from Electron Cryo-Microscopy Density Maps", Biopolymers, Volume 97, Issue 9, pages 698-708, 2012.