Fall 2025 Dates - Every Tuesday
Sept 16, 23, 30
Oct 7, 14, 21, 28
Nov 4, 11, 18, 25
Dec 2, 9
Colloquia (mandatory for all researchers)
Tuesdays Every Week @ 7:00 PM - 8:30 PM (EVERY WEEK!)
https://us06web.zoom.us/j/83346956991?pwd=STJ1SGFUK1VtMjdNRThLKy9KdHNlZz09
Meeting ID: 833 4695 6991 Passcode: 699214
Check out the latest Colloquia uploaded to our YouTube Channel!
Department of Computer Science & Engineering
Enhancing Document Search with Keyword-Based Image Retrieval
Searching for text within a PDF is simple, with commands such as CTRL-F providing quick results, but searching for specific images in an academic document can take substantial time and effort. Scrolling through a document to look for an image is not an efficient way to address the problem. In this project, we aim to create a tool that lets a reader search for images within a PDF by providing only keywords that describe what they are looking for. We extensively evaluated many image recognition models and LLMs that make up the tool's backend to assess the accuracy of its results and the efficiency of the service.
RESEARCHERS: Aaron Ely, Basis Independent Fremont '28
ADVISOR: Liu Lab, Software Engineering
KEYWORDS: Image Retrieval | LLMs | Computer Science | AI | Python
Department of Computer Science & Engineering
Comparative study on three machine learning models in novel autonomous drone-based detection of invasive plant Brassica nigra
California spends around $82 million to manage invasive plants each year. We propose a solution to automate the detection of invasive plant species by creating a machine learning model capable of identifying the presence of Brassica nigra, an annual herb which increases wildfire risk and produces chemicals that prevent the germination of native plants, from autonomous drone footage. We tested three different machine learning models for detecting the invasive plant in our drone footage at different angles and distances: a Convolutional Neural Network (CNN), a Stochastic Gradient Descent Classifier (SGDC), and eXtreme Gradient Boosting (XGBoost). The goal was to find the best model type for this application. We hypothesized that, for the detection of invasive plant species from aerial autonomous drone images, a CNN model would outperform SGDC and XGBoost because of its ability to extract spatial features to find complex visual patterns. Additionally, we hypothesized that SGDC would perform better than XGBoost, as our data is linearly separable and SGDC has the ability to do limited feature extraction. An ANOVA test on the heatmap values of each model indicates a statistically significant difference in the three models' ability to find important features, with a p value of 9.2e-16 at an alpha level of 5%. We conclude that CNNs are the most suitable model for detecting invasive plants from drone footage, surpassing the other two models with an accuracy of 99.4%.
RESEARCHERS: Chloe Ho, Basis Independent Silicon Valley '26; Sahiti Pantangi, Washington High School '28
ADVISOR: McMahan Lab, Quantum Computing & Computer Science
KEYWORDS: Invasive plant | Machine learning | Convolutional Neural Network | Autonomous Drone
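As a rough illustration of the one-way ANOVA comparison mentioned in the abstract above, the sketch below computes the F statistic for three groups of scores in plain Python. The function name `one_way_anova_f` and the toy numbers are hypothetical and are not the study's heatmap data. Turning the F statistic into a p value additionally requires the F distribution (for example via `scipy.stats.f_oneway`), which is left out here to keep the example dependency-free.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over `groups`.

    `groups` is a list of lists of numbers, one inner list per model.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical per-image scores for three models (illustrative values only).
toy_scores = {
    "cnn": [0.91, 0.93, 0.95],
    "sgdc": [0.71, 0.73, 0.75],
    "xgboost": [0.61, 0.63, 0.65],
}
f_stat, df_b, df_w = one_way_anova_f(list(toy_scores.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F relative to the F distribution with the printed degrees of freedom corresponds to a small p value, which is the kind of evidence the abstract reports.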
Department of Chemistry, Biochemistry & Physics
Evaluation of machine learning models for the classification of optimal coupling agents in amide coupling reactions
Reaction optimization is a time-, resource-, and labor-intensive process, as the optimal reaction conditions depend highly on substrate identity and require extensive fine-tuning of synthetic conditions to arrive at the highest-yielding conversions. The multidimensionality of the data makes it suited to a machine learning approach, which could predictively identify optimal reaction conditions given a particular substrate feature set. Herein, we report a platform for standardizing and filtering open source reaction data from ORD (Open Reaction Database) and using this machine-readable dataset to train thirteen machine learning models, including linear, tree-based, kernel method, instance-based, neural network, and ensemble architectures, in the yield prediction and classification of coupling agents in amide coupling reactions, which comprise a significant percentage of reactions performed in a medicinal chemistry setting. While yield prediction remained a difficult task for our models due to the complexity of our reaction data, our models performed with great accuracy when classifying reactions by their ideal coupling agent category, including carbodiimide-based, uronium salt, and phosphonium salt. To further validate this approach, we deployed our classification models on isoxazole coupling reaction data generated in our lab, and they successfully categorized the reactions by coupling agent type. Our results demonstrate that kernel methods and ensemble-based architectures perform significantly better than other models such as linear or single-tree-based ones. Additionally, molecular environment features, captured by XYZ coordinates, three-dimensional features, and Morgan Fingerprints around reactive functional groups, boosted model predictivity more than bulk material properties such as molecular weight, LogP, and SMILES.
RESEARCHERS: Abhinav Chalasani, Mission San Jose High School '26; Aarav Anand, Lynbrook High School '27
ADVISOR: Njoo Lab, Synthesis | Physical Organic Chemistry | Catalysis | Chemical Biology | Spectroscopy | Medicinal Chemistry
KEYWORDS: Amide Coupling Reactions | Reaction Optimization | Coupling Agent Classification | Reaction Yield Prediction | Machine Learning
Department of Computer Science & Engineering
Evaluating the Impact of Playback Speed on Automatic Speech Recognition (ASR) System Transcription Accuracy
In speech-to-text (STT) systems, transcription efficiency and cost are strongly influenced by the duration of the input audio. Increasing playback speed shortens file length, which reduces both processing time and cost. However, increasing playback speed may come at a cost to accuracy. The metric we used to determine the optimal playback speed for STT models was Word Error Rate (WER), a standard Automatic Speech Recognition (ASR) metric that measures the proportion of substitutions, deletions, and insertions relative to the reference transcription (23). We hypothesized that playback speeds up to 1.25 times would maintain a WER below 0.2 (20%), while higher playback speeds would exceed a WER of 0.3 (30%). We played audio files from the LibriSpeech dataset at four speeds (100%, 125%, 140%, and 150%) and tested them on four models: OpenAI Whisper, Microsoft Azure STT, Deepgram, and Google Cloud STT. For each playback speed, we split our data into five Words Per Minute (WPM) bins based on original speaking rate (140-149, 150-159, 160-169, 170-179, and 180-189), allowing us to determine the cause of changes in the WER. Our results show that the WER remains stable up to 1.25 times for Azure and Deepgram, consistently staying below 0.2, while Google and Whisper exceeded this threshold at the same speed. At higher playback speeds, all four systems showed a significant degradation in accuracy. These findings suggest that while 1.25 times playback offers a cost-efficient compromise for some systems, the “optimal” threshold varies depending on the model.
RESEARCHERS: Snata Mohanty, Dougherty Valley High School '26
ADVISOR: Liu Lab, Software Engineering
KEYWORDS: Automatic Speech Recognition (ASR) | Word Error Rate (WER) | Playback Speed | Speech-to-Text (STT) | Transcription Accuracy | Model Robustness
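The Word Error Rate described in the abstract above is commonly computed as WER = (S + D + I) / N, where S, D, and I are the word-level substitutions, deletions, and insertions needed to turn the reference into the hypothesis, and N is the number of reference words. A minimal sketch follows, assuming lower-casing and whitespace tokenization as the only text normalization. The function name `word_error_rate` is hypothetical, and the study's actual normalization and tooling are not stated in the abstract.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between the transcripts, divided by
    the number of words in the reference transcript."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


# One of six reference words is missing from the hypothesis.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.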
Department of Computer Science & Engineering
DeepBERTa: A DeepSMILES Driven BERT Model for Molecular Property Prediction
Molecular machine learning is a field where computer science techniques are applied to solve chemical problems, such as predicting molecular properties or accelerating drug discovery. In recent years, deep learning models, including transformers, graph neural networks (GNNs), and recurrent neural networks (RNNs), have shown strong performance in many chemical applications. These models typically rely on large molecular datasets (hundreds of millions to billions of compounds) and require a suitable molecular representation to process the input effectively. One of the most widely used representations is SMILES, a linear string notation for molecules. However, a newer variant called DeepSMILES simplifies the syntax and has been shown in recent studies to improve performance in some tasks. Despite the field's shift toward large-scale deep learning, little work has explored training transformer models directly on DeepSMILES. In this project, we introduce DeepBERTa, a DeepSMILES-based transformer built on the established ChemBERTa architecture, and attempt to train it on millions of molecules to evaluate whether DeepSMILES can outperform SMILES in certain tasks and paradigms. Initial results suggest that DeepBERTa is comparable to ChemBERTa in a Blood-Brain Barrier Penetration classification task.
RESEARCHERS: Aayush Kothari, Mission San Jose High School '27; Sourish Rikkala, Fair Lawn High School '27
ADVISOR: Akl Lab, Machine Learning for Condensed Matter Physics
KEYWORDS: Molecular Machine Learning | Drug Discovery | DeepSMILES | DeepBERTa | Natural Language Processing
Department of Chemistry, Biochemistry & Physics
Scalable formal synthesis of (R)-(+)-etomoxir without pyrophoric reagents enabled by benchtop NMR
Etomoxir is a covalent inhibitor of CPT1, a transmembrane mitochondrial protein that acts as the rate-limiting enzyme for fatty acid oxidation. This enzyme plays a major role in metabolic diseases such as diabetes, where regulation of fatty acid biosynthesis and β-oxidation kinetics through CPT1 is an effective treatment. The 4-Cl phenolic ether on (R)-(+)-etomoxir is a key SAR hotspot for enabling isoform-selective inhibition of CPT1. Previously reported syntheses either require early installation of a 4-Cl phenolic ether, which precludes the potential for late-stage aryl substitution, or employ large-scale pyrophoric reactions in early synthetic operations, which are challenging to scale. We demonstrate the scalability of a new synthetic route to intercept a late-stage allylic alcohol en route to (R)-(+)-etomoxir. Notably, our alternate retrosynthetic disconnection, which proceeds through a catalytic aerobic oxidation and a one-flask tandem aldol condensation-reduction sequence to install a key allylic methylene, avoids pyrophoric materials such as n-butyllithium. With a scalable synthesis of a key diversifiable intermediate in hand, our laboratory is currently preparing a library of diverse (R)-(+)-etomoxir analogs to more fully interrogate the SAR of the aryl ring in CPT1 inhibitory activity.
RESEARCHERS: Jacqueline Shan, The King's Academy '26; Sophia Bagley, The Harker School '26
ADVISOR: Njoo Lab, Synthesis | Physical Organic Chemistry | Catalysis | Chemical Biology | Spectroscopy | Medicinal Chemistry
KEYWORDS: Formal Synthesis | Catalysis | Spectroscopy
Department of Biological, Human & Life Sciences
Comparative genomics to discover novel relationships between sharks and humans in the context of thymus development
Genomic studies have shown that sharks have mechanisms for greater cancer resistance than humans do. For example, sharks have multiple overexpressed genes that are potential tumor suppressors, which downregulate cell proliferation in humans, thereby inhibiting oncogenesis. In this study, we compare the genomes, transcriptomes, and proteomes of commonly researched sharks to normal and cancerous human equivalents to determine whether there are any potential homologies between the genomes, which could indicate possible novel relationships. Sharks are ideal organisms for comparative genomics analysis because of their increased cancer resistance, which arises from factors such as overexpression of tumor suppressor genes and mechanisms for greater genomic stability (which prevents cancer-causing mutations). So far, we have used Ensembl to align tumor suppressor genes of different shark species to pinpoint a smaller region of interest for further analysis. With more specific DNA segments, we plan to use ortholog mapping, which traces genes from different species to a single gene from the most recent common ancestor, and protein-protein interaction networks, which reveal how genes are involved in the progression of disease. Additionally, we plan to use meta-analysis of RNA sequencing data across both the shark and human transcriptomes. We hope to discover novel gene signatures in sharks and humans in colorectal cancer.
RESEARCHERS: Gautam Sharma, Mission San Jose High School '28; Jaanvi Dronamraju, Newark Memorial High School '27
ADVISOR: Cunha Lab, Bioinformatics and Cancer Biology
KEYWORDS: Cancer | DNA | Novel relationship | Genomes
Department of Computer Science & Engineering
ASDRP iOS App Development: Status & Roadmap
ASDRP faces challenges in managing its large-scale research program, including scattered information, an overload of calendar invites, and reliance on a costly attendance system. These disconnected tools lead to inconsistent communication and increased manual effort. To address this, we are developing the ASDRP Mobile App as a centralized platform to streamline program management. Tailored content ensures that students and advisors only see information relevant to their department and lab group, reducing noise from excess emails and invites. Key features include a student networking system for showcasing research, a location-based sign-in system to automate campus hour tracking, and an AI chatbot for program inquiries. Developed in Swift with Firebase as the backend, the app currently supports iOS with plans to expand to Android. By unifying ASDRP’s communication, scheduling, and attendance tracking, the app reduces administrative burden, enhances student engagement, and strengthens collaboration across the program.
RESEARCHERS: Grant Hur, The King's Academy PSP '26
ADVISOR: Liu Lab, Software Engineering
KEYWORDS: Mobile App Development | Swift (iOS) | Firebase Backend | Location-Based Sign-In | AI Chatbot
Department of Computer Science & Engineering
Evaluating the capabilities of Large Language Models to give Food Recommendations
Large Language Models (LLMs) are an emerging technology capable of recognition, summarization, translation, prediction, and content generation using extensive datasets. Several studies have explored different applications of LLMs such as translation, education, and healthcare. The purpose of this study is to explore a new area: personalized food and restaurant recommendations. We created a framework and analyzed the potential of three of the most advanced LLMs to provide reliable, detailed, and creative recommendations for food-related queries. Our findings show that LLMs are capable of outperforming traditional recommendation systems, with GPT earning the highest overall score, followed by Gemini and DeepSeek (weakest performance). However, these LLMs still have limitations such as inconsistent location accuracy, vague handling of affordability, and impractical suggestions for convenience. Our results highlight both the strengths and weaknesses of LLMs as restaurant recommendation systems.
RESEARCHERS: Myra Malhotra, Saint Francis High School '26; Vihaan Mittal, American High School '27; Saidhanush Gambhirrao, California High School '28; Soham Jani, Foothill High School '26
ADVISOR: Qin Lab, AI & Machine Learning
KEYWORDS: Large Language Models | Artificial Intelligence | Engineering
Department of Chemistry, Biochemistry & Physics
Anticancer Synthetic Arylsulfonamides with Wnt1-Modulating Activity
The Wnt1/β-catenin signaling pathway plays a vital role in embryonic development, organogenesis, tissue homeostasis, and cell survival by carefully regulating dynamic homeostasis of the multifunctional protein β-catenin. Disruption of this regulation as a consequence of Axin and APC mutations can lead to abnormal β-catenin accumulation, a known driving factor in the development and progression of several human cancers. Previous studies have identified methyl 3-{[(4-methylphenyl)sulfonyl]amino}benzoate (MSAB) as a selective inhibitor of the Wnt1/β-catenin signaling pathway. To explore the structure-activity relationship of modifications to the aniline and sulfonyl phenyl moieties of MSAB and their effects in vitro, we prepared a library of analogs with variously substituted phenyl, alkyl, heterocyclic, and saturated ring systems. Through MTT assays, we observed that methyl ester analogs showed significantly more activity than their ethyl ester counterparts, and that both 4-substituted esters exhibited significantly attenuated antiproliferative activity. Further, in a TCF/LEF-activated luciferase reporter cell assay, the 4-substituted methyl ester analogous to MSAB exhibited slightly reduced Wnt1-inhibitory activity, while 3- and 4-substituted ethyl esters exhibited minimal Wnt1-inhibitory activity. Additionally, we observed that analogs with para-substitution of the sulfonyl phenyl moiety exhibited more dose-dependent inhibition of the Wnt1 pathway than their meta-substituted counterparts. This difference in potency might be attributed to several factors that ultimately drive antiproliferative activity, prompting further investigation of these compounds as Wnt1-based antiproliferative agents.
RESEARCHERS: Lavernie Chen, Santa Clara High School '28; Allyson Yu, BASIS '27
ADVISOR: Njoo Lab, Synthesis | Physical Organic Chemistry | Catalysis | Chemical Biology | Spectroscopy | Medicinal Chemistry
KEYWORDS: Organic Synthesis | Arylsulfonamides | Wnt-1/β-catenin | Structure Activity Relationship | Medicinal Chemistry
Department of Biological, Human & Life Sciences
Bispecific antibody for AML therapy
Acute Myeloid Leukemia (AML) is an aggressive hematologic malignancy characterized by the expansion of abnormal myeloid progenitors in the bone marrow and bloodstream. Current therapies—including chemotherapy, FLT3 and IDH inhibitors, and antibody-drug conjugates—are often limited by relapse, toxicity, and poor long-term survival. Immunotherapies such as CAR-T cells show promise but face significant safety and scalability challenges. To address the urgent need for targeted strategies, we engineered bi- and trispecific antibodies designed to improve selectivity for AML cells while enhancing T cell–mediated cytotoxicity.
Our bispecific constructs pair an antigen-binding domain recognizing AML-associated markers (CLL-1 or TIM-3) with a CD3-binding arm that recruits T cells. In vitro assays using HL-60 and THP-1 leukemia cell lines confirmed strong binding affinity and demonstrated potent cytotoxicity against CLL-1⁺ and TIM-3⁺ populations, while sparing normal hematopoietic progenitors. Building on these results, we developed a trispecific antibody incorporating CLL-1, TIM-3, and CD3 recognition. This design exploits co-expression of CLL-1 and TIM-3 on leukemic stem cells, achieving higher potency against double-positive targets while reducing off-tumor effects. Preliminary data confirm that trispecific constructs enhance tumor cell killing, mitigate immune escape, and exhibit reduced toxicity compared to bispecific formats.
Future directions include in vivo efficacy studies and extension of this platform to other tumors such as ovarian cancer. Collectively, our findings highlight trispecific antibodies as a promising next-generation immunotherapy approach for AML, capable of integrating potency, selectivity, and safety into a single molecular design.
RESEARCHERS: TBH
ADVISOR: Wang Lab, Molecular & Cell Biology
KEYWORDS: AML immunotherapy | Acute Myeloid Leukemia treatment | bispecific antibody AML | novel antibody therapy for AML | next‑generation cancer immunotherapy | targeted therapy for leukemia | CD3 T cell engager
Department of Computer Science & Engineering
Influence of Chemical Etching on Twin Boundary Dihedral Angle Measurements
Knowledge of interfacial free energies, or ratios of energies, of metals and alloys is among the most sought-after information in computational materials science and practical metallurgical applications. We propose the use of an atomic force microscope (AFM) as a tool to evaluate the ratio of the twin boundary free energy to the surface free energy in copper. 3D-printed atomic-scale models of twin boundaries were constructed. Heat treatment of "as received" copper samples was performed at 900 °C and 800 °C for 1 hour to grow the copper's grains until they were suitable for observation. Metallurgically polished and etched samples were prepared in the ASDRP lab for optical microscopy, electron microscopy, and AFM evaluations. We will discuss our results and future plans during the presentation.
RESEARCHERS: Larry Xie, Milpitas High School '27; Saahithi Srikanth, Monta Vista High School '27; Kimberly Yashar, The Harker School '26; Gabriela Formanek, Notre Dame High School '26; Seoyeon Kim, Valley Christian High School '26
ADVISOR: Starostina Lab, Materials Science
KEYWORDS: Grains | Twin Boundaries | Twinning Planes | Interfacial Free Energy | Microscopy | Copper | FCC Metals
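For readers unfamiliar with how a dihedral angle relates to an energy ratio, the sketch below uses the textbook groove force balance, boundary energy = 2 × surface energy × cos(ψ/2), where ψ is the dihedral angle at the groove root. This assumes isotropic surface energy and no torque terms, which may differ from the analysis the researchers apply to etched twin boundaries. The function name `boundary_to_surface_energy_ratio` and the sample angle are hypothetical.

```python
import math


def boundary_to_surface_energy_ratio(dihedral_angle_deg):
    """Ratio of boundary energy to surface energy from a groove dihedral
    angle, using the isotropic balance ratio = 2 * cos(psi / 2)."""
    if not 0.0 < dihedral_angle_deg <= 180.0:
        raise ValueError("dihedral angle must be in the range (0, 180] degrees")
    half_angle_rad = math.radians(dihedral_angle_deg) / 2.0
    return 2.0 * math.cos(half_angle_rad)


# Hypothetical groove angle, chosen only to illustrate the calculation.
print(round(boundary_to_surface_energy_ratio(160.0), 3))
```

A flatter groove (ψ closer to 180°) corresponds to a lower boundary energy relative to the surface energy, so small errors in the measured angle matter most for low-energy boundaries such as twins.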
Department of Computer Science & Engineering
Toward Determination of Tabor Factor as a Function of Grain Size
Determining the Tabor factor in relation to microstructure and composition could pave the way for an inexpensive, non-destructive method for predicting the tensile properties of bulk materials using localized hardness measurements. This advancement is especially valuable for improving current preventative maintenance procedures and facilitating the upscaling of research and development in industrial settings. To start our research, we acquired CAD models and followed the ASTM E8 standard for machine tensile testing (TT) of our copper samples. TT was performed at Santa Clara University, and five stress-strain (SS) diagrams were created and analyzed to determine their flow stress. Additionally, grain size was measured on both sides of the sample to be correlated with the flow stress. The microstructures and SS data will be shared and discussed in the context of our literature search.
RESEARCHERS: Averi Mukhopadhyay, American High School '27; Ambar Vig, Los Altos High School '26; Nicholas Wong, Dougherty Valley High School '26; Anay Tailor, Dougherty Valley High School '27; Michael Tzeng, Mission San Jose High School '27; Tyler Buenaventura, Dougherty Valley High School '27
ADVISOR: Starostina Lab, Materials Science
KEYWORDS: Tabor Factor | Microstructure | Tensile Properties | Predictive Maintenance
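The Tabor factor referred to above is the proportionality constant C in the empirical relation H ≈ C × σ between indentation hardness H and flow stress σ, with C often quoted as roughly 3 for metals. A minimal sketch follows, assuming Vickers hardness reported in kgf/mm² and flow stress in MPa. The function names and the sample numbers are hypothetical and are not the study's measurements.

```python
KGF_PER_MM2_TO_MPA = 9.80665  # 1 kgf/mm^2 expressed in MPa


def tabor_factor(vickers_hardness_hv, flow_stress_mpa):
    """Tabor factor C = H / sigma, with hardness converted from HV to MPa."""
    if flow_stress_mpa <= 0:
        raise ValueError("flow stress must be positive")
    hardness_mpa = vickers_hardness_hv * KGF_PER_MM2_TO_MPA
    return hardness_mpa / flow_stress_mpa


def estimate_flow_stress_mpa(vickers_hardness_hv, factor=3.0):
    """Flow stress in MPa predicted from hardness for an assumed Tabor factor."""
    return vickers_hardness_hv * KGF_PER_MM2_TO_MPA / factor


# Hypothetical values for illustration only.
print(round(tabor_factor(80.0, 250.0), 2))
print(round(estimate_flow_stress_mpa(80.0), 1))
```

Computing C per sample and plotting it against measured grain size is one simple way to look for the dependence the title describes.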
Department of Computer Science & Engineering
Google Space Accountability Bot
Every semester, our lab faced the recurring challenge of holding students accountable for completing bi-weekly updates in a shared Google Sheet. While the process appeared straightforward, it often resulted in late-night manual reminders, repeated checks of the sheet, and unnecessary back-and-forth messages in Google Space. This manual oversight was inefficient. To address this, we developed an accountability chatbot integrated directly into Google Space, where students were already required to be active. The bot automates reminders, tracks completion, and enforces accountability through a transparent strike system. By embedding the chatbot into the existing communication platform, the system introduces no additional learning curve while ensuring consistent, automated accountability.
RESEARCHERS: Ayush Kansal, Irvington High School '26; Tithi Raval, Irvington High School '26
ADVISOR: Liu Lab, Software Engineering
KEYWORDS: Apps Script | Automation | Chatbot | Google Cloud | Python