About Me


My Favourite Quote: 

"Look again at that dot. That's here. That's home. That's us. On it everyone you love, everyone you know, everyone you ever heard of, every human being who ever was, lived out their lives. The aggregate of our joy and suffering, on a mote of dust suspended in a sunbeam. "

― Carl Sagan, Pale Blue Dot

TECHNICAL EXPERIENCE AND PROJECTS


Graduate Student Researcher, UCSB                                                                                                  


High-Impact Healthcare Projects


Project: Early detection and management of COVID-19 in dialysis facilities (NIH Grant Funded)

·  Built data pipelines from raw patient-level clinical data for early detection and online diagnostics of COVID-19 in real time via an ensemble of machine learning and deep learning methods (e.g., SVM, XGBoost, random forest, semi-supervised clustering, RNN, state space models).

·  Proposed a penalization scheme to improve computational efficiency and investigated its theoretical properties.

·  Extended existing likelihood computation algorithms to accommodate missing values in clinical records.

 


Project: Joint modeling of longitudinal processes and mortality for dialysis

·  Cleaned large-scale health data with SQL and created a prediction framework in C++ for time-to-event data.

·  Optimized computational cost and speed in R for automated EHR processing and launched parallel computing jobs on UNIX clusters to maintain timelines.

·  Presented at the Census Bureau workshop and three academic meetings; manuscript under journal review.

 


Project: Circular functional classification of OCT data for retinal structural phenotypes

·  Constructed data pipelines for unsupervised classification and visualization with data fusion for circular functional data from optical coherence tomography to identify retinal structural phenotypes.

 


Project: Memory loss in Alzheimer’s disease prevention via statistical learning

·  Formulated multivariate analytical tools to identify risk factors and predict onset; clustered biomarkers via supervised and unsupervised learning. Research was presented and accepted for peer-reviewed publication.


Selected Interdisciplinary Studies


Project: Impact of COVID-19 on housing security for young adults in the U.S.                   

·  Developed prototype applications to recommend emergency relief strategy decisions at different stages of the pandemic via online prediction of housing crisis risks using state space models on market data.

·  Awarded for research impact at the Joint Statistical Meetings; paper under review for publication.

 

Project: Multivariate statistical learning of U.S. household income and expenditure      

·  Interpreted large national economic surveys with unsupervised machine learning (sparse PCA, cluster analysis, canonical correlation analysis); research was presented externally and is undergoing journal review.

 

Project: Research performance of the new energy vehicle industry post-COVID-19

·  Collected and processed individual corporate data using SQL; evaluated multivariate longitudinal variables with mixed-effects state space models. Study was presented and is currently under review.

 

Project: Impact of sea surface temperature on fishing effort in the U.S. Exclusive Economic Zone

·  Established dynamic spatiotemporal modeling for satellite imaging data and improved Bayesian procedures for efficient computation; research resulted in a contributed talk and a collaborative journal publication.


Statistics Consultant, UW Madison (Sponsored by NSF)                                                                                    

·  Analyzed natural-language text with Latent Dirichlet Allocation in R; consulted regularly for a non-technical audience on emerging business challenges and data needs.


INDUSTRY EXPERIENCE


New Market Data Science Intern, Dexcom, Inc.                                                                                                   

·  Implemented mixed-effects dynamic methods for in vivo analysis, detecting medication interference on glucose monitor accuracy from electronic health records to supplement in vitro bench-testing experiments.

·  Discovered dynamic decision thresholds for endpoints in preclinical studies based on medical literature.

·  Designed clinical trials to evaluate sensitivity of specialized medicine and to assess robustness of results.

·  Interpreted correlations and predictions from analysis of biomarkers and clinical signals from multiple sources for collaborative projects, internal inter-disciplinary meetings and external reviews.

·  Provided independent statistical support; prepared reports and presentations for upcoming regulatory submissions to support corporate expansion into new hospital markets for diabetes treatment.

·  Proactively addressed unbalanced measurements in modeling and prepared a statistical analysis plan.


Statistics Researcher, Fresenius Medical Care North America                                                                            

·  Processed, visualized and analyzed large multimodal clinical data, reviewed data integrity, and prepared NIH public data submissions and reports in accordance with committee regulations and company standards.

·  Engineered translational study design for changepoint detection using wearable health device data and clinical studies; formulated data-driven solutions to inform standard clinical practice.

·  Devised meta-analysis of COVID vaccination efficacy among dialysis patients via cross-functional cooperation with algorithm engineers and the medical faculty.


COMPUTING SKILLS

 

Programming Languages: R, SQL, SAS, Python, C++, NONMEM.

Software & Platforms: Microsoft Suite, G Suite, Git, LaTeX, UNIX cluster computing.