Hello there! Thanks for stopping by.

I'm Hongjin (鸿瑾), a first-year Ph.D. student in Computer Science at Harvard University. I am fortunate to be advised by Krzysztof Gajos. I was previously a practitioner of AI for Social Good, but have recently transitioned to Human-Computer Interaction (HCI). I am currently exploring how AI for Social Good practitioners perceive model explainability.

I was born and raised in Guangzhou, China, and moved to the US for my undergraduate degree in Mathematics and Computer Science at Occidental College. I received my master's degree in data science from the London School of Economics and Political Science. Before Harvard, I worked as a research fellow at Stanford Law School, where I developed and evaluated machine learning systems to help reduce water pollution, in partnership with the EPA. Most recently, I completed a 200-hour yoga training program in Indonesia and am super excited to teach yoga informally.

I am deeply passionate about social impact. Outside of research, I have completed projects with nonprofits in China, the US, the UK, and Malawi, and worked as a Data for Development intern at the UNDP. Ask me about that!

When I am away from my computer, I love to spend time in nature and with my communities, dance, run, do yoga, and play the ukulele.

Always happy to chat about research, grad school, passion projects, life journeys, and anything else! Feel free to drop me an email. 


Recent Research Projects

Joint work with Esther Rolf, Sanket Shah, Benjamin Rice, James Hazen, Christopher Golden, and Milind Tambe

Detecting Micronutrient Deficiency via Satellite Images and Intervention Planning in Madagascar

Micronutrient Deficiency (MND) is a form of malnutrition that can lead to serious health consequences. MND is particularly prevalent and remains a critical public health issue in developing countries like Madagascar, driven by poor diet and infectious disease, among other factors. Non-profit organizations provide the frontline defense against this global challenge with essential resources like nutrient supplements, school meals, agricultural interventions, and educational campaigns. In Madagascar, Catholic Relief Services (CRS) seeks to improve the health and well-being of people in communities across the country, including efforts to reduce the prevalence of MND. To design effective interventions within budget constraints, we must first identify where high rates of MND are most likely to occur. Traditionally, MND detection relies on expensive, invasive, and often inaccessible blood tests and surveys. Previous work has shown that satellite-based predictions can provide relatively accurate estimates of MND rates across the populations studied. In this work, we aim to go beyond measuring prediction accuracy and assess the degree to which satellite-based predictions of MND can aid CRS in intervention decision-making, using a technique called decision-focused learning.
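
To make decision-focused learning concrete, here is a minimal sketch on synthetic data, not the actual pipeline we use with CRS: instead of training the satellite-based predictor to minimize prediction error, the predictor is trained so that the budget-limited intervention decision it induces scores well against the true MND rates. The soft top-k relaxation, data, and dimensions below are all illustrative assumptions.

```python
# A minimal, illustrative sketch of decision-focused learning (DFL) on synthetic data.
# Variable names, dimensions, and the soft top-k relaxation are assumptions for
# illustration, not the authors' actual pipeline.
import torch

torch.manual_seed(0)

n_regions, n_features, budget = 100, 8, 10

# Synthetic "satellite features" and true MND rates (stand-ins for real data).
X = torch.randn(n_regions, n_features)
true_w = torch.randn(n_features)
y_true = torch.sigmoid(X @ true_w)          # true MND rate per region

model = torch.nn.Linear(n_features, 1)
opt = torch.optim.Adam(model.parameters(), lr=0.01)

def soft_topk_utility(pred_rates, true_rates, k, temp=0.1):
    """Differentiable proxy for 'intervene in the k regions with highest predicted MND'.
    A softmax over predictions approximates the hard top-k selection."""
    weights = torch.softmax(pred_rates / temp, dim=0) * k
    return (weights.clamp(max=1.0) * true_rates).sum()

for step in range(500):
    pred = model(X).squeeze(-1)
    # DFL objective: maximize the quality of the *decision* induced by the predictions,
    # rather than minimizing prediction error (e.g., MSE against y_true).
    loss = -soft_topk_utility(pred, y_true, budget)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Evaluate the hard decision implied by the trained predictor.
topk = torch.topk(model(X).squeeze(-1), budget).indices
print("True MND covered by chosen regions:", y_true[topk].sum().item())
```

The important design choice is that the training loss is the (negated) utility of the downstream decision, so gradients flow through the selection step rather than through a pointwise accuracy metric.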

Joint work with Matthew Nazari, Derek Zheng, and Davies Lab

Classifying tree species from 3D LiDAR images with a vision transformer

Reliable large-scale data on the state of forests is crucial for monitoring ecosystem health, carbon stock, and the impact of climate change. Current knowledge of tree species distribution relies heavily on manual data collection, which often takes years to complete, resulting in limited datasets that cover only a small subset of the world’s forests. Recent work shows that state-of-the-art deep learning models using Light Detection and Ranging (LiDAR) images enable accurate and scalable classification of tree species in various ecosystems. While LiDAR images contain rich 3-Dimensional (3D) information, most previous works flatten the 3D images into 2D projections in order to use Convolutional Neural Networks (CNNs). This project offers three significant contributions: 1) we apply a deep learning framework for tree species classification in tropical savannas; 2) we use Airborne LiDAR images, which have lower resolution but greater scalability than the Terrestrial LiDAR images used in most previous works; 3) we introduce the approach of directly feeding 3D point cloud images into a vision transformer model (PCTreeS). Our results show that the PCTreeS approach outperforms current CNN baselines with 2D projections in AUC (0.81), overall accuracy (0.72), and training time (~45 minutes). This project also motivates further LiDAR image collection and validation for accurate, large-scale, automatic classification of tree species.
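
As a rough illustration of contribution 3), here is a minimal sketch of feeding a 3D point cloud directly into a transformer encoder, with each LiDAR point embedded as one token. The layer sizes, class token, and per-point embedding are assumptions of mine, not the actual PCTreeS architecture.

```python
# A minimal sketch of classifying tree species directly from 3D point clouds with a
# transformer encoder. Architecture details (embedding size, number of layers, the use
# of a class token) are illustrative assumptions, not the actual PCTreeS model.
import torch
import torch.nn as nn

class PointCloudTreeClassifier(nn.Module):
    def __init__(self, n_species=10, d_model=64, n_layers=4, n_heads=4):
        super().__init__()
        self.point_embed = nn.Sequential(          # per-point (x, y, z) -> token
            nn.Linear(3, d_model), nn.ReLU(), nn.Linear(d_model, d_model)
        )
        self.cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_species)

    def forward(self, points):                     # points: (batch, n_points, 3)
        tokens = self.point_embed(points)
        cls = self.cls_token.expand(points.size(0), -1, -1)
        out = self.encoder(torch.cat([cls, tokens], dim=1))
        return self.head(out[:, 0])                # classify from the class token

# Example: a batch of 8 LiDAR point clouds, 1024 points each.
model = PointCloudTreeClassifier()
logits = model(torch.randn(8, 1024, 3))
print(logits.shape)  # torch.Size([8, 10])
```

Because each point becomes a token, the model consumes the raw 3D coordinates and never requires flattening the cloud into a 2D projection.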

Joint work with Pradeep Varakantham, Panayiotis Danassis, Abheek Ghosh, and Milind Tambe

Improving the mean-field method for solving Restless Multi-Armed Bandit problems

The Whittle index policy is a popular solution to Restless Multi-Armed Bandit (RMAB) problems due to its simplicity and near-optimality under certain conditions. However, Whittle-index-based policies require many assumptions that are hard to verify, including homogeneity, indexability, irreducibility, infinite-horizon average reward, and global attractor properties. Even with these assumptions met, the Whittle index policy can still yield sub-optimal performance. A few recent works have proposed other ways to solve RMAB problems without the Whittle index policy. Ghosh et al. (2023) provided an improved, near-optimal algorithm that relies on mean-field limits, capitalizing on the "law of large numbers" to approximate the stochastic process of clusters of arms. While the authors showed that the mean-field method outperforms Whittle index policies in several public health domains, it takes much longer to run than Whittle, mainly because it must solve an LP at each time step. This project seeks to answer two questions: 1) how can we improve the scalability of the mean-field method, especially in the infinite-horizon setting? and 2) how can we improve the solution, e.g., by establishing the mean-field method in the infinite-horizon setting (so the LP need not be solved at each time step) or by combining the mean-field and Whittle index methods?
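
To illustrate where the per-step cost of the mean-field method comes from, here is a toy sketch of the kind of LP it solves over the fraction of arms in each state under a budget constraint. This single-step version omits the multi-step lookahead over transition dynamics that the actual method uses, and the states, rewards, and budget fraction are made up for illustration.

```python
# A minimal sketch of a per-step LP over the *fraction* of arms in each state, rather
# than over individual arms. Transition dynamics and lookahead are omitted; rewards,
# states, and the budget fraction are toy values of my own choosing.
import numpy as np
from scipy.optimize import linprog

n_states, alpha = 3, 0.2                    # states per arm, budget fraction

rng = np.random.default_rng(0)
reward = rng.random((n_states, 2))          # reward[s, a]
mu = np.full(n_states, 1.0 / n_states)      # current occupancy measure over states

# Decision variables y[s, a] = fraction of arms in state s taking action a,
# flattened as [y[0,0], y[0,1], y[1,0], y[1,1], ...].
c = -reward.flatten()                       # linprog minimizes, so negate rewards

# Occupancy constraints: y[s,0] + y[s,1] = mu[s] for every state s.
A_eq = np.zeros((n_states, 2 * n_states))
for s in range(n_states):
    A_eq[s, 2 * s] = A_eq[s, 2 * s + 1] = 1.0
b_eq = mu

# Budget constraint: the total fraction of arms acted on is at most alpha.
A_ub = np.zeros((1, 2 * n_states))
A_ub[0, 1::2] = 1.0
b_ub = [alpha]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * (2 * n_states), method="highs")
y = res.x.reshape(n_states, 2)
print("Fraction of arms to act on per state:", y[:, 1])
```

In this simplified form, the LP size depends only on the number of states and actions, not the number of arms, but solving a fresh LP at every time step is exactly the overhead the first research question targets.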

Publications