Zhe Zhang
Assistant Professor of Business Analytics and Technology
Rady School of Management
University of California San Diego (UCSD)
Current Research
I am an assistant professor at the Rady School of Management at the University of California San Diego (UCSD), in the Innovation, Technology, and Operations (ITO) group.
I completed my Ph.D. at Carnegie Mellon University's Heinz College School of Information Systems and Management in Pittsburgh, PA, USA, and my undergraduate work in Economics and Statistics at Stanford University.
My research agenda focuses on the spillover and broader societal impacts of information technology.
This is represented by two areas of work: one on fairness and implications of data-driven and algorithmic decision making, and one on the societal and spillover effects of digital transformation.
Spillover and Broader Effects of Digital Transformation
"Business Models in the Sharing Economy: Manufacturing durable goods in the presence of Peer-to-Peer rental markets" (joint work with Vibhanshu Abhishek and Jose Guajardo)
Information Systems Research (2021):
https://pubsonline.informs.org/doi/10.1287/isre.2021.1034
Finalist for Best Student Paper in Supply Chain Management (POMS 2017)
Consider a manufacturer of durable goods, such as Toyota for cars. The growth of the sharing economy and digital innovation has allowed consumers who purchased from Toyota to engage in peer-to-peer (P2P) sharing of those cars, but has also allowed potential consumers to use peer-rented cars instead of purchasing. In this paper, we analyze the strategic implications of P2P rental markets for such manufacturers. We find a surprising analytical insight. When consumers are heterogeneous in their usage frequencies, they differ in their willingness-to-pay for the good. However, P2P sharing has an equalizing effect on this willingness-to-pay. Thus, P2P sharing can increase the monopoly-pricing capability of a manufacturer. We explore the implications of this finding and suggest business models under different settings.
"Cashierless Retail Store Operations and Its Impact on Consumer Consumption Demand"
Under revision, joint work with Hyoduk Shin and Derek Holl.
Cashierless retail convenience stores are proliferating, particularly in high-traffic locations such as airports and sporting events. A notable implementation of this technology is Amazon's Just Walk Out (JWO) system. This study investigates the impact of JWO technology on consumer demand. The anticipation of no checkout queues, facilitated by this technology, has the potential to alter consumer purchasing patterns. Utilizing a novel dataset that captures consumer behavior before and after the introduction of JWO technology at a major U.S. university, this paper examines its effects on students' consumption patterns. The study is conducted across five convenience stores on campus, only three of which implemented the JWO technology. The findings indicate a significant increase in peak throughput for the convenience stores equipped with JWO. Additionally, the timing of consumer demand shifts noticeably, with students frequenting the stores more often before and in between classes rather than after class. The introduction of cashierless technology also influences what items students choose to buy, with increased purchase proportions of energy bars, protein bars, and candy, items suitable for quick consumption before or between classes. There is no consistent decrease in any particular product category within students' purchase baskets. These results suggest that managers adopting cashierless technology should consider not only its operational benefits for retail staffing but also its potential to influence consumer behavior and demand patterns.
"The Spillover Effects of Amazon Prime on Online Retail Spending"
Under revision, joint work with Xiaofeng Liu and Kevin Zhu.
In this project, we are interested in what happens to a consumer's consumption patterns after they adopt Amazon Prime. Using a novel dataset on consumer debit and credit card transactions over a several-year period, we are able to identify new Prime adopters, as well as how their consumption changes in other, non-Amazon places after Prime adoption. This includes other online retailers, small versus large online retailers, and offline retail consumption. In this paper, we also address potential self-selection concerns with the timing of Prime adoption. With the FTC's 2023 lawsuit against Amazon and Prime specifically, this paper's novel empirical findings offer meaningful policy and managerial insights.
"Ridehailing's Effects on Offline Consumption"
Under revision, joint work with Beibei Li.
Uber and Lyft expanded across many US metropolitan areas between 2013 and 2015. We study the entry of Uber and Lyft into several of these metropolitan areas. Uber and Lyft have affected several parts of the economy, including transportation patterns, tourism, and gig work. Using novel individual-level data, we provide new evidence for how ridehailing services may have reshaped offline consumption among locals. We study the early adopters of ridehailing services, and show not only how adoption increases their local consumption, but also whether this effect differs across categories, across space, and along consumer demographics.
Fairness and Managerial Implications of Data-Driven Decision-Making and Machine Learning
"Imputation Strategies Under Clinical Presence: Impact on Algorithmic Fairness"
Under revision, joint work with Vincent Jeanselme, Maria DeArteaga, Brian D.M. Tom, and Jessica Barrett.
Initial version published in the Machine Learning for Health (ML4H) Proceedings at NeurIPS 2023.
Machine learning risks reinforcing biases present in data, and, as we argue in this work, in what is absent from data. In healthcare, biases have marked medical history, leading to unequal care affecting marginalised groups. Patterns in missing data often reflect these group discrepancies, but the algorithmic fairness implications of group-specific missingness are not well understood. In this paper, we contribute the first, to the best of our knowledge, thorough investigation of the relationship between the choice of imputation method and algorithmic fairness. The analysis consists of multiple parts: a review of real historical cases of missing data in healthcare settings, structured into three scenarios; a theoretical analysis with novel insights on the role of imputation; and both simulation and real-world evidence for how imputation can significantly affect group fairness outcomes. An important finding is that in many meaningful cases and settings, group-specific imputation, while commonly recommended, can actually be harmful for group-fairness concerns.
"Strategic Contestants, Over-Fitting, and the Role of Data Splitting in Data Science Contests"
Under revision, joint work with Ping-Chieh Huang and Sanjiv Erat.
Invited talk presented at Theory in Economics and Information Systems (TEIS) 2024.
Companies organize data science contests to source innovative machine learning solutions for business operations. Using contests hosted on Kaggle as a motivating example, we formulate a model of data science contests where the organizer splits the data into a training part that is provided to contestants to build candidate models, and a testing part that is reserved privately to evaluate the submitted models. Previous literature in statistics suggests using intermediate data splits to strike a balance between having enough training data and being able to reliably rank models. However, by considering how participants choose their modeling approaches and the incentives they face in data science contests, we identify a new third effect of the split ratio. When intermediate splits are used — where the firm's evaluation of model performance is most reliable — contestants actually have the strongest incentive to submit "sub-optimal" models with higher variability in order to gain an advantage in the contest. An empirical examination of the past five years of Kaggle contests shows results that are consistent with our theoretical findings. Thus, an immediate implication of our model is that such intermediate split ratios advocated in prior statistics literature might result in the firm accurately choosing from a set of inaccurate submitted models.
"Identifying Significant Predictive Bias in Classifiers"
Joint work with Daniel Neill.
Published at the FAccT conference, 2017 (formerly known as the FAT/ML conference).
This work is motivated by the increasing use of data-driven classifiers and risk assessment models for decision-making in various public and private sectors. Beyond assessing the overall performance of these models, it is important to identify whether there are subpopulations or subgroups for which such models over- or under-estimate the probability. With exponentially many such groups, though, this can be a difficult problem. Our method enables efficient, best-performing identification of such subgroups, providing a way both to audit the use of such models and to improve them.
Background / Contact
Ph.D. Carnegie Mellon University, Heinz College. Information Systems and Management.
Jointly advised by Professors Daniel Neill, Beibei Li, and Vibhanshu Abhishek
B.S. Stanford University, Mathematical and Computational Sciences (MCS)
B.A. Stanford University, Economics (with honors).
As an undergraduate, I studied economics and policy, particularly around climate policy and energy markets. After graduating, I spent some time continuing economics research on coal markets and the behavioral impact of pricing structure in electricity, and also spent time as a researcher at the Natural Resources Defense Council (NRDC) and Union of Concerned Scientists (UCS) nonprofits. I came to Carnegie Mellon to pursue a PhD drawing on multiple disciplines, including computer science and economics.
During my PhD, I spent time as a part-time Data Creative staff member at DataKind in New York, NY, providing data science project brainstorming for potential partner organizations. I was also a 2016 Fellow at the Data Science for Social Good Summer Fellowship (DSSG) in Chicago, IL, where I worked on an interpretable prediction project for education support services.