My research focuses on business analytics, consumer behavior, targeted marketing strategies, text analysis, and innovation.
I aim to advance econometric methods tailored to business analytics in order to uncover actionable insights and drive innovation in marketing and management practice. To this end, I have proposed robust methods for large panel data models that address endogeneity in causal inference arising from correlated heterogeneity in consumer behavior, omitted variables, self-selection, and unobserved latent factors.
I also employ machine learning and text analysis to evaluate the effectiveness of information provision and to integrate structured and unstructured data for better forecasting. Together, these approaches allow me to examine rigorously how products and ideas are marketed, how intellectual property is managed, and how both interact with a dynamic economic environment.
Publication
Abstract: This paper considers a first-order autoregressive panel data model with individual-specific effects and heterogeneous autoregressive coefficients defined on the interval (-1,1], thus allowing for some of the individual processes to have unit roots. It proposes estimators for the moments of the cross-sectional distribution of the autoregressive (AR) coefficients, assuming a random coefficient model for the autoregressive coefficients without imposing any restrictions on the fixed effects. It is shown that the standard generalized method of moments estimators obtained under homogeneous slopes are biased. Small sample properties of the proposed estimators are investigated by Monte Carlo experiments and compared with a number of alternatives, both under homogeneous and heterogeneous slopes. It is found that a simple moment estimator of the mean of heterogeneous AR coefficients performs very well even for moderate sample sizes, but to reliably estimate the variance of AR coefficients much larger samples are required. It is also required that the true value of this variance is not too close to zero. The utility of the heterogeneous approach is illustrated in the context of earnings dynamics.
JEL classifications: C22, C23, C36
Keywords: Earnings dynamics, heterogeneous dynamic panels, neglected heterogeneity bias, short T panels
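As a reading aid, the model described in this abstract can be written out as follows. The notation (alpha_i, phi_i, u_it, n, T) is assumed here for illustration and may differ from the paper's own symbols.

```latex
y_{it} = \alpha_i + \phi_i \, y_{i,t-1} + u_{it}, \qquad \phi_i \in (-1, 1], \qquad i = 1, \dots, n, \quad t = 1, \dots, T,
```

Here alpha_i is the individual-specific effect and phi_i the heterogeneous AR coefficient; the objects of interest are moments of the cross-sectional distribution of phi_i, such as its mean E(phi_i) and variance Var(phi_i).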
Working Papers
Abstract: Endogeneity is a primary concern when evaluating causal effects using observational panel data. While unit-specific intercepts control for unobserved time-invariant confounders, dependence between (i) regressors (e.g., marketing mix strategy of interest) and the current error term (regressor endogeneity) and/or between (ii) regressors and heterogeneous slope coefficients (slope endogeneity) can introduce significant estimation bias, resulting in misleading inference. This paper proposes a two-stage copula endogeneity augmented mean group (2sCOPE-MG) estimator for panel data models, simultaneously addressing both endogeneity concerns. We generalize the IV-free copula control function, employing a general location Gaussian copula that effectively captures the panel structure. The heterogeneous coefficients are treated as unit-specific fixed parameters without distributional assumptions. Consequently, the 2sCOPE-MG estimator allows for arbitrary dependence structure between heterogeneous coefficients and regressors. Unlike Haschka (2022), 2sCOPE-MG requires neither a normal error distribution nor a Gaussian copula regressor-error dependence structure and is more robust, easier to implement, and capable of addressing slope endogeneity. The 2sCOPE-MG estimator is extended to dynamic panels, where intertemporal dependence in the outcome process can be suitably captured. We study its asymptotic properties and provide an analytical variance formula for inference without the need to bootstrap. For short dynamic panels, a Jackknife bias-corrected 2sCOPE-MG estimator is provided to ensure unbiased inference. The usage of the 2sCOPE-MG estimator is demonstrated by Monte Carlo simulations and a marketing mix response application across 21 categories to account for regressor and slope endogeneities in store-panel sales data.
Keywords: Control function, correlated random coefficients, Gaussian copula, heterogeneity, panel data, regressor endogeneity, slope endogeneity
Abstract: This paper investigates how information presentation affects Intellectual Property (IP) sales and prices in auctions. The authors examine how information salience and organization affect IP monetization by addressing two questions: (1) Does improved information presentation boost IP sales? (2) Does it influence the impacts of value drivers? To answer these questions, transaction data of 1,330 IPs sold by an auction house are used in a natural field experiment setup. IPs typically contain domain-specific text and face significant commercialization uncertainty. Therefore, the authors employ topic models to identify the most relevant information in auction catalogs that influences IP monetization. The paper concludes by discussing relevant implications for IP and auction managers.
Keywords: Information presentation, intellectual properties, IP auctions, natural field experiment, valuation, text analysis, topic models
Abstract: Under correlated heterogeneity, the commonly used two-way fixed effects estimator is biased and can lead to misleading inference. This paper proposes a new trimmed mean group (TMG) estimator which is consistent at the irregular rate of n^{1/3} even if the time dimension of the panel is as small as the number of its regressors. Extensions to panels with time effects are provided, and a Hausman-type test of correlated heterogeneity is proposed. Small sample properties of the TMG estimator (with and without time effects) are investigated by Monte Carlo experiments and shown to be satisfactory and perform better than other trimmed estimators proposed in the literature. The proposed test of correlated heterogeneity is also shown to have the correct size and satisfactory power. The utility of the TMG approach is illustrated with an empirical application.
JEL classifications: C21, C23
Keywords: Correlated heterogeneity, irregular estimators, two-way fixed effects, FE-TE, tests of correlated heterogeneity, calorie demand
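The bias from correlated heterogeneity mentioned in this abstract can be illustrated with a small simulation. The sketch below is not the TMG estimator: it compares a pooled fixed effects slope (unit effects only, no time effects) with a plain, untrimmed mean group average of unit-by-unit slopes. The data-generating process, parameter values, and function names are all assumptions made for illustration.

```python
import numpy as np

def simulate_panel(n=2000, T=10, seed=0):
    """Hypothetical DGP: slopes beta_i are correlated with the variance of x_it."""
    rng = np.random.default_rng(seed)
    sigma = rng.uniform(0.5, 2.0, size=n)              # unit-specific scale of x
    # E[sigma^2] = 1.75 for U(0.5, 2), so E[beta_i] = 1 by construction.
    beta = 1.0 + 0.5 * (sigma**2 - 1.75) + rng.normal(0.0, 0.2, size=n)
    alpha = rng.normal(size=n)                         # unit fixed effects
    x = alpha[:, None] + sigma[:, None] * rng.normal(size=(n, T))
    u = rng.normal(size=(n, T))
    y = alpha[:, None] + beta[:, None] * x + u
    return y, x, beta

def within_demean(a):
    """Subtract each unit's time average (removes unit fixed effects)."""
    return a - a.mean(axis=1, keepdims=True)

def fixed_effects_slope(y, x):
    """Pooled within estimator: a variance-weighted average of the beta_i."""
    yd, xd = within_demean(y), within_demean(x)
    return float((xd * yd).sum() / (xd**2).sum())

def mean_group_slope(y, x):
    """Untrimmed mean group: simple average of unit-by-unit OLS slopes."""
    yd, xd = within_demean(y), within_demean(x)
    unit_slopes = (xd * yd).sum(axis=1) / (xd**2).sum(axis=1)
    return float(unit_slopes.mean())

if __name__ == "__main__":
    y, x, beta = simulate_panel()
    print("average true slope:", beta.mean())
    print("fixed effects     :", fixed_effects_slope(y, x))
    print("mean group        :", mean_group_slope(y, x))
```

In this design the fixed effects estimator weights units by their within variation in x, which is correlated with the slopes, so it drifts away from the average slope while the mean group average does not. With T close to the number of regressors, unit-level slopes can be very noisy, which is the setting the trimming in the abstract is meant to handle.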
Abstract: This paper focuses on estimation and inference of the average effects in heterogeneous dynamic panel data models with weakly exogenous regressors when the number of cross-sectional units (n) is large and the number of time periods (T) is moderately short. We consider bias correction of mean group (MG) estimators by the split-panel Jackknife (JK). It is shown that the MG-JK estimator is root-n consistent as both n and T tend to infinity and n/T^{4} converges to zero once the first-order bias is eliminated. For the validity of the Jackknife in finite samples, a sufficient condition for the r-th moment existence of the MG estimator is derived for an ARX(1) model. Moreover, the paper revisits the long dispute over the disemployment effects of minimum wages. The MG-JK estimator addresses short-term dynamic heterogeneity in the outcome processes and treatment effects of continuous variables. In contrast to the two-way fixed effects estimator, it does not rely on the parallel trend assumption or a staggered treatment design. Using the MG-JK estimators and county-level data in the United States during 2002–2011, it is found that there is a close to zero effect of minimum wages on total employment but a substantial negative impact on teenage employment in the long run.
JEL classifications: C13, C22, C23, J38, J88
Keywords: Dynamic panels, individual heterogeneity, incidental parameter problem, mean group, bias reduction, split-panel Jackknife, minimum wage policy
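The bias correction described in this abstract can be sketched with a half-panel jackknife applied to a mean group estimator. The example assumes a pure AR(1) model with no additional regressors and two equal-length half-panels; the paper's ARX setting and its exact split scheme may differ. Function names and parameter values are hypothetical.

```python
import numpy as np

def simulate_ar1_panel(n=2000, T=20, burn=50, seed=0):
    """Hypothetical DGP: y_it = alpha_i + phi_i * y_i,t-1 + u_it, phi_i ~ U(0.2, 0.6)."""
    rng = np.random.default_rng(seed)
    phi = rng.uniform(0.2, 0.6, size=n)
    alpha = rng.normal(size=n)
    y = alpha / (1.0 - phi)                  # start each unit at its long-run mean
    out = np.empty((n, T))
    for t in range(burn + T):
        y = alpha + phi * y + rng.normal(size=n)
        if t >= burn:                        # keep only the last T periods
            out[:, t - burn] = y
    return out, phi

def unit_ar1_slopes(Y):
    """Unit-by-unit OLS slope of y_t on an intercept and y_{t-1}."""
    lag, cur = Y[:, :-1], Y[:, 1:]
    lag_d = lag - lag.mean(axis=1, keepdims=True)
    cur_d = cur - cur.mean(axis=1, keepdims=True)
    return (lag_d * cur_d).sum(axis=1) / (lag_d**2).sum(axis=1)

def mean_group(Y):
    """Mean group estimator: average of the unit-specific AR slopes."""
    return float(unit_ar1_slopes(Y).mean())

def mean_group_jackknife(Y):
    """Half-panel jackknife: 2 * full-sample MG minus the average of the two half-sample MGs."""
    half = Y.shape[1] // 2
    first, second = mean_group(Y[:, :half]), mean_group(Y[:, half:])
    return 2.0 * mean_group(Y) - 0.5 * (first + second)

if __name__ == "__main__":
    Y, phi = simulate_ar1_panel()
    print("average true AR coefficient:", phi.mean())
    print("mean group                 :", mean_group(Y))
    print("mean group, jackknife      :", mean_group_jackknife(Y))
```

The idea is that the leading small-T bias of each unit-level estimate is of order 1/T, so it is roughly twice as large in each half-panel; the linear combination above cancels that first-order term at the cost of some extra variance.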