Published Papers:
Margaritella, L., Westerlund, J., "Estimating Aggregate Relationships in Panel Data via the LASSO", 2025, Oxford Bulletin of Economics and Statistics, Accepted/In Press
Abstract:
This paper is concerned with the estimation of aggregate relationships among a potentially very large number of panel data variables in the presence of unobserved heterogeneity in the form of interactive effects, an empirically very relevant scenario that has not been considered before. One of our findings is that if the regressors load on the same set of latent factors as the dependent variable, which seems a priori likely since many variables are co-moving, the aggregation automatically accounts for the unobserved heterogeneity. In order to also account for the many regressors, the aggregate model is estimated using LASSO, leading to the “Cross-sectionally Averaged aDAptive LASSO” (CADA-LASSO). It is shown that under suitable regularity conditions, the new estimator is oracle efficient and selection consistent, properties that are verified in small samples using Monte Carlo simulations. The empirical usefulness of the estimator is illustrated using as an example the gravity equation of trade.
Krampe, J., Margaritella, L., "Factor Models with Sparse Vector Autoregressive Idiosyncratic Components", 2025, Oxford Bulletin of Economics and Statistics, 87 (4), 837-849
Abstract:
We reconcile dense and sparse modeling by exploiting the positive aspects of both. We employ a high-dimensional, approximate static factor model and assume the idiosyncratic term follows a sparse vector autoregressive model (VAR). The estimation is articulated in two steps: (i) factors and loadings are estimated via principal component analysis (PCA); (ii) a sparse VAR is estimated via the lasso on the estimated idiosyncratic components from (i). Step (ii) allows us to model the cross-sectional and time dependence left after factor estimation. We prove the consistency of this approach as the time and cross-sectional dimensions diverge. In (ii), sparsity is allowed to be very general: approximate, row-wise and growing with the sample size. However, the estimation error of (i) needs to be accounted for. Instead of simply plugging in the standard rates derived for the PCA estimation of the factors in (i), we derive a refined expression of the error which enables us to derive tighter rates for the lasso in (ii). We discuss applications to forecasting and factor-augmented regression and present an empirical application on macroeconomic forecasting using the Federal Reserve Economic Data - Monthly Database (FRED-MD).
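For readers who want to see the two steps concretely, here is a minimal Python sketch on simulated data. It is an illustration under stated assumptions, not the authors' implementation: the function names (`lasso_cd`, `factor_sparse_var`), the VAR(1) lag order, the fixed penalty `lam`, and the simulated design are all hypothetical choices, and the lasso is solved by a plain coordinate descent with a fixed number of sweeps rather than a tuned solver.

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator used in the lasso coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_cd(Z, y, lam, n_iter=200):
    """Lasso by cyclic coordinate descent for (1/2n)||y - Zb||^2 + lam*||b||_1.

    Runs a fixed number of sweeps (no convergence check), for illustration only.
    """
    n, p = Z.shape
    b = np.zeros(p)
    col_ss = (Z ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()          # residual y - Z b, with b = 0 initially
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue
            resid += Z[:, j] * b[j]         # partial residual without coordinate j
            rho = Z[:, j] @ resid / n
            b[j] = soft_threshold(rho, lam) / col_ss[j]
            resid -= Z[:, j] * b[j]         # restore full residual with updated b[j]
    return b

def factor_sparse_var(X, r, lam):
    """Step (i): PCA factors and loadings; step (ii): row-wise lasso VAR(1) on residuals."""
    T, N = X.shape
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    F_hat = np.sqrt(T) * U[:, :r]           # normalisation F'F / T = I_r
    L_hat = X.T @ F_hat / T                 # loadings by least squares on the factors
    xi_hat = X - F_hat @ L_hat.T            # estimated idiosyncratic components
    Y, Z = xi_hat[1:], xi_hat[:-1]          # regress xi_t on xi_{t-1}
    A_hat = np.vstack([lasso_cd(Z, Y[:, i], lam) for i in range(N)])
    return F_hat, L_hat, A_hat

# Simulated example (assumed design): one factor, diagonal (hence sparse) VAR(1).
rng = np.random.default_rng(0)
T, N, r = 300, 10, 1
A = 0.5 * np.eye(N)
xi = np.zeros((T, N))
for t in range(1, T):
    xi[t] = A @ xi[t - 1] + rng.standard_normal(N)
F = rng.standard_normal((T, r))
Lam = rng.standard_normal((N, r))
X = F @ Lam.T + xi
X = X - X.mean(axis=0)                      # demean each series before PCA

F_hat, L_hat, A_hat = factor_sparse_var(X, r=r, lam=0.05)
```

In practice the number of factors and the penalty would be chosen from the data (for example by information criteria or cross-validation); both are fixed here only to keep the sketch short.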
Margaritella, L., Sessinou, R., "Precision Least Squares: Estimation and Inference in High-Dimensions", 2024, Journal of Business & Economic Statistics, Accepted/In Press
Abstract:
The least squares estimator can be cast as depending only on the precision matrix of the data, similar to the weights of a global minimum variance portfolio. We give conditions under which any plug-in precision matrix estimator produces an unbiased and consistent least squares estimator for stationary time series regressions, in both low- and high-dimensional settings. Such conditions define a class of “Precision Least Squares” (PrLS) estimators, which are shown to be approximately Gaussian, efficient, and to provide automatic family-wise error control in large samples. For estimating high-dimensional sparse regression models, we propose a LASSO Cholesky estimator of the plug-in precision matrix. We show its consistency and how to properly bias correct it, thereby obtaining a LASSO Cholesky-based PrLS (LC-PrLS) estimator. LC-PrLS performs well in finite samples and better than state-of-the-art high-dimensional estimators. We employ LC-PrLS to investigate the dynamic network of predictive connections among a large set of global bank stock returns. We find that crisis years correspond to a collapse of predictive linkages.
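The first sentence of this abstract can be checked numerically. The sketch below is a small illustration of the textbook identity only, not of the LC-PrLS estimator proposed in the paper: if Θ denotes the inverse of the sample covariance matrix of the stacked vector (y, x), the OLS slopes equal -Θ_xy / Θ_yy. The function name `ols_from_precision` and the simulated design are assumptions made for this example.

```python
import numpy as np

def ols_from_precision(y, X):
    """OLS slopes (intercept partialled out) read off the precision matrix of (y, X)."""
    Z = np.column_stack([y, X])
    S = np.cov(Z, rowvar=False)             # joint sample covariance of (y, X)
    Theta = np.linalg.inv(S)                # plug-in precision matrix (low-dimensional case)
    return -Theta[1:, 0] / Theta[0, 0]      # beta = -Theta_xy / Theta_yy

# Simulated low-dimensional regression (assumed design).
rng = np.random.default_rng(0)
n, p = 500, 4
X = rng.standard_normal((n, p))
beta = np.array([1.0, -0.5, 0.0, 2.0])
y = X @ beta + 0.1 * rng.standard_normal(n)

b_prec = ols_from_precision(y, X)

# Reference: ordinary least squares on demeaned data.
Xc = X - X.mean(axis=0)
yc = y - y.mean()
b_ols, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
```

Here the sample covariance is simply inverted, which is only possible when the number of regressors is small relative to the sample size; in high dimensions the inverse would have to be replaced by a regularised precision matrix estimator.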
Margaritella, L., Westerlund, J., "Using Information Criteria to Select Averages in CCE", 2023, The Econometrics Journal, 26 (3), 405-421
Abstract:
In the interactive effects panel data literature information criteria are commonly used to consistently determine which of the estimated principal components factors to include. The present paper shows that the same approach can be applied to factors estimated by taking the cross-sectional averages of the observables, as prescribed by the popular common correlated effects (CCE) approach. This should be useful to practitioners, because at the moment there is no other theory that justifies the use of information criteria in CCE.
Hecq, A., Margaritella, L., Smeekes, S., "Granger Causality Testing in High-Dimensional VARs: a Post-Double-Selection Procedure", 2023, Journal of Financial Econometrics, 21 (3), 915-958 [Data]
Abstract:
We develop an LM test for Granger causality in high-dimensional VAR models based on penalized least squares estimations. To obtain a test retaining the appropriate size after the variable selection done by the lasso, we propose a post-double-selection procedure to partial out effects of nuisance variables and establish its uniform asymptotic validity. We conduct an extensive set of Monte-Carlo simulations that show our tests perform well under different data generating processes, even without sparsity. We apply our testing procedure to find networks of volatility spillovers and we find evidence that causal relationships become clearer in high-dimensional compared to standard low-dimensional VARs.
Discussion Papers:
Margaritella, L., Stauskas, O., "New Tests of Equal Forecast Accuracy for Factor-Augmented Regressions with Weaker Loadings", 2024 (Click here for the arXiv version) R&R
Abstract:
We provide the theoretical foundation for the recently proposed tests of equal forecast accuracy and encompassing by Pitarakis (2023a) and Pitarakis (2023b), when the competing forecast specification is that of a factor-augmented regression model, whose loadings are allowed to be homogeneously/heterogeneously weak. This should be of interest for practitioners, as at the moment there is no theory available to justify the use of these simple and powerful tests in such context.
Abstract:
We revisit the problem of estimating high-dimensional global bank network connectedness. Instead of directly regularizing the high-dimensional vector of realized volatilities as in Demirer et al. (2018), we estimate a dynamic factor model with sparse VAR idiosyncratic components. This allows us to disentangle: (I) the part of system-wide connectedness (SWC) due to the common component shocks (what we call the "banking market"), and (II) the part due to the idiosyncratic shocks (the single banks). We employ both the original dataset as in Demirer et al. (2018) (daily data, 2003-2013), as well as a more recent vintage (2014-2023). For both, we compute SWC due to (I), (II), (I+II) and provide bootstrap confidence bands. In accordance with the literature, we find SWC to spike during global crises. However, our method minimizes the risk of SWC underestimation in high-dimensional datasets where episodes of systemic risk can be both pervasive and idiosyncratic. In fact, we are able to show that in normal times ≈60-80% of SWC is due to idiosyncratic variation and only ≈20-40% to market variation. However, in crisis periods such as the 2008 financial crisis and the Covid-19 outbreak in 2019, the situation is completely reversed: SWC is comparatively more driven by a market dynamic and less by an idiosyncratic one.
Margaritella, L., Smeekes, S., Friedrich, M., "High-Dimensional Causality for Climatic Attribution", 2023 (Click here for the arXiv version) (R Scripts and Data)
Abstract:
In this paper we test for Granger causality in high-dimensional vector autoregressive models (VARs) to disentangle and interpret the complex causal chains linking radiative forcings and global temperatures. By allowing for high dimensionality in the model we can enrich the information set with all relevant natural and anthropogenic forcing variables to obtain reliable causal relations. These variables have mostly been investigated in an aggregated form or in separate models in the previous literature. Additionally, our framework allows us to ignore the order of integration of the variables and to directly estimate the VAR in levels, thus avoiding accumulating biases coming from unit-root and cointegration tests. This is of particular appeal for climate time series, which are well known to contain stochastic trends and to exhibit long memory. We are thus able to display the causal networks linking radiative forcings to global temperatures but also to causally connect radiative forcings among themselves, therefore allowing for a careful reconstruction of a timeline of causal effects among forcings. The robustness of our proposed procedure makes it an important tool for policy evaluation in tackling global climate change.
Hecq, A., Margaritella, L., Smeekes, S., "Inference in Non-stationary High-Dimensional VARs", 2023 (Click here for the arXiv version) (R package: HDGCvar)
Abstract:
In this paper we construct an inferential procedure for Granger causality in high-dimensional non-stationary vector autoregressive (VAR) models. Our method does not require knowledge of the order of integration of the time series under consideration. We augment the VAR with at least as many lags as the suspected maximum order of integration, an approach which has been proven to be robust against the presence of unit roots in low dimensions. We prove that we can restrict the augmentation to only the variables of interest for the testing, thereby making the approach suitable for high dimensions. We combine this lag augmentation with a post-double-selection procedure in which a set of initial penalized regressions is performed to select the relevant variables for both the Granger causing and caused variables. We then establish uniform asymptotic normality of a second-stage regression involving only the selected variables. Finite sample simulations show good performance, and an application investigating the (predictive) causes and effects of economic uncertainty illustrates the need to allow for unknown orders of integration.
Current Research Projects:
Margaritella, L., Margaritella, N., "High-dimensional Lag-Length Selection: a Bayesian functional PCA approach."
Cubadda, G., Margaritella, L., Prifti, O., "Broken Adaptive Ridge Regression in Time Series: the TS-BAR"
R package: HDGCvar is an R package for testing Granger causality in high-dimensional stationary/non-stationary VAR models (check the latest version on my GitHub).
Software: I am an R user (you can check my GitHub Repo here). For teaching and during my studies I often made use of: Python, Matlab, Stata, SAS, EViews, SPSS.
Grants:
Basic Research Fund (BI NBS, joint with O. Stauskas)
Knut och Alice Wallenbergs stiftelse (REh2024-0001)
Stiftelsen Landshövding Per Westlings minnesfond (Reh2022-0010; REh2024-0001)
Research in Pairs Grant (London Mathematical Society LMS), 19/06/2022-26/06/2022, joint with N. Margaritella
Berge Stipendiat, 18/05/2022.