The spark that led to the creation of our team one year later came in 2010, when Dr. Gogas (economist) and Dr. Papadimitriou (mathematician/electrical engineer) discovered a common interest in applying methodologies from other fields to purely economic problems. Our initial effort was to forecast financial time series using machine learning techniques, specifically Support Vector Machines for classification and regression.
The Support Vector Machine (SVM) is a binary supervised machine learning classifier. Proposed by Cortes and Vapnik (1995), the method's basic notion is to find a linear separator (a hyperplane) between the two classes in the so-called feature space: with the use of a kernel function, the original data space is projected into a higher-dimensional space, the feature space. By solving a convex minimization problem, the algorithm converges to the linear separator with the largest margin between the two classes. The margin is defined by a subset of data points called support vectors. An example of an SVM classification employing the RBF kernel is depicted in Figure 6.
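The idea above can be sketched in a few lines. This is an illustrative example on a toy two-class dataset, not the setup used in our studies:

```python
# Illustrative sketch: a binary SVM with an RBF kernel, as described in the
# text, fitted on a toy two-class dataset (not the authors' actual data).
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Two interleaving half-moons: not linearly separable in the original space.
X, y = make_moons(n_samples=200, noise=0.15, random_state=0)

# The RBF kernel implicitly projects the data into a higher-dimensional
# feature space, where the algorithm finds the maximum-margin hyperplane.
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)

print("training accuracy:", clf.score(X, y))
print("support vectors per class:", clf.n_support_)
```

Re-projected back into the two original dimensions, the fitted separator is a non-linear boundary of exactly the kind shown in Figure 6.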
Figure 6: An example of an SVM classification using the RBF kernel. The two classes are separated by a linear separator in a higher-dimensional space called the feature space. When re-projected back into the original dimensions, the separator becomes a non-linear function of the initial data space. The encircled observations are the support vectors defining the decision boundary, and the observations marked with squares are misclassified.
We explored the problem of forecasting bank failures using the SVM methodology with great success. Our dataset included 1,443 U.S. banks: 962 solvent banks and 481 banks that failed during the period 2003-2013. We started with an initial set of 144 variables drawn from the banks' publicly available financial statements. Next, we employed a variable selection methodology called Local Learning and retained only 5 of the original 144 variables. These 5 variables were then used as explanatory variables to train the optimal SVM model, which produced an out-of-sample forecasting accuracy of 97.67% in a one-period-ahead (one-year) forecasting window. With forecasting windows of two and three years ahead, accuracy fell to 83.48% and 64.35%, respectively. The out-of-sample forecasting accuracy was 97.39% for the failed banks and 97.81% for the solvent banks. These results are summarized in Table 2.
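The two-stage pipeline above (select a handful of variables, then train an SVM on them) can be sketched as follows. Local Learning is not available in scikit-learn, so a generic univariate selector stands in for it here, and the data are synthetic stand-ins for the bank dataset:

```python
# Hedged sketch of the select-then-classify pipeline. SelectKBest is a
# stand-in for the Local Learning variable selection method; the dataset is
# synthetic, sized like the bank sample (1443 banks, 144 candidate variables).
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = make_classification(n_samples=1443, n_features=144, n_informative=5,
                           weights=[962 / 1443], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

pipe = Pipeline([
    ("select", SelectKBest(f_classif, k=5)),   # keep 5 of the 144 variables
    ("svm", SVC(kernel="rbf", C=1.0)),         # train the SVM on them alone
])
pipe.fit(X_tr, y_tr)
print("out-of-sample accuracy:", pipe.score(X_te, y_te))
```

Selecting variables before training keeps the classifier interpretable and guards against overfitting 144 candidate regressors to a sample of this size.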
As a follow-up, we investigated the three missed insolvent cases: a) in the case of The First National Bank of Davis (Davis, Oklahoma), significant unrecognized losses exceeding the bank's capital and allowance for loan and lease losses were discovered during the 2011 examination; b) The First State Bank received an enforcement action on 21 December 2010 and was closed very shortly afterwards, on 28 January 2011; and c) The Community's Bank was established in 2001 and failed in 2013. The same investigation of the five misclassified solvent cases revealed that these financial institutions had received an enforcement action from the FDIC or financial help from the U.S. Treasury.
Moreover, we propose using the previously defined hyperplane separating solvent from insolvent banks as the basis for a new form of bank stress testing. Solvent banks that lie close to the separating hyperplane are more vulnerable to changing class (becoming insolvent); a solvent bank's distance from the hyperplane may therefore serve as a measure of its robustness. On this basis, multi-level stress tests can be performed: a) system-wide, where alternative scenarios on various macroeconomic and/or financial indicators are reflected in the five explanatory variables from the banks' financial statements used for classification, and b) institution-specific, based on each bank's financial condition. Furthermore, for a bank forecasted to fail within the next year, the distance from the hyperplane can be exploited in time to prescribe a set of measures that would render the bank solvent.
Our proposition is also equipped with a sensitivity analysis of each distressed bank with respect to the five variables that define its position in the data space.
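Both ideas, the margin score as a robustness measure and the per-variable sensitivity, can be sketched together. This is a hedged illustration on synthetic data; note that `decision_function` returns a signed margin score rather than an exact geometric distance, and the five variables here are hypothetical stand-ins for the selected financial ratios:

```python
# Hedged sketch of the stress-testing idea: a bank's signed margin score from
# the SVM separating hyperplane as a robustness measure, plus a finite-
# difference sensitivity of that score to each of the five variables.
# Data are synthetic; the variables stand in for the five selected ratios.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=5, n_informative=5,
                           n_redundant=0, random_state=1)
clf = SVC(kernel="rbf", gamma="scale").fit(X, y)

bank = X[0:1]                            # one (synthetic) bank
score = clf.decision_function(bank)[0]   # signed distance-like margin score
print("margin score:", score)

# Sensitivity: how the margin score reacts to a small shock in each variable.
eps = 1e-3
for j in range(X.shape[1]):
    shocked = bank.copy()
    shocked[0, j] += eps
    sens = (clf.decision_function(shocked)[0] - score) / eps
    print(f"d(score)/d(x{j}) = {sens:.3f}")
```

A system-wide scenario would shock the same variables for every bank at once; an institution-specific test shocks only one bank's row, as above.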
"Forecasting Energy Markets using Support Vector Machines", Energy Economics, 44, pp. 135-142, July 2014.
We also investigated the efficiency of an SVM-based model in forecasting the next-day directional change of highly volatile electricity prices. We first fitted the best autoregressive SVM model and then augmented it with various relevant variables. The model was tested on the daily Phelix index of the German and Austrian control area of the European Energy Exchange (EEX) wholesale electricity market, achieving an out-of-sample forecasting accuracy of 76.12% over a 200-day period.
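The autoregressive part of such a model can be sketched as follows: lagged price changes are the only regressors, and the target is the sign of the next day's change. This is a minimal illustration on a synthetic price series with an assumed lag order of 5, not the fitted EEX model:

```python
# Minimal sketch of an autoregressive SVM for next-day directional change.
# Assumptions: a synthetic random-walk price series and 5 lags; the real
# model's lag order and accuracy are not reproduced here.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
prices = np.cumsum(rng.normal(0, 1, 600)) + 100   # synthetic daily prices
returns = np.diff(prices)                          # daily price changes

lags = 5
# Row k holds returns[k..k+4]; the target is the sign of returns[k+5].
X = np.column_stack([returns[i:len(returns) - lags + i] for i in range(lags)])
y = (returns[lags:] > 0).astype(int)               # 1 = next-day price rises

split = len(X) - 200                               # hold out the last 200 days
clf = SVC(kernel="rbf", gamma="scale").fit(X[:split], y[:split])
acc = clf.score(X[split:], y[split:])
print("out-of-sample directional accuracy:", acc)
```

On a pure random walk the expected accuracy is about 50%; the augmented structural variables are what lift the real model above that benchmark.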
We further explored the efficiency of the SVM methodology in the electricity market by forecasting "price spikes". In electricity markets, comparatively large upward or downward price movements within a short period are called spikes; in effect, forecasting price spikes amounts to outlier detection. The problem was attacked using multiclass SVM and deep learning architectures. The forecasting accuracy of the optimal model is reported in Table 3.
"Forecasting Daily and Monthly Exchange Rates with Machine Learning Techniques", Journal of Forecasting, forthcoming.
We tested and compared the forecasting ability of various machine learning, neural network, and econometric architectures on monthly and daily spot prices of five selected exchange rates: EUR/USD, JPY/USD, NOK/AUD, NZD/BRL and PHP/ZAR. In doing so, we combined a novel smoothing technique (originally applied in signal processing) with a variable selection methodology, feeding the selected regressors into the estimation methodologies. After decomposing each original exchange rate series into a smoothed and a fluctuating component using the Ensemble Empirical Mode Decomposition (EEMD) method, we used Multivariate Adaptive Regression Splines (MARS) to select the most appropriate variable set from the very large set of explanatory variables we had collected. The selected variables were then fed into the forecasting models, which produce one-period-ahead forecasts for each of the two components.
We implemented two versions of this hybrid methodological setup: an autoregressive one and a structural one. The autoregressive model consistently outperforms all alternative models in out-of-sample forecasting at the monthly frequency. Using daily data, the structural EEMD-MARS-SVR model is superior for three of the five exchange rates in our sample; a close alternative is the structural EEMD-MARS-NN (neural network) model, which outperforms on the other two. Nevertheless, only the structural EEMD-MARS-SVR methodology consistently outperforms the benchmark random walk model in out-of-sample forecasting. These findings corroborate the microstructural aspect of the exchange rate market in the short run and the importance of macroeconomic dynamics (such as PPP, UIP, etc.) in the long run, validating exchange rate theory. Overall, our models appear to capture the different data-generating processes at short and long horizons. The weak form of exchange rate market efficiency is rejected at both sampling frequencies, since the autoregressive model outperforms the RW benchmark.
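The decompose-then-forecast structure of the hybrid setup can be sketched as follows. EEMD and MARS require specialized packages, so in this hedged illustration a simple moving average stands in for the decomposition and plain lagged SVR models stand in for the component forecasters; the exchange rate series is synthetic:

```python
# Hedged sketch of the hybrid decompose-then-forecast idea. Stand-ins: a
# moving average replaces EEMD for the smoothed/fluctuating split, and lagged
# SVR models replace the MARS-selected structural regressors.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(42)
rate = 1.10 + np.cumsum(rng.normal(0, 0.002, 400))  # synthetic FX series

w = 10
smooth = np.convolve(rate, np.ones(w) / w, mode="valid")  # smoothed component
fluct = rate[w - 1:] - smooth                             # fluctuating part

def one_step_forecast(series, lags=5):
    """Fit an autoregressive SVR and forecast one period ahead."""
    X = np.column_stack([series[i:len(series) - lags + i]
                         for i in range(lags)])
    y = series[lags:]
    model = SVR(kernel="rbf", C=1.0).fit(X, y)
    return model.predict(series[-lags:].reshape(1, -1))[0]

# One-period-ahead forecast = sum of the two component forecasts.
forecast = one_step_forecast(smooth) + one_step_forecast(fluct)
print("one-step-ahead forecast:", forecast)
```

Forecasting each component separately is the point of the design: the smoothed part carries the slow macro dynamics, the fluctuating part the short-run noise, and the final forecast recombines them additively.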
"Yield Curve Point Triplets and Recession Forecasting", International Finance, forthcoming.
Several studies have highlighted the yield curve's ability to forecast economic activity. These studies use the information provided by the slope of the yield curve, i.e., pairs of short- and long-term interest rates. We constructed three models to forecast the positive and negative deviations of real U.S. GDP from its long-run trend over the period 1976Q3 to 2011Q4: one that uses only pairs of interest rates and two that draw on more than two points from the yield curve. We employed two alternative forecasting methodologies: the probit model, which is commonly used in this line of literature, and the support vector machines (SVM) approach from the area of machine learning. Our empirical results show that the SVM model with the RBF kernel and three interest rates as input variables achieves a 100% out-of-sample forecasting accuracy for recessions (unemployment gaps), with a best overall accuracy of 80%. It thus appears that correct identification of upcoming unemployment gaps can be achieved at the cost of reduced accuracy in forecasting inflationary gaps, or, in other words, at the cost of some extra inflation. The interest rates that produce these results are the 3-month T-bill rate and the 2- and 3-year government bond rates; long-term rates do not appear to complement our models' forecasting ability. Our interpretation of this finding is that the U.S., as a developed industrialized country, is considered to have a stable long-term economic outlook that adheres to its long-run potential output and is not affected by short-term dynamics and fluctuations. Agents' views of future economic activity are therefore not significantly affected by short-term events, rendering the long-term rates uninformative for short-term recession forecasting.