Data augmentation methods have been shown to be a fundamental technique to improve generalization in tasks such as image, text and audio classification. Recently, automated augmentation methods have led to further improvements on image classification and object detection, leading to state-of-the-art performance. Nevertheless, little work has been done on time-series data, an area that could greatly benefit from automated data augmentation given the usually limited size of the datasets. We present two sample-adaptive automatic weighting schemes for data augmentation: the first learns to weight the contribution of the augmented samples to the loss, and the second selects a subset of transformations based on the ranking of the predicted training loss. We validate our proposed methods on a large, noisy financial dataset and on time-series datasets from the UCR archive. On the financial dataset, we show that the methods in combination with a trading strategy lead to improvements in annualized returns of over 50%, and on the time-series data we outperform state-of-the-art models on over half of the datasets and achieve similar accuracy on the others.
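As a rough illustration of the two schemes described above, the following minimal sketch shows how per-sample weights could enter the training loss and how a subset of transformations could be picked from a ranking of predicted losses. The function names, the transformation names and the default of keeping the highest predicted losses are assumptions made for this example. In the actual method the weights and the loss predictions are learned, whereas here they are passed in as plain numbers.

```python
def weighted_augmentation_loss(losses, weights):
    """Weighted average of per-sample losses.

    losses  -- loss of each (original or augmented) sample
    weights -- non-negative weight of each sample (hypothetical values;
               a real model would learn them)
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * l for w, l in zip(weights, losses)) / total


def select_transformations(predicted_losses, k, highest=True):
    """Keep k transformations according to their predicted training loss.

    predicted_losses -- dict mapping transformation name to predicted loss
    highest          -- assumption: rank from highest to lowest loss;
                        set to False to rank from lowest to highest
    """
    ranked = sorted(predicted_losses, key=predicted_losses.get, reverse=highest)
    return ranked[:k]
```

For example, `select_transformations({"jitter": 0.9, "scaling": 0.4, "time_warp": 0.7}, k=2)` keeps `jitter` and `time_warp` under the default ranking direction.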
Deep Learning (DL) has provided powerful tools for forecasting financial time series data. However, despite the success of these approaches on many challenging financial forecasting tasks, it is not always straightforward to employ DL-based approaches for highly volatile and non-stationary financial time series. To this end, we propose an adaptive input normalization layer that can learn to identify the distribution from which the input data were generated and then apply the most appropriate normalization scheme. This allows for promptly adapting the input to the subsequent DL model, which can be especially important given recent findings that hint at the existence of critical learning periods in neural networks. Furthermore, the proposed method operates on a sliding window over the time series, which helps to overcome the non-stationarity issues that often arise. It is worth noting that the main difference from existing approaches is that the proposed method does not just learn to perform static normalization, e.g., using a fixed set of parameters, but instead adaptively calculates the most appropriate normalization parameters, significantly improving the robustness of the proposed approach when distribution shifts occur. The effectiveness of the proposed formulation is verified using extensive experiments on three challenging financial time-series datasets.
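The difference between static and adaptive normalization can be sketched as follows. This is a simplified illustration and not the layer proposed in the paper: the shift and scale are computed from each input window, and the per-feature weights `shift_weights` and `scale_weights` (hypothetical names) stand in for parameters that a real layer would learn by back-propagation.

```python
import math


def adaptive_normalize(window, shift_weights, scale_weights, eps=1e-8):
    """Normalize one sliding window using statistics of that window.

    window        -- list of time steps, each a list of feature values
    shift_weights -- one weight per feature applied to the window mean
                     (assumed to be learned in a real layer)
    scale_weights -- one weight per feature applied to the window deviation
                     (assumed to be learned in a real layer)
    """
    n_steps = len(window)
    n_features = len(window[0])
    normalized = [[0.0] * n_features for _ in range(n_steps)]
    for f in range(n_features):
        column = [row[f] for row in window]
        # Shift depends on the current window, not on a fixed global mean.
        shift = shift_weights[f] * (sum(column) / n_steps)
        centered = [value - shift for value in column]
        # Scale is computed from the shifted values of the same window.
        deviation = math.sqrt(sum(c * c for c in centered) / n_steps)
        scale = scale_weights[f] * deviation + eps
        for t in range(n_steps):
            normalized[t][f] = centered[t] / scale
    return normalized
```

With all weights set to 1 this reduces to per-window z-score normalization, so features on very different scales (for example prices and volumes) end up in a comparable range for every window.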
We introduce a feasible and practical Bayesian method for unit root testing in financial time series. Specifically, we propose a convenient approximation of the Bayes factor in terms of the Bayesian Information Criterion as a straightforward and effective strategy for testing the unit root hypothesis. Our approximate approach relies on few assumptions, is of general applicability, and preserves a satisfactory error rate. Among its advantages, it does not require a prior distribution on the model's parameters to be specified. Our simulation study and empirical application on real exchange rates show close agreement between the suggested simple approach and both Bayesian and non-Bayesian alternatives.
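The general idea of approximating a Bayes factor with the Bayesian Information Criterion (BIC) can be sketched in a few lines. The sketch uses the standard BIC-based approximation from the model-selection literature; the exact form used in the paper may differ, and the function names and the labelling of the two models as "null" and "alternative" are assumptions for this example.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion of a fitted model."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def approx_bayes_factor(bic_null, bic_alt):
    """Approximate Bayes factor in favour of the null model.

    Values above 1 favour the null (e.g. the unit root model),
    values below 1 favour the alternative. No prior on the model
    parameters is needed, only the two BIC values.
    """
    return math.exp((bic_alt - bic_null) / 2.0)
```

In practice the two log-likelihoods would come from fitting the unit root model and the stationary alternative to the same series.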
Stock classification is a challenging task due to the high levels of noise and volatility of stock returns. We show that transfer learning can help with this task, by pre-training a model to extract universal features on the full universe of stocks of the S&P500 index and then transferring it to another model to directly learn a trading rule. Transferred models present more than double the risk-adjusted returns of their counterparts trained from scratch. In addition, we propose the use of data augmentation on the feature space defined as the output of a pre-trained model (i.e. augmenting the aggregated time-series representation). We compare this augmentation approach with the standard one, i.e. augmenting the time series in the input space. We show that augmentation methods on the feature space lead to a 20% increase in risk-adjusted return compared to a model trained with transfer learning but without augmentation.
Data augmentation methods in combination with deep neural networks have been used extensively in computer vision on classification tasks, achieving great success; however, their use in time series classification is still at an early stage. This is even more so in the field of financial prediction, where data tends to be small, noisy and non-stationary. In this paper we evaluate several augmentation methods applied to stock datasets using two state-of-the-art deep learning models. The results show that several augmentation methods significantly improve financial performance when used in combination with a trading strategy. For a relatively small dataset (≈30K samples), augmentation methods achieve up to 400% improvement in risk-adjusted return performance; for a larger stock dataset (≈300K samples), results show up to 40% improvement.
Managing the prediction of metrics in high-frequency financial markets is a challenging task. An efficient way to do this is to monitor the dynamics of a limit order book to identify the information edge. This paper describes the first publicly available benchmark dataset of high-frequency limit order markets for mid-price prediction. We extracted normalized data representations of time series data for five stocks from the NASDAQ Nordic stock market for a time period of ten consecutive days, leading to a dataset of ~4,000,000 time series samples in total. A day-based anchored cross-validation experimental protocol is also provided that can be used as a benchmark for comparing the performance of state-of-the-art methodologies. The performance of baseline approaches is also provided to facilitate experimental comparisons. We expect that such a large-scale dataset can serve as a testbed for devising novel solutions of expert systems for high-frequency limit order book data analysis.
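A day-based anchored cross-validation protocol can be sketched as below. The sketch assumes the usual meaning of "anchored" (the training set always starts at the first day and grows by one day per fold, and the following day is used for testing); the exact fold layout of the benchmark should be taken from the paper, and the function name is hypothetical.

```python
def anchored_day_splits(n_days):
    """Return (train_days, test_day) pairs for anchored cross-validation.

    Days are numbered from 1. Fold k trains on days 1..k and tests on
    day k + 1, so no future data leaks into training.
    """
    splits = []
    for k in range(1, n_days):
        train_days = list(range(1, k + 1))
        test_day = k + 1
        splits.append((train_days, test_day))
    return splits
```

For the ten-day period mentioned above, `anchored_day_splits(10)` yields nine folds under this assumption, the last one training on days 1 to 9 and testing on day 10.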
Check out our methodologies for time-series analysis (applied to Financial Data)
Financial time-series analysis and forecasting have been extensively studied over the past decades, yet remain a very challenging research topic. Since the financial market is inherently noisy and stochastic, the majority of financial time series of interest are non-stationary and are often obtained from different modalities. This property presents great challenges and can significantly affect the performance of the subsequent analysis and forecasting steps. The Temporal Attention augmented Bilinear Layer (TABL), which we proposed recently, has shown strong performance in tackling financial forecasting problems. By taking into account the nature of bilinear projections in TABL networks, we propose Bilinear Normalization (BiN), a simple yet efficient normalization layer to be incorporated into TABL networks to tackle potential problems posed by non-stationarity and multimodality in the input series. Our experiments using a large-scale Limit Order Book (LOB) dataset consisting of more than 4 million order events show that BiN-TABL outperforms TABL networks using other state-of-the-art normalization schemes by a large margin.
Forecasting the movements of stock prices is one of the most challenging problems in financial market analysis. We have proposed many Machine Learning and Deep Learning methodologies for the prediction of future price movements using limit order book data. We have evaluated existing handcrafted features based on the raw order book data, as well as features extracted by DL models.
We have proposed DL models for financial time-series classification that operate efficiently while remaining competitive with standard DL models for time-series analysis, which have an enormous number of parameters. A description of our new models can be found here.
The existing literature provides evidence that limit order book data can be used to predict short-term price movements in stock markets. We propose a new neural network architecture for predicting return jump arrivals in equity markets with high-frequency limit order book data. This new architecture, based on Convolutional Long Short-Term Memory with Attention, is introduced to apply time series representation learning with memory and to focus the prediction attention on the most important features to improve performance. The data set consists of order book data on five liquid U.S. stocks. The use of the attention mechanism makes it possible to analyze the importance of including limit order book data and other input variables. Using this mechanism, we provide evidence that limit order book data improves the performance of the proposed model in jump prediction, either clearly or marginally, depending on the underlying stock. This suggests that path-dependence in limit order book markets is a stock-specific feature. Moreover, we find that the proposed approach with an attention mechanism outperforms the multi-layer perceptron network as well as the convolutional neural network and the Long Short-Term Memory model.
The list provided in the following may be incomplete. The complete list of papers related to this topic can be found in the lists of journal papers and conference papers.
M. Magris and A. Iosifidis, “Approximate Bayes factors for unit root testing”, arXiv:2102.10048
N. Passalis, J. Kanniainen, M. Gabbouj, A. Iosifidis and A. Tefas, “Forecasting Financial Time Series using Robust Deep Adaptive Input Normalization”, Journal of Signal Processing Systems, accepted December 2020
E. Fons, P. Dawson, X.J. Zeng, J. Keane and A. Iosifidis, “Adaptive Weighting Scheme for Automatic Time-Series Data Augmentation”, arXiv:2102.08310
E. Fons, P. Dawson, X.J. Zeng, J. Keane and A. Iosifidis, “Augmenting transferred representations for stock classification”, IEEE International Conference on Acoustics, Speech and Signal Processing, Toronto, Ontario, Canada (online), 2021
E. Fons, P. Dawson, X.J. Zeng, J. Keane and A. Iosifidis, “Evaluating data augmentation for financial time series classification”, arXiv:2010.15111
D.T. Tran, J. Kanniainen, M. Gabbouj and A. Iosifidis, “Data Normalization for Bilinear Structures in High-Frequency Financial Time-Series”, International Conference on Pattern Recognition, Milan, Italy (online), 2020
M. Shabani and A. Iosifidis, “Low-Rank Temporal Attention-Augmented Bilinear Network for financial time-series forecasting”, IEEE Symposium Series on Computational Intelligence, Canberra, Australia (online), 2020
N. Passalis, A. Tefas, J. Kanniainen, M. Gabbouj and A. Iosifidis, “Temporal Logistic Neural Bag-of-Features for Financial Time series Forecasting leveraging Limit Order Book Data”, Pattern Recognition Letters, vol.136, 183-189, 2020
A. Ntakaris, J. Kanniainen, M. Gabbouj and A. Iosifidis, “Mid-price Prediction Based on Machine Learning Methods with Technical and Quantitative Indicators”, PLoS ONE, e0234107, DOI: 10.1371/journal.pone.0234107, 2020
A. Tsantekidis, N. Passalis, A. Tefas, J. Kanniainen, M. Gabbouj and A. Iosifidis, “Using Deep Learning for price prediction by exploiting stationary limit order book features”, Applied Soft Computing, accepted May 2020
N. Passalis, A. Tefas, J. Kanniainen, M. Gabbouj and A. Iosifidis, “Deep Adaptive Input Normalization for Time Series Forecasting”, IEEE Transactions on Neural Networks and Learning Systems, accepted 2019
N. Passalis, A. Tefas, J. Kanniainen, M. Gabbouj and A. Iosifidis, “Adaptive Normalization for Forecasting Limit Order Book Data using Convolutional Neural Networks”, IEEE International Conference on Acoustics, Speech and Signal Processing, Barcelona, Spain, 2020
A. Ntakaris, G. Mirone, J. Kanniainen, M. Gabbouj and A. Iosifidis, “Feature Engineering for Mid-Price Prediction with Deep Learning”, IEEE Access, accepted 2019
M. Makinen, J. Kanniainen, M. Gabbouj and A. Iosifidis, “Forecasting of Jump Arrivals in Stock Prices: New Attention-based Network Architecture using Limit Order Book Data”, Quantitative Finance, accepted 2019
A. Tsantekidis, P. Nousi, N. Passalis, A. Ntakaris, J. Kanniainen, A. Tefas, M. Gabbouj and A. Iosifidis, “Machine Learning for Forecasting Mid Price Movement using Limit Order Book Data”, IEEE Access, vol. 7, pp. 64722-64736, 2019
N. Passalis, A. Tefas, J. Kanniainen, M. Gabbouj and A. Iosifidis, “Temporal Bag-of-Features Learning for Predicting Mid Price Movements using High Frequency Limit Order Book Data”, IEEE Transactions on Emerging Topics in Computational Intelligence, (Early Access) DOI: 10.1109/TETCI.2018.2872598, 2019
D.T. Tran, A. Iosifidis, J. Kanniainen and M. Gabbouj, “Temporal Attention augmented Bilinear Network for Financial Time-Series Data Analysis”, IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 5, pp. 1407-1418, 2019
A. Ntakaris, M. Magris, J. Kanniainen, M. Gabbouj and A. Iosifidis, “Benchmark Dataset for Mid-Price Prediction of Limit Order Book data”, Journal of Forecasting, DOI:10.1002/for.2543, 2018
N. Passalis, A. Tefas, J. Kanniainen, M. Gabbouj and A. Iosifidis, “Deep Temporal Logistic Bag-of-Features for Forecasting High Frequency Limit Order Book Time Series”, IEEE International Conference on Acoustics, Speech, and Signal Processing, Brighton, U.K., 2019
D.T. Tran, J. Kanniainen, M. Gabbouj and A. Iosifidis, “Data-driven Neural Architecture Learning for Financial Time-series Forecasting”, Digital Image and Signal Processing, Oxford, U.K. 2019
D.T. Tran, M. Magris, J. Kanniainen, M. Gabbouj and A. Iosifidis, “Tensor Representation in High-Frequency Financial Data for Price Change Prediction”, IEEE Symposium Series on Computational Intelligence, Hawaii, USA, 2017
A. Tsantekidis, N. Passalis, A. Tefas, J. Kanniainen, M. Gabbouj and A. Iosifidis, “Using Deep Learning to Detect Price Change Indications in Financial Markets”, European Signal Processing Conference, Kos, Greece, 2017
N. Passalis, A. Tsantekidis, A. Tefas, J. Kanniainen, M. Gabbouj and A. Iosifidis, “Time-series Classification using Neural Bag-of-Features”, European Signal Processing Conference, Kos, Greece, 2017
A. Tsantekidis, N. Passalis, A. Tefas, J. Kanniainen, M. Gabbouj and A. Iosifidis, “Forecasting Stock Prices from the Limit Order Book using Convolutional Neural Networks”, IEEE Conference on Business Informatics, Thessaloniki, Greece, 2017