National University of Singapore

Department of Industrial Systems Engineering & Management

B.Eng(ISE) Independent Study Module (2022/2023 Semester II)

Machine Learning on Straits Times Index using Time Series and Ensemble

Tey Wee Wee

Abstract

The Covid-19 pandemic has drawn more attention to the stock market over the past few months. With more free time, many people are buying and selling stocks without any knowledge of the subject. Over the previous year, the number of sign-ups on trading and investing apps has grown dramatically. It makes sense to assume that the field of stock market forecasting has expanded in tandem. In particular, machine learning has been extensively studied for its potential and effectiveness in predicting the performance of stock markets. However, the Straits Times Index, a market capitalization-weighted stock index that tracks the performance of the top 30 firms listed on the Singapore Exchange, is poorly studied.


The goal of this project is to compare the accuracy of different machine learning approaches, namely traditional machine learning methods and deep learning, in forecasting the Straits Times Index close price. The project also explores two time series formulations: one that uses features and one that uses the historical close price of the Straits Times Index. Finally, a machine learning ensemble method is studied to observe its impact on predicting the close price of the Straits Times Index.
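To illustrate the time series formulation based on past close prices, the sketch below turns a price series into supervised learning pairs, where each input is a window of previous closes and the target is the next close. The function name `make_lagged_dataset`, the window length and the prices are assumptions made for illustration only; the abstract does not state how the windows were actually constructed.

```python
from typing import List, Sequence, Tuple


def make_lagged_dataset(
    closes: Sequence[float], n_lags: int
) -> Tuple[List[List[float]], List[float]]:
    """Build (previous n_lags closes, next close) pairs from a price series."""
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    inputs: List[List[float]] = []
    targets: List[float] = []
    # Start at index n_lags so that every window has a full history.
    for t in range(n_lags, len(closes)):
        inputs.append(list(closes[t - n_lags:t]))  # closes at t-n_lags .. t-1
        targets.append(closes[t])                  # close at t
    return inputs, targets


# Made-up prices for illustration; these are not actual STI data.
closes = [3200.0, 3210.5, 3198.2, 3225.0, 3240.1]
X, y = make_lagged_dataset(closes, n_lags=2)
```

Each row of `X` can then be passed to any regression model, with `y` as the value to predict. A feature-based variant would presumably window additional columns in the same way.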


The results of this study show that, among the time series models, Linear Regression with Time Series using Features performed the best, as it has the lowest Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) when forecasting the close price of the Straits Times Index. Further analysis shows that, in general, time series using features performs better than time series using the close price of the Straits Times Index. However, models that use time series with features perform worse than models that do not use time series. When the two best-performing models, linear regression and ridge regression without time series, were combined in a machine learning ensemble, the Mean Absolute Error improved markedly while the Root Mean Squared Error remained similar to that of the individual models. Hence, simpler models performed better in this study, and the ensemble noticeably improved forecasting accuracy as measured by the Mean Absolute Error.
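The two error metrics named above, and one common way of combining two models, can be sketched as follows. Averaging the predictions is an assumption made here for illustration; the abstract does not state how the ensemble combines linear regression and ridge regression. The helper names and all numbers are hypothetical, so the output says nothing about the study's actual results.

```python
import math
from typing import List, Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root Mean Squared Error: square root of the mean squared difference."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Error: mean of the absolute differences."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


def average_ensemble(*prediction_sets: Sequence[float]) -> List[float]:
    """Combine models by averaging their predictions point by point.

    Assumption: simple unweighted averaging, chosen only for illustration.
    """
    return [sum(preds) / len(preds) for preds in zip(*prediction_sets)]


# Made-up values; model_a and model_b stand in for two fitted regressors.
actual = [100.0, 102.0, 101.0]
model_a = [102.0, 103.0, 103.0]
model_b = [99.0, 102.0, 100.0]
ensemble = average_ensemble(model_a, model_b)

for name, preds in [("A", model_a), ("B", model_b), ("ensemble", ensemble)]:
    print(f"{name}: RMSE={rmse(actual, preds):.3f} MAE={mae(actual, preds):.3f}")
```

Because RMSE squares each error before averaging, it penalises large misses more heavily than MAE does, which is one reason the two metrics can move differently when models are combined.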


Beyond deep learning, time series and ensembles, other machine learning methods such as sentiment analysis can be applied to forecast the stock market index. In addition to quantitative data, the stock market index is heavily influenced by human sentiment and published news. Sentiment analysis and text mining have already been used to predict other stock market indices. Text mining of published news reports could therefore be added to the ensemble method, as combining quantitative and qualitative information might improve the accuracy of forecasting the Straits Times Index (STI).