There are many asset types someone could invest in. Asset classes traditionally include equities (stocks), fixed income (bonds), and cash equivalents or money market instruments. Modern financial professionals also include real estate, commodities, financial derivatives, and even cryptocurrencies. Each comes with different assumptions and structure, which is why people tend to specialize within a single asset class in their day jobs. Foreign exchange (FOREX) is one of the basic ways people move money around the world. Not all currencies are valued the same, and exchange rates change over time. If you exchange currency and its value increases, you realize a net gain; in the reverse case, a net loss. The US dollar can be exchanged for many different currencies, and certain events can influence the value of a national currency.
Inflation: A general increase in prices and fall in the purchasing value of money
Interest Rate: Increasing bank interest rates encourages saving, causing appreciation (the reverse is also true)
Speculation: Public opinion influences value
Competitiveness: If a country's goods increase in quality/value, so will its currency
Relative Strength: Currency value compared to other currencies
Balance of Payments: Imports vs. exports (deficit vs. surplus)
Government Debt: Overall national debt
Government Intervention: Price pegging, quantitative easing, etc.
Economic growth/recession: Swings in the national economy influence currency value
Distinguish events that drive change in currency value from other news headlines.
Predict price movement based on headline content
Trigger a buy or sell order based on the prediction
A Million News Headlines: Collected by the Australian Broadcasting Corporation, this dataset spans 2003-2019. The dataset includes financial and non-financial headlines. The headlines have been partially preprocessed for consumption, include news from outside Australia, and are all in English. This file is in CSV format.
Foreign Exchange Rates: A table of 22 different national currency exchange rates with the US dollar, daily from 2000-2019. This file is in CSV format.
I may add additional news sources as the project progresses.
There are a few major events that have contributed to the rise/fall of the USD.
2002-2007: The USD lost 40% of its value as US debt increased 60% over that period.
2008: The dollar spiked in value during the financial crisis as businesses held vast sums of the currency.
2016: Brexit fears impacted the euro and other regional currencies.
2018: The US initiated a trade war with several countries, creating volatility for investors and economic turmoil.
This chart shows the average percentage change in price on a monthly basis. Using the percentage change of the individual currency time series, I can extract dates of interest. I used an absolute change of 5% over 10 days to extract dates from the FOREX table. As a currency increases in value relative to the USD, its exchange rate falls. For example, suppose I take a $1 position in Korean won at 1,050 won per dollar, so I hold 1,050 won. If the rate moves to 1,000 won per dollar, my 1,050 won is now worth $1.05, an increase of 5%.
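A minimal sketch of how those dates of interest could be extracted with pandas. The file name and the `date` column are assumptions about the FOREX table's layout, not the exact code used in the project; the 5% threshold over 10 days matches the rule described above.

```python
import pandas as pd

# Assumed layout: one row per day, one column per currency's exchange rate vs. USD.
fx = pd.read_csv("Foreign_Exchange_Rates.csv", parse_dates=["date"], index_col="date")

# Percentage change over a 10-day window for every currency.
pct_10d = fx.pct_change(periods=10)

# Keep dates where any currency moved more than 5% in absolute terms.
dates_of_interest = pct_10d[(pct_10d.abs() > 0.05).any(axis=1)].index
print(dates_of_interest)
```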
Many of these currencies are highly correlated (> 0.70); they tend to move together relative to the USD. This is an important observation when placing orders: the same order may yield a similar result using a different currency.
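One way that co-movement could be checked, assuming the same file and column layout as the previous sketch:

```python
import pandas as pd

fx = pd.read_csv("Foreign_Exchange_Rates.csv", parse_dates=["date"], index_col="date")

# Daily returns, then pairwise correlation between the currency columns.
returns = fx.pct_change().dropna()
corr = returns.corr()

# Currency pairs whose daily moves are strongly correlated (> 0.70), self-pairs excluded.
pairs = corr.stack()
print(pairs[(pairs > 0.70) & (pairs < 1.0)].sort_values(ascending=False))
```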
Using the dates with the most movement, we can compare the presence of words to the price movement on a given date and use that as a basis for modeling headlines against price.
Additionally, sentiment scores were assigned to each headline using the Valence Aware Dictionary and sEntiment Reasoner (VADER). "VADER is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media." VADER scores of the headlines were averaged on a monthly and yearly basis. The yearly sentiment did not appear to coincide with the change in average price; the relationship grows stronger when analyzed at a finer granularity.
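A rough sketch of that scoring step, assuming the vaderSentiment package and a headline file with `publish_date`/`headline_text` columns (file and column names are assumptions):

```python
import pandas as pd
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

headlines = pd.read_csv("abcnews-date-text.csv")  # columns assumed: publish_date, headline_text
headlines["publish_date"] = pd.to_datetime(headlines["publish_date"].astype(str), format="%Y%m%d")

analyzer = SentimentIntensityAnalyzer()
# VADER's compound score ranges from -1 (most negative) to +1 (most positive).
headlines["compound"] = headlines["headline_text"].apply(
    lambda text: analyzer.polarity_scores(text)["compound"]
)

# Average sentiment per month and per year.
monthly = headlines.set_index("publish_date")["compound"].resample("M").mean()
yearly = headlines.set_index("publish_date")["compound"].resample("Y").mean()
```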
For my first model I used linear regression, feeding it the monthly average sentiment scores to predict the price change. I trained on data from 2004-2015; the test results shown on the graph are from 2016-2019. At first glance the algorithm performed poorly, but the regression did perform better than a random walk. More promising, the algorithm picks up on the directional movement of the price. Logistic regression was also tried for classification but performed significantly worse.
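A sketch of that first model. The `monthly` sentiment series comes from the previous sketch; `fx_monthly_change` (the monthly change in the average exchange rate) and the column names are assumptions, and the split mirrors the 2004-2015 train / 2016-2019 test periods described above.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Assumed inputs: 'monthly' (average VADER compound score per month) and
# 'fx_monthly_change' (monthly percentage change in the average exchange rate),
# both Series indexed by month.
monthly_df = pd.concat(
    {"avg_sentiment": monthly, "price_change": fx_monthly_change}, axis=1
).dropna()

train = monthly_df.loc["2004":"2015"]
test = monthly_df.loc["2016":"2019"]

reg = LinearRegression().fit(train[["avg_sentiment"]], train["price_change"])
pred = reg.predict(test[["avg_sentiment"]])

print("MSE:", mean_squared_error(test["price_change"], pred))
# Directional hit rate: did the predicted sign match the actual sign of the move?
print("Directional accuracy:", ((pred > 0) == (test["price_change"] > 0)).mean())
```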
To really test how the algorithm would perform, I needed to run simulated buys/sells against the actual prices. Based on the regression's predictions, I assigned buy orders to troughs and sell orders to peaks. The return climbs to just under 10% in 20 months, then sharply drops off. For context, the average stock market index return is about 10% a year. There is an apparent relationship between sentiment and price, but more features and better modeling are required to improve performance.
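A toy version of that simulation. The local trough/peak rule and the names here are my own simplifications, not the project's exact backtest.

```python
def simulate_trades(predicted, actual_prices):
    """Buy where the prediction is a local trough, sell where it is a local peak.

    Holds one unit between a buy and the next sell and accumulates the
    realized return against the actual price series.
    """
    total_return = 0.0
    entry = None
    for i in range(1, len(predicted) - 1):
        is_trough = predicted[i] < predicted[i - 1] and predicted[i] < predicted[i + 1]
        is_peak = predicted[i] > predicted[i - 1] and predicted[i] > predicted[i + 1]
        if is_trough and entry is None:
            entry = actual_prices[i]                               # buy
        elif is_peak and entry is not None:
            total_return += (actual_prices[i] - entry) / entry     # sell, realize return
            entry = None
    return total_return

# Example usage with the regression predictions and the actual monthly prices
# (the 'avg_price' column is hypothetical):
# print(simulate_trades(pred, test["avg_price"].to_numpy()))
```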
Predict the direction of the next day's change in exchange rate. Ultimately, predictions will be converted into buy/sell/hold positions. I would like to beat the 70.59% directional accuracy achieved using text features (Sha, Isah, Zulkernine).
The first dataset is foreign exchange data for 22 currencies from 2000-2019; the second is one million headlines from 2003-2019. The data is aggregated and inner joined. During my phase 2 linear regression modeling I rolled the data up to a monthly level, which ended up being too cumbersome for deep learning. Reducing the length of the text sequences dramatically improved performance. I also suspect that the text and sentiment scores tend to lose meaning as they are clumped together. The last two columns are target values for classification and regression, respectively.
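A sketch of how the daily aggregation, inner join, and target columns could be built; the file names and the `avg_rate` column are assumptions.

```python
import pandas as pd

fx = pd.read_csv("Foreign_Exchange_Rates.csv", parse_dates=["date"])
headlines = pd.read_csv("abcnews-date-text.csv")
headlines["date"] = pd.to_datetime(headlines["publish_date"].astype(str), format="%Y%m%d")

# Collapse all headlines for a day into one text blob so each row is a single day.
daily_text = (headlines.groupby("date")["headline_text"]
              .apply(" ".join)
              .reset_index())

# Inner join keeps only days present in both datasets (market days with headlines).
merged = daily_text.merge(fx, on="date", how="inner").sort_values("date")

# Targets: next-day change (regression) and its direction (classification).
merged["next_change"] = merged["avg_rate"].pct_change().shift(-1)   # 'avg_rate' assumed
merged["direction"] = (merged["next_change"] > 0).astype(int)
merged = merged.dropna(subset=["next_change"])
```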
Linear SGD: “This estimator implements regularized linear models with stochastic gradient descent (SGD) learning: the gradient of the loss is estimated each sample at a time and the model is updated along the way with a decreasing strength schedule (aka learning rate).”
ADA Boosted Decision Tree: “Decision Trees (DTs) are a non-parametric supervised learning method used for classification and regression. The goal is to create a model that predicts the value of a target variable by learning simple decision rules inferred from the data features.”
“An AdaBoost [1] classifier is a meta-estimator that begins by fitting a classifier on the original dataset and then fits additional copies of the classifier on the same dataset but where the weights of incorrectly classified instances are adjusted such that subsequent classifiers focus more on difficult cases.”
Linear Regression: “Linear Regression fits a linear model with coefficients w = (w1, …, wp) to minimize the residual sum of squares between the observed targets in the dataset, and the targets predicted by the linear approximation.”
https://scikit-learn.org/stable/user_guide.html
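The three scikit-learn estimators quoted above could be instantiated roughly as follows; the hyperparameters are placeholders, not the values actually used in the project.

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LinearRegression, SGDClassifier

# Regularized linear model trained with stochastic gradient descent (classification variant).
sgd = SGDClassifier(max_iter=1000, tol=1e-3)

# AdaBoost meta-estimator; its default base estimator is a depth-1 decision tree.
ada = AdaBoostClassifier(n_estimators=100)

# Ordinary least squares for the regression target (next-day price change).
lin = LinearRegression()

# Each model is then fitted on the joined features with estimator.fit(X_train, y_train).
```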
Implemented using the Keras functional API. This is a regression model with two input layers: the first input is the tokenized headlines; the second is the sentiment score and the average foreign exchange rate for that day. The two inputs are passed to two separate LSTM layers, then merged and passed through two activation layers, predicting the change in foreign exchange rate for the next day (train_Y, test_Y). The number of additional inputs can be adjusted by changing the second axis of the “all” input layer.
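A minimal sketch of that two-input model with the Keras functional API; the vocabulary size, sequence length, and layer widths are assumptions rather than the project's exact configuration.

```python
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense, concatenate
from tensorflow.keras.models import Model

vocab_size = 10000   # assumed tokenizer vocabulary size
seq_len = 20         # assumed maximum headline length in tokens
n_extra = 2          # sentiment score + average exchange rate of the day

# Input 1: tokenized headline sequence.
text_in = Input(shape=(seq_len,), name="headlines")
x = Embedding(input_dim=vocab_size, output_dim=64)(text_in)
x = LSTM(32)(x)

# Input 2: the numeric features, given a single timestep so they can feed their own LSTM.
# Changing the second axis (n_extra) changes how many additional inputs are used.
all_in = Input(shape=(1, n_extra), name="all")
y = LSTM(8)(all_in)

# Merge the two branches and regress the next-day change in exchange rate.
merged = concatenate([x, y])
merged = Dense(16, activation="relu")(merged)
out = Dense(1, activation="linear")(merged)

model = Model(inputs=[text_in, all_in], outputs=out)
model.compile(optimizer="adam", loss="mse")
model.summary()
```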
While it’s hard to compare results with different inputs, that doesn’t really matter when both the classification and regression models perform poorly. The problem is heavily weighted toward binary classification (50% is guessing). Late in the semester I discovered I was feeding the model shuffled data, due to default parameters in train_test_split(). This left me little time to rebuild my deep learning models, which were originally performing at 76% accuracy. After retraining my models I kept running into overfitting or a failure to learn features, depending on the inputs.
In Dark Blue:
Y=Price Difference between last peak and trough
X=Number of days between peak and trough (2 days)
In Green:
Prediction=(Current change in price)+((Inverse of Y)/(X days))
The idea behind this is pretty simple. Equities and FOREX tend to jump back and forth in the short term. The model takes the change in price or percentage change as an input, looks backwards, and determines the X and Y distance between the nearest confirmed trough and peak. Naturally, those peak and trough lists do not include the item you are trying to predict. Peaks and troughs are assumed when the prior element is below/above them, which is why the model performs better on single-day movements than over multiple days.
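A sketch of the heuristic. The confirmation rule here checks both neighbours of a candidate point, which is a slightly stricter test than the prior-element rule described above, and all names are mine.

```python
import numpy as np

def predict_next_change(changes):
    """Predict tomorrow's change from a history of daily price changes.

    Looks backwards for the nearest confirmed trough and peak, then assumes the
    series swings back:
        prediction = current_change + (-Y / X)
    where Y is the difference between the last peak and trough and X is the
    number of days separating them.
    """
    changes = np.asarray(changes, dtype=float)
    peak_idx = trough_idx = None
    # Walk backwards; skip the final element, it cannot be confirmed yet.
    for i in range(len(changes) - 2, 0, -1):
        if peak_idx is None and changes[i] > changes[i - 1] and changes[i] > changes[i + 1]:
            peak_idx = i
        if trough_idx is None and changes[i] < changes[i - 1] and changes[i] < changes[i + 1]:
            trough_idx = i
        if peak_idx is not None and trough_idx is not None:
            break
    if peak_idx is None or trough_idx is None:
        return changes[-1]  # not enough structure yet; fall back to persistence
    y = changes[peak_idx] - changes[trough_idx]
    x = abs(peak_idx - trough_idx)
    return changes[-1] + (-y / x)

# Toy usage on a short series of daily changes:
print(predict_next_change([0.01, -0.02, 0.015, -0.01, 0.02]))
```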
MSE = 0.1369
I did not end up getting accurate results using text features, which I found disappointing. In the future, I would like to be able to do accurate regression using text features. In the end, I was able to obtain accurate regressions from a custom model using only the foreign exchange data, and I plan to test that model on live data soon. Overall, the challenge was greater than I expected when I started the semester. Given that I had to pivot very late in the semester, I am satisfied with the result.
Train my RNN on the days with the most movement instead of the whole dataset
Used a smaller vocabulary to assign weights
Analyzed bigrams and N-gram combinations
Texts should be domain specific.
Check the default parameters of any libraries you use. I had a major setback from not realizing my time series data was out of order because of train_test_split(shuffle=True); see the sketch after this list.
Not all models are transferable between problems; the code may be compatible, but the approach may not be.
Text regression over a time series is very different from text classification.
Don’t be afraid to step outside the box.
Validate, then do it again!
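As an example of the train_test_split lesson above, a time-series split should disable shuffling explicitly so the test set stays strictly after the training set:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(-1, 1)   # stand-in for time-ordered features
y = np.arange(100)                  # stand-in for time-ordered targets

# Default behaviour: shuffle=True, which destroys the temporal ordering.
X_tr_bad, X_te_bad, y_tr_bad, y_te_bad = train_test_split(X, y, test_size=0.2)

# For time series, disable shuffling so the split respects chronological order.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, shuffle=False)
```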
“Twitter mood predicts the stock market” (Bollen, Mao, Zeng): Compares the Dow Jones to Twitter data. The analysis shows a predictive correlation between the Google-Profile of Mood States dimension “Calm” and Dow Jones behavior four days later. The authors admitted the ground truth of tweets could not be measured, and geographic factors should also be considered. Key Takeaway: Sentiment analysis results can be ambiguous at scale.
https://arxiv.org/pdf/1010.3003.pdf
“Using NLP on news headlines to predict index trends” (Velay, Daniel): This paper compares the Dow Jones Industrial Average to the top 25 headlines of each open market day. They tried several deep learning and standard learning algorithms, the best being logistic regression at 57% accuracy (50% would be random guessing). Key Takeaway: Just running the data through an algorithm isn’t going to work.
https://arxiv.org/pdf/1806.09533.pdf
“Impacts of Public News on Stock Market Prices: Evidence from S&P 500” (Ormos, Vasonyi): Collected publications and isolated days of interest based on stock movement, then isolated the most frequent positive and negative nouns and adjectives. Key Takeaway: Days of interest can isolate large sets of NLP data.
“Predicting the Effects of News Sentiments on the Stock Market” (Sha, Isah, Zulkernine): This team manually isolated words of interest in a dictionary and assigned sentiment scores to each using domain expertise. The dictionary model assigned scores and made decisions. It was able to achieve directional accuracy of 70.59%. Key Takeaway: Narrowing valuable input manually early on will improve performance. Simple approach, good result.