Disclaimer: Past performance is no guarantee of future results. I am not a licensed professional and these opinions do not reflect the opinion of my company. This project is intended more for academic purposes rather than guidance.
A global pandemic, protests, an election, and ultimately, a time of uncertainty. No one knows how long it will take the world to develop a vaccine, and until then, our financial markets are likely to remain unstable. What are the lucrative investment opportunities once a vaccine is created, and in the year that follows?
What are the current market conditions? How has the Coronavirus impacted stock prices and why is investing now such a troubling time?
There are many companies researching Coronavirus vaccines, could investing in these companies be the most prudent option?
We are in the middle of a textbook definition of a recession (two consecutive quarters of GDP decline). Do certain stocks perform better following a recession?
Can machine learning be used to separate the outperforming stocks from the underperforming stocks? If so, what are the characteristics that separate winners from losers?
Daily New U.S. Covid-19 Confirmed Cases: Provided by the CDC [Used in figure 1].
S&P 500 Daily Closing Prices (12/01/2019 - 6/26/2020): Provided by Yahoo Finance [Used in figure 1 & 2].
S&P 500 Daily Closing Prices (12/31/1967 - 12/31/1968): Inspired by a Bloomberg article [Used in figure 2].
Mosaic Dataset of 2008 - 2018 Russell 3000 Index Constituents: Provided by Bloomberg Terminal subscription, but purely for analysis. None of the underlying data is made accessible in this project. The Russell 3000 Index tracks the performance of approximately 98% of all U.S. incorporated equity securities (https://www.investopedia.com/terms/r/russell_3000.asp ). [Used in model construction].
Efficient Frontier Portfolio: A portfolio of all the predicted winners, weighted to find the highest return for the least amount of risk. Built from daily closing prices provided by Yahoo Finance [Used in Deliverable 3].
(Fig. 1) Year-to-Date S&P 500 Daily Prices vs. New U.S. Covid-19 Daily Cases - These animated charts show their respective changes from 12/01/2019 up to 6/26/2020. The trough of the S&P 500 index comes 10 days before the peak of new daily Covid-19 cases in the United States. Moreover, as a new surge in cases appears in late June, the stock market has also taken a dip. The visual makes a strong case that the two could be negatively correlated.
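One way to put a number on the "negatively correlated" impression from the chart is a Pearson correlation coefficient. The sketch below is only an illustration: the function name and the toy inputs are my own assumptions, not the project's code or data. In practice it is usually safer to correlate daily changes rather than raw levels, since two trending series can look correlated by accident.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (>= 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


# Made-up daily changes, for illustration only (not CDC or Yahoo data).
index_changes = [-0.030, -0.012, 0.004, 0.015, 0.021]
case_changes = [2500, 1800, 300, -400, -900]
print(round(pearson(index_changes, case_changes), 3))
```

A value near -1 would support the visual impression, a value near 0 would not.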
(Fig. 2) The stock market doesn't care. Better yet, it never has - The top visual again documents the S&P 500 Index's daily closing prices, but it also annotates significant world events: namely, the Coronavirus, protests, and space activity. Surely most of these events should negatively impact stock prices? Surprisingly, the results show a quick V-shaped recovery from the 3/23/2020 trough. 2020 is not an outlier. If we look at an eerily similar year (1968), we can observe some common themes.
(1968) Stemming from Hong Kong, the H3N2 Pandemic Flu killed 1 million people worldwide and 100,000 Americans vs. Covid-19
(1955 - 1975) Vietnam War vs. recent U.S. tensions with China and Iran
(4/4/1968) MLK was assassinated, leading to protests and riots. Additionally, students all over the world gathered to protest the Vietnam War vs. the protests against police brutality that occurred in every state
(10/11/1968) Apollo 7 mission vs. (5/30/2020) SpaceX launch
Not only are these events similar in type, they also seem to have had minimal negative impact on S&P 500 performance.
Public Data - How is an individual investor to know which company to pick based on public information?
Barrier to Entry - Because this is a global pandemic, institutions across the world have entered this arms race. There is no straightforward approach if the company is overseas, although buying ADRs could be a way around this. Moreover, what if it's a private company? Most individual investors can only buy public shares.
Efficient Market Hypothesis - The belief that share prices reflect all information and that consistently beating the market is impossible. For example, if a company reports vaccine progress, its stock price surges. If a company's stock price is already up 2,000% (NVAX's approximate YTD performance as of 6/30/2020), how much more do you think it's going to go up?
Government Intervention - A Covid-19 vaccine would presumably be very profitable for a company. However, it's in the world's best interest to make this vaccine accessible, meaning the vaccine cannot be too expensive. While governments are not always the best at this, it is possible they may step in and limit the maximum price of the vaccine.
Revenues Impact on Share Price - Just because a company creates a drug or vaccine that generates a lot of revenue, that doesn't mean its share price is going to change by the same magnitude or direction.
While a drug is not the same as a vaccine, there are a handful of examples where a pharmaceutical company developed a blockbuster drug that generated more than a billion dollars for that company. I was curious how this revenue growth impacted each company's share price, and imagined it could be a good comparison for how creating a Coronavirus vaccine may impact a company's share price.
Countless hours were invested in finding fundamental data points about companies in the Russell 3000 index that could glean insights into what factors contribute to good investments. While some initial thoughts and strategies were useful, they did not produce reliable data.
In 2002, Joseph D. Piotroski published a paper for The University of Chicago Graduate School of Business titled "Value Investing: The Use of Historical Financial Statement Information to Separate Winners from Losers". Piotroski created a list of 9 criteria, awarding a company one point for each criterion where it showed signs of strength and no points where it did not meet the requirement. He believed that using this methodology as a screener could help determine which companies would be good long-term investments.
Positive Net Income (1 point)
Positive return on assets in the current year (1 point)
Positive operating cash flow in the current year (1 point)
Cash flow from operations greater than net income (quality of earnings) (1 point)
Lower ratio of long term debt in the current period, compared to the previous year (decreased leverage) (1 point)
Higher current ratio this year compared to the previous year (more liquidity) (1 point)
No new shares were issued in the last year (lack of dilution) (1 point)
A higher gross margin compared to the previous year (1 point)
A higher asset turnover ratio compared to the previous year (1 point)
(Chen, James. 5 Feb. 2020, www.investopedia.com/terms/p/piotroski-score.asp.)
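The nine checks above translate fairly directly into code. The following is a minimal sketch, not the screener used in this project: the Fundamentals field names are hypothetical (they are not Bloomberg field names), and the ratios use simple year-end figures, whereas other write-ups of the score sometimes use average or beginning-of-year assets.

```python
from dataclasses import dataclass


@dataclass
class Fundamentals:
    """One fiscal year of inputs. Field names are illustrative assumptions."""
    net_income: float
    operating_cash_flow: float
    total_assets: float
    long_term_debt: float
    current_assets: float
    current_liabilities: float
    shares_outstanding: float
    gross_profit: float
    revenue: float


def f_score(cur: Fundamentals, prev: Fundamentals) -> int:
    """Award one point per criterion met, for a total between 0 and 9."""
    checks = [
        cur.net_income > 0,                           # positive net income
        cur.net_income / cur.total_assets > 0,        # positive return on assets
        cur.operating_cash_flow > 0,                  # positive operating cash flow
        cur.operating_cash_flow > cur.net_income,     # quality of earnings
        cur.long_term_debt / cur.total_assets
        < prev.long_term_debt / prev.total_assets,    # decreased leverage
        cur.current_assets / cur.current_liabilities
        > prev.current_assets / prev.current_liabilities,  # more liquidity
        cur.shares_outstanding <= prev.shares_outstanding,  # no dilution
        cur.gross_profit / cur.revenue
        > prev.gross_profit / prev.revenue,           # higher gross margin
        cur.revenue / cur.total_assets
        > prev.revenue / prev.total_assets,           # higher asset turnover
    ]
    return sum(checks)


# Made-up figures for a single company, for illustration only.
prev_year = Fundamentals(80, 90, 1000, 300, 200, 150, 100, 350, 1000)
cur_year = Fundamentals(100, 130, 1050, 280, 240, 150, 100, 400, 1100)
print(f_score(cur_year, prev_year))  # prints 9
```

Keeping the checks in one list makes it easy to see which criterion a company failed, for example by zipping the list with criterion names.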
Piotroski believed that if a company had a score of 8 or higher, then that company would be an excellent value opportunity going into the following year. Moreover, his study showed that "an investment strategy that buys expected winners and shorts expected losers generates a 23% annual return between 1976 and 1996" (Piotroski, 2002).
While Piotroski's paper, quant forums, and articles all point to this being a robust algorithm that works across time, this had to be tested. Analyzing quarterly data during the Great Recession (late 2007 - early 2009) and the following 6 quarters, I compiled a dataset that measured how well this hypothesis worked during a recession, after a recession, and across both periods combined.
(Fig. 3) Performance by Piotroski's F-Score During Great Recession (2008) - Companies with an F-score of 9 demonstrated a 59% likelihood of outperforming other Russell 3000 Index constituents in the following year.
(Fig. 4) Performance by Piotroski's F-Score, Post Great Recession - At 67%, companies with an F-score of 9 after the Great Recession demonstrated an even stronger likelihood of outperforming other Russell 3000 Index constituents in the following year. Less significantly, companies that scored an 8 outperformed 54% of the time in the subsequent year.
(Fig. 5) Performance by Piotroski's F-Score (2007Q4 - 2010Q4) - Analyzing both datasets together, companies with an F-score of 9 demonstrated a 61% likelihood of outperforming other Russell 3000 Index constituents in the following year. Overall, finding companies that score a 9 seems like a reliable approach.
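The percentages in Figs. 3 - 5 are outperformance rates grouped by F-score. A rough sketch of that calculation is below. It assumes "outperforming" means a stock's following-year return beat a benchmark return, which is my simplification, since the exact definition is not spelled out here. The function name and sample rows are invented for illustration.

```python
from collections import defaultdict


def outperform_rate_by_score(rows):
    """rows: iterable of (f_score, stock_return, benchmark_return) tuples,
    where both returns cover the year after the score was measured."""
    wins = defaultdict(int)
    totals = defaultdict(int)
    for score, stock_return, benchmark_return in rows:
        totals[score] += 1
        if stock_return > benchmark_return:
            wins[score] += 1
    # Share of companies at each score that beat the benchmark.
    return {score: wins[score] / totals[score] for score in sorted(totals)}


# Invented rows, for illustration only (not Bloomberg data).
sample_rows = [
    (9, 0.20, 0.10),
    (9, 0.05, 0.10),
    (9, 0.30, 0.10),
    (8, 0.12, 0.10),
    (8, 0.01, 0.10),
]
print(outperform_rate_by_score(sample_rows))
```

With real data, the group sizes matter as much as the rates: a 67% rate from a handful of companies is much weaker evidence than the same rate from hundreds.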
The screener itself provided useful insights, but we can build on it in three ways: one, by finding new significant variables; two, by employing machine learning techniques in hopes of finding additional rules for successful companies; and three, by mitigating risk.
Of the approaches tried (AutoML, decision trees, SVM, neural networks, and random forests), random forests yielded the most accurate and explainable results. In the worst cases, the model predicted outperformers correctly ~50% of the time; in the best cases, ~64% of the time. The top middle figure shows a sample of one of the trees. Random forest is a supervised learning algorithm. The "forest" it builds is an ensemble of decision trees, usually trained with the "bagging" method. The general idea of bagging is that combining many learning models improves the overall result. The images below are taken from one run on the training dataset.
(Fig 6. Top Middle) shows a random tree from the model. This is a glimpse into how the random forest model makes its predictions.
(Fig 7. Bottom Left) shows the importance level of each variable in predicting the outcome.
(Fig 8. Bottom Right) shows the additional explanation of predictions with each variable appended. The most significant are those starting at the left.
Book Value per Share and Book Value to Market Value, in every run, scored the highest in importance to the model.
Book Value per Share and Book Value to Market Value explained almost 80% of the model in this run.
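To make the bagging idea more concrete, here is a toy sketch in plain Python. It is not the model used in this project: it bags one-split "decision stumps" rather than full trees, and it skips the random feature sampling that a real random forest adds at each split. The two features stand in for Book Value per Share and Book Value to Market Value, and the training rows are invented.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Pick the single feature/threshold split with the fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
            if best is None or errors < best[0]:
                best = (errors, j, t, l_lab, r_lab)
    if best is None:  # no valid split (all rows identical): predict the majority
        majority = Counter(y).most_common(1)[0][0]
        return (0, float("inf"), majority, majority)
    return best[1:]


def predict_stump(stump, row):
    j, t, l_lab, r_lab = stump
    return l_lab if row[j] <= t else r_lab


def fit_bagged(X, y, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest


def predict_bagged(forest, row):
    """Majority vote across all stumps."""
    return Counter(predict_stump(s, row) for s in forest).most_common(1)[0][0]


# Invented rows: [book value per share, book-to-market]; 1 = outperformer.
X_toy = [[1.0, 0.2], [2.0, 0.3], [3.0, 0.4], [8.0, 0.9], [9.0, 1.1], [10.0, 1.2]]
y_toy = [0, 0, 0, 1, 1, 1]
toy_forest = fit_bagged(X_toy, y_toy)
print(predict_bagged(toy_forest, [11.0, 1.5]))
```

Each stump sees a slightly different resample of the data, so individual stumps disagree, and the vote smooths out their individual mistakes.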
With the law of large numbers in mind, one may think that guessing correctly more than 50% of the time would be sufficient to make a killing over time. Paraphrasing the words of a classmate, "Any [wrong] company's stock you pick, is working against you and hurting you." Therefore, one can mitigate risk by building a portfolio that maximizes expected return for a defined level of risk. You can think of this as a portfolio that offers you the best bang for your buck.
(Fig 9. Above) represents all the securities from 6/30/2020 that scored a perfect 9 Piotroski F-Score. MWA is highlighted in orange because the random forest predicts this security will underperform, so it is not included in portfolio construction. (Fig 10. Right) visualizes a simulation of 2,500 different portfolios and marks the allocation that creates a Max Sharpe Ratio (MSR) portfolio of the predicted winners with a red star. The green star designates the Global Minimum Variance (GMV) portfolio, which is still on the efficient frontier but carries less risk.
After selecting all 16 predicted winners, we can run simulations of different portfolio weightings and find the most efficient portfolios: those that give us the highest expected return with the least additional volatility (risk).
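The simulation described above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: it draws random long-only weights, annualizes with 252 trading days, and uses a risk-free rate of 0 by default. The function names and the three-asset inputs are invented. Note that the "GMV" found this way is only the lowest-volatility portfolio among the random draws, so it approximates the true minimum-variance portfolio.

```python
import math
import random

TRADING_DAYS = 252


def portfolio_stats(weights, mean_daily, cov_daily):
    """Annualized expected return and volatility for one set of weights."""
    n = len(weights)
    annual_return = sum(w * m for w, m in zip(weights, mean_daily)) * TRADING_DAYS
    daily_variance = sum(
        weights[i] * cov_daily[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    return annual_return, math.sqrt(daily_variance * TRADING_DAYS)


def simulate_portfolios(mean_daily, cov_daily, n_portfolios=2500, risk_free=0.0, seed=0):
    """Try random weightings; return the max-Sharpe and min-volatility draws."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_portfolios):
        raw = [rng.random() for _ in mean_daily]
        total = sum(raw)
        weights = [r / total for r in raw]  # long-only, sums to 1
        ret, vol = portfolio_stats(weights, mean_daily, cov_daily)
        results.append({
            "weights": weights,
            "return": ret,
            "volatility": vol,
            "sharpe": (ret - risk_free) / vol,
        })
    msr = max(results, key=lambda p: p["sharpe"])
    gmv = min(results, key=lambda p: p["volatility"])
    return msr, gmv


# Invented daily means and covariances for three assets, for illustration only.
toy_means = [0.0004, 0.0006, 0.0008]
toy_cov = [
    [0.00010, 0.00002, 0.00001],
    [0.00002, 0.00020, 0.00003],
    [0.00001, 0.00003, 0.00040],
]
msr, gmv = simulate_portfolios(toy_means, toy_cov)
print([round(w, 3) for w in msr["weights"]], round(msr["sharpe"], 3))
print([round(w, 3) for w in gmv["weights"]], round(gmv["volatility"], 3))
```

With real data, the mean and covariance inputs would be estimated from the daily closing prices of the selected stocks.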
With the 16 value companies identified, three portfolios were created: one that was equally weighted (gold), one that was weighted according to the Global Minimum Variance (red), and one that was weighted according to the Max Sharpe Ratio. These allocations were backtested to measure their cumulative performance from June 2010 - June 2020.
It is noteworthy that an investor could have outperformed an equally weighted portfolio and still mitigated risk. Moreover, an investor could have achieved significant returns and limited their risk by allocating according to the MSR.
The MSR portfolio appears to be the best 'bang for your buck' portfolio. Factoring in the portfolio's mean return, standard deviation, and 252 trading days in a year, we can run 100 scenarios of how the portfolio may react.
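A scenario run like this can be sketched as a simple Monte Carlo simulation. The version below assumes daily returns are independent draws from a normal distribution with the portfolio's daily mean and standard deviation. That distributional choice is my assumption for illustration, and the function name and inputs are invented.

```python
import random


def simulate_paths(mean_daily, std_daily, n_scenarios=100, n_days=252,
                   start_value=1.0, seed=0):
    """Return n_scenarios value paths, each compounding n_days of random returns."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_scenarios):
        value = start_value
        path = [value]
        for _ in range(n_days):
            value *= 1.0 + rng.gauss(mean_daily, std_daily)
            path.append(value)
        paths.append(path)
    return paths


# Invented daily mean and standard deviation, for illustration only.
toy_paths = simulate_paths(mean_daily=0.0005, std_daily=0.01)
final_values = sorted(path[-1] for path in toy_paths)
print("worst:", round(final_values[0], 3), "best:", round(final_values[-1], 3))
```

Looking at the spread of final values, rather than any single path, gives a feel for the range of outcomes a year might hold. Real returns tend to have fatter tails than a normal distribution, so this likely understates extreme moves.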
Evaluate the impact of survivorship bias on the model. Only rows (companies) that had fundamentals for every variable were included. Could this have misrepresented my training and test datasets?
The F-Score was a significant contributor to choosing investments. However, it would be helpful to include more variables so I am not relying as heavily on this one screener. Additional variables could lead to more machine learning opportunities.
The scope of this project was analyzing winners by one year performance. A deeper dive into how
The MSR appears to be the most attractive portfolio, but I would be curious to see how MSR portfolios perform in the long run. The GMV may be a better asset allocation if the investor is looking to hold these securities for a longer period.
Bloomberg Terminal (for research purposes, any underlying data is not available on this website. Manipulated and de-identified data is available on github). Querying data to test Piotroski F-score Screener.
Chen, James. “Learn What a Piotroski Score Is.” Investopedia, Investopedia, 5 Feb. 2020, www.investopedia.com/terms/p/piotroski-score.asp. Define the Piotroski F-Score Variables.
Donges, Niklas. “A Complete Guide to the Random Forest Algorithm.” Built In, 2020, builtin.com/data-science/random-forest-algorithm. Improve on Random Forest Development.
Kim, Ricky. “Efficient Frontier Portfolio Optimisation in Python.” Medium, Towards Data Science, 11 Jan. 2019, towardsdatascience.com/efficient-frontier-portfolio-optimisation-in-python-e7844051e7f. Construct an optimal portfolio of stocks.
Koehrsen, Will. “How to Visualize a Decision Tree from a Random Forest in Python Using Scikit-Learn.” Medium, Towards Data Science, 19 Aug. 2018, towardsdatascience.com/how-to-visualize-a-decision-tree-from-a-random-forest-in-python-using-scikit-learn-38ad2d75f21c. Improve on Random Forest.
Pathak, Manish. “TPOT in Python.” DataCamp Community, 2018, www.datacamp.com/community/tutorials/tpot-machine-learning-python. Use AutoML to uncover new algorithms and establish a baseline for model performance.
Piotroski, Joseph D. “Value Investing: The Use of Historical Financial Statement Information to Separate Winners from Losers.” Journal of Accounting Research, vol. 38, 2000, p. 1., doi:10.2307/2672906. Learn and adopt a Value Based Investing technique (the Piotroski F-Score).
Slidehack. (2019). The X Note [PowerPoint slides]. Retrieved July 1, 2020, from https://elements.envato.com/all-items/slidehack+x+note PowerPoint Templates.
Wixom, Dakota. “Introduction to Portfolio Risk Management in Python.” DataCamp, 2020, www.datacamp.com/courses/intro-to-portfolio-risk-management-in-python. Build a portfolio of stocks while minimizing risk.
“Yahoo Finance - Stock Market Live, Quotes, Business & Finance News.” Yahoo! Finance, Yahoo!, finance.yahoo.com/. Datasets for portfolio construction.