Time Series Analysis of Royal Caribbean Stocks

11/23/19

Understanding Royal Caribbean International stock prices using time series analysis

What is time series analysis?

Time series analysis is the study of data recorded as a sequence of observations ordered in time. It covers understanding and modelling how a quantity varies with time. In its simplest form this is written as:

y(t) = x_t,

where t is time, y is the dependent variable and x_t is the value observed at time t, which can be a constant or a function of time.
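As a minimal sketch of this notation, the snippet below builds a small series in pandas where each value x_t is attached to a timestamp t. The dates and values are made up purely for illustration.

```python
import numpy as np
import pandas as pd

# Time axis t: five consecutive days (illustrative dates only)
t = pd.date_range('2019-01-01', periods=5, freq='D')

# Values x_t observed at each time step (made-up numbers)
x = 100 + 2 * np.arange(5)

# y(t) = x_t: a pandas Series indexed by time
y = pd.Series(x, index=t, name='y')

# Look up the value observed on a given day
value_on_third_day = y.loc[pd.Timestamp('2019-01-03')]
```

Indexing the series by time, rather than by row number, is what the rest of this post relies on when plotting and comparing stocks.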

Time series analysis has a wide range of applications, including:

  • Financial forecasting
  • Anomaly detection
  • Earthquake prediction

Obtaining RCL stock data

We will use Alpha Vantage, which provides a simple web API for downloading stock data. It can return the data as CSV, which can be read directly with pandas.read_csv. I created a function that downloads the daily stock data for any company given its stock symbol (replace APIKEY with your own API key):

import pandas as pd

def get_stock(symbol):
    # Replace APIKEY with your own Alpha Vantage API key
    return pd.read_csv('https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&symbol=' +
                       symbol + '&apikey=APIKEY&outputsize=full&datatype=csv')

You can download RCL stock data as:

rcl = get_stock('RCL')
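If you would rather not hard-code the key, the query URL can be assembled from its parameters instead. The following is only a sketch: build_daily_url and the environment variable ALPHAVANTAGE_API_KEY are names invented for this example, while the query parameters are the ones used in get_stock above.

```python
import os
from urllib.parse import urlencode

BASE_URL = 'https://www.alphavantage.co/query'

def build_daily_url(symbol, apikey=None):
    """Build the daily-prices CSV URL for one stock symbol."""
    if apikey is None:
        # ALPHAVANTAGE_API_KEY is a hypothetical variable name chosen for this sketch
        apikey = os.environ.get('ALPHAVANTAGE_API_KEY', 'APIKEY')
    params = {
        'function': 'TIME_SERIES_DAILY',
        'symbol': symbol,
        'apikey': apikey,
        'outputsize': 'full',
        'datatype': 'csv',
    }
    # urlencode escapes each value and joins the pairs with '&'
    return BASE_URL + '?' + urlencode(params)

url = build_daily_url('RCL', apikey='APIKEY')
```

The resulting string can be passed to pd.read_csv exactly as in get_stock.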

To simplify the data, keep only the closing price and set the index to the timestamp:

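rcl['timestamp'] = pd.to_datetime(rcl['timestamp'])  # parse the date strings
rcl = rcl.loc[::-1, ['timestamp', 'close']]          # reverse the row order and keep two columns
rcl = rcl.set_index('timestamp')                     # index by date
rcl = rcl['close']                                   # rcl is now a Series of closing prices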

You can quickly plot the data by calling rcl.plot().
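Since the same clean-up is needed for every stock, it can be wrapped in a helper. This is a sketch rather than part of the original workflow: to_close_series is a name introduced here, and the sample CSV contains made-up numbers in the column layout assumed from the code above (timestamp and close, plus other price columns).

```python
import io
import pandas as pd

def to_close_series(raw):
    """Return closing prices indexed by date, oldest first."""
    out = raw.copy()
    out['timestamp'] = pd.to_datetime(out['timestamp'])
    # sort_index() orders the dates explicitly instead of relying on the download order
    return out.set_index('timestamp')['close'].sort_index()

# Tiny made-up sample (not real prices), newest row first
sample_csv = io.StringIO(
    "timestamp,open,high,low,close,volume\n"
    "2019-11-22,3.0,3.5,2.5,3.2,100\n"
    "2019-11-21,2.0,2.5,1.5,2.2,100\n"
    "2019-11-20,1.0,1.5,0.5,1.2,100\n"
)
close = to_close_series(pd.read_csv(sample_csv))
```

With real data this would be called as to_close_series(get_stock('RCL')).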

Comparing Royal Caribbean and Carnival stock data

Download the data for Carnival Corporation & Plc (CCL) and plot it against the RCL data:

ccl = get_stock('CCL')
ccl['timestamp'] = pd.to_datetime(ccl['timestamp'])
ccl = ccl.loc[::-1, ['timestamp', 'close']]
ccl = ccl.set_index('timestamp')
ccl = ccl['close']

import matplotlib.pyplot as plt

fig, ax = plt.subplots(1, 1, figsize=(14, 7))
rcl.plot(ax=ax)
ccl.plot(title='Royal Caribbean and Carnival Daily Stock Prices', ax=ax)
ax.set_ylabel("Stock Price ($)")
ax.legend(["RCL", "CCL"]);

The two stocks follow a fairly similar pattern. However, RCL's stock price takes off around 2014. This is related to the "Double Double" programme, in which CEO Richard Fain stated that the company would double earnings per share from 2014 to 2017 and increase return on invested capital to double digits [1]. Another noticeable difference is that Carnival's stock price declines more than Royal Caribbean's in 2019. You can quantify where the divergence is largest by computing a rolling correlation between the two series and finding its minimum (August 13th 2019):

rolling_r = rcl.rolling(365).corr(ccl)  # rcl and ccl are already Series; the window is 365 rows (trading days)
rolling_r[rolling_r == rolling_r.min()]
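To see how the rolling correlation picks out a divergence, here is a small self-contained sketch on synthetic data. The two series and the 30-row window are invented for the illustration; they are not the RCL and CCL prices.

```python
import numpy as np
import pandas as pd

dates = pd.bdate_range('2019-01-01', periods=300)  # business days
steps = np.arange(300, dtype=float)

# Series a rises steadily
a = pd.Series(steps, index=dates)
# Series b tracks a for the first 200 days, then falls while a keeps rising
b = pd.Series(np.where(steps < 200, steps, 400 - steps), index=dates)

# Correlation over a sliding 30-row window; the first 29 values are NaN
rolling_r = a.rolling(30).corr(b)

# Date at which the two series are least correlated (NaNs are skipped)
worst_day = rolling_r.idxmin()
lowest_r = rolling_r.min()
```

While the series move together the rolling correlation stays near 1. Once the window lies entirely in the period where they move in opposite directions it approaches -1. Using idxmin() returns the date of the minimum directly, which is an alternative to the boolean mask used above.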

Modeling Royal Caribbean stock price

I am going to apply the fast.ai approach to learning, which reverses the traditional order: practical solutions first, followed by the theory.

One of the quickest ways to get a good time-series model is to use Facebook's open-source package Prophet [3]. Put the data in a format that Prophet can ingest:

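from fbprophet import Prophet  # the package was renamed to 'prophet' from version 1.0

# Prophet expects a DataFrame with a date column 'ds' and a value column 'y'
df = pd.DataFrame()
df['ds'] = rcl.index
df['y'] = rcl.values  # rcl is already the Series of closing prices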

and fit a model to it:

m = Prophet().fit(df)

You can inspect the fitted model through attributes of the object, for example m.params and m.changepoints.

See how well the model fits the data by getting the model to predict the data it was fitted to:

future = m.make_future_dataframe(periods=0)
forecast = m.predict(future)
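Besides looking at a plot, the quality of the fit can be summarised with a couple of numbers. The helper below is a sketch added for illustration (fit_errors is not part of Prophet); it computes the mean absolute error and the root mean squared error between actual and modelled values, shown here on made-up numbers.

```python
import numpy as np

def fit_errors(actual, predicted):
    """Return the in-sample MAE and RMSE between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    resid = actual - predicted
    return {
        'mae': np.mean(np.abs(resid)),        # average absolute miss, in dollars
        'rmse': np.sqrt(np.mean(resid ** 2)),  # penalises large misses more heavily
    }

# Made-up values purely to show the call
errors = fit_errors([10.0, 12.0, 14.0], [11.0, 12.0, 13.0])
```

With the objects from this post it could be called as fit_errors(df['y'], forecast['yhat']), assuming both have one row per date in the same order. Keep in mind that these are in-sample errors, so they say nothing about forecasting skill.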

I created a function to make the plot look pretty:

import matplotlib.ticker as mtick
from fbprophet.plot import add_changepoints_to_plot

def plot_model():
    fig, ax = plt.subplots(1, 1, figsize=(14, 7))
    forecast_t = forecast['ds'].dt.to_pydatetime()
    ax.plot(m.history['ds'].dt.to_pydatetime(), m.history['y'], 'k.', label='Actual')
    ax.plot(forecast_t, forecast['yhat'], ls='-', c='#0072B2', label='Modeled')
    fmt = '${x:,.0f}'
    tick = mtick.StrMethodFormatter(fmt)
    ax.yaxis.set_major_formatter(tick)
    ax.set_ylabel('Stock Price ($)', fontsize=20)
    ax.set_xlabel('Date', fontsize=20)
    fig.autofmt_xdate()
    plt.tick_params(labelsize=20)
    plt.title('Royal Caribbean Daily Stock Prices', fontsize=20)
    a = add_changepoints_to_plot(ax, m, forecast, trend=False)
    # Put a label on one change point to add it to the legend
    ax.axvline(x=m.changepoints.iloc[0], c='r', ls='--', label='Changepoints')
    ax.legend(loc=2)
    plt.show()

plot_model()

The blue line is the model fit, and it agrees fairly well with the actual values (black dots) prior to 2016. Because Prophet is primarily a forecasting tool, by default it places changepoints (the dashed red lines) only over the first 80% of the data (until 2016). To get a much better fit we can place changepoints over the entire range (changepoint_range=1) and make the trend far more flexible by raising changepoint_prior_scale from its default of 0.05 to 50:

m = Prophet(changepoint_range=1, changepoint_prior_scale=50.0).fit(df)

future = m.make_future_dataframe(periods=0)
forecast = m.predict(future)

plot_model()
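To make the effect of changepoint_range concrete, the sketch below spreads candidate changepoints evenly over the first fraction of a date range. It is a simplified illustration of the idea and not Prophet's own code; candidate_changepoints is a name made up for this example, and the defaults of 25 changepoints and a range of 0.8 are the library defaults referred to above.

```python
import numpy as np
import pandas as pd

def candidate_changepoints(ds, n_changepoints=25, changepoint_range=0.8):
    """Evenly spaced candidate changepoint dates over the first part of the history."""
    # Number of observations that changepoints are allowed to fall in
    hist_size = int(np.floor(len(ds) * changepoint_range))
    # n_changepoints + 1 evenly spaced positions, rounded to whole row numbers
    idx = np.linspace(0, hist_size - 1, n_changepoints + 1).round().astype(int)
    # Drop the first position: a changepoint on the first observation is meaningless
    return ds[idx[1:]]

ds = pd.date_range('2019-01-01', periods=100, freq='D')
default_cps = candidate_changepoints(ds)                      # stops at 80% of the data
full_cps = candidate_changepoints(ds, changepoint_range=1.0)  # covers the whole range
```

With the default range the last candidate falls on the 80th of 100 days, so the final 20% of the history has to be described by a single straight trend segment. With a range of 1 the candidates extend to the last observation.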

How does fbprophet work?

Simple time-series models

To understand the basics of time-series modelling I recommend reading the blog post by blackarbs [2]. I have also listed many useful resources on time-series modelling in the footnotes.

Modular regression model

A full description of Prophet is given in the white paper by Taylor and Letham [3]. It is a modular regression model: each module captures a certain part of the time series, the modules are modelled separately, and they are added together to make up the predicted time series. I have listed the components below, with a description of how each is modelled and a link to the source code:

  • The trend (linear or logistic) with changepoints - g(t) - modeled using piece-wise linear regression [source]
  • Yearly seasonality - s(t) - modeled using Fourier analysis [source]
  • Weekly seasonality - s(t) - modeled using Fourier analysis [source]
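As a rough sketch of what "modeled using Fourier analysis" means for the seasonal components, the snippet below builds sine and cosine features for a given period and fits them to a synthetic weekly pattern by ordinary least squares. The function name fourier_features, the synthetic signal and the use of least squares are all assumptions made for this illustration; Prophet estimates these coefficients together with the trend.

```python
import numpy as np

def fourier_features(t, period, order):
    """Return a matrix with columns sin(2*pi*n*t/period), cos(2*pi*n*t/period) for n = 1..order."""
    t = np.asarray(t, dtype=float)
    cols = []
    for n in range(1, order + 1):
        cols.append(np.sin(2 * np.pi * n * t / period))
        cols.append(np.cos(2 * np.pi * n * t / period))
    return np.column_stack(cols)

# Synthetic weekly pattern over 10 weeks of daily data (made-up signal)
t = np.arange(70)
y = 2.0 * np.sin(2 * np.pi * t / 7) + 0.5 * np.cos(4 * np.pi * t / 7)

# Three harmonics of a 7-day period give 6 feature columns
X = fourier_features(t, period=7, order=3)

# Least-squares estimate of the Fourier coefficients
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta
```

A higher order lets the seasonal curve change more quickly within one period, at the cost of a greater risk of overfitting.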

The equations are shown below:
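In the notation of Taylor and Letham [3], the model with a piecewise-linear trend can be written as follows. The holiday term h(t) is part of the paper's model but is not one of the components listed above, and ε_t is the error term.

```latex
y(t) = g(t) + s(t) + h(t) + \epsilon_t

g(t) = \left(k + \mathbf{a}(t)^\top \boldsymbol{\delta}\right) t
     + \left(m + \mathbf{a}(t)^\top \boldsymbol{\gamma}\right)

s(t) = \sum_{n=1}^{N} \left( a_n \cos\frac{2\pi n t}{P} + b_n \sin\frac{2\pi n t}{P} \right)
```

Here k is the base growth rate, m is the offset, δ holds the rate adjustments at the changepoints, a(t) indicates which changepoints have occurred by time t, γ contains the offset corrections that keep the trend continuous, P is the seasonal period (for example 7 for weekly seasonality with t in days) and N is the number of Fourier terms.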

To fit the model, Prophet uses a wrapper around pystan. Stan provides a maximum a posteriori (MAP) estimate and can also perform full posterior inference to quantify model parameter uncertainty.

Further reading

In recent years there have been new approaches to modelling time series using machine learning. This is outside the scope of this blog post, but for an introduction I recommend watching Aileen Nielsen's talk from SciPy. The resources for her talk are available on GitHub.

References

[1] RCL blog, THE DISCIPLINE BEHIND “DOUBLE DOUBLE”, January 2015, https://www.rclcorporate.com/the-discipline-behind-double-double/

[2] blackarbs blog, Time Series Analysis (TSA) in Python - Linear Models to GARCH, November 2016, http://www.blackarbs.com/blog/time-series-analysis-in-python-linear-models-to-garch/11/1/2016

[3] Taylor SJ, Letham B. 2017. Forecasting at scale. PeerJ Preprints 5:e3190v2 https://doi.org/10.7287/peerj.preprints.3190v2

Footnotes

https://towardsdatascience.com/time-series-analysis-in-python-an-introduction-70d5a5b1d52a

https://forums.fast.ai/t/time-series-sequential-data-study-group/29686

https://www.oreilly.com/ideas/machine-learning-and-analytics-for-time-series-data

S&P 500 Stock Price Prediction Using Machine Learning and Deep Learning

https://www.liip.ch/en/blog/time-series-prediction-a-short-comparison-of-best-practices

https://machinelearningmastery.com/arima-for-time-series-forecasting-with-python/

https://machinelearningmastery.com/tune-arima-parameters-python/

https://machinelearningmastery.com/time-series-data-stationary-python/

https://machinelearningmastery.com/autoregression-models-time-series-forecasting-python/

https://www.machinelearningplus.com/time-series/arima-model-time-series-forecasting-python/

https://www.analyticsvidhya.com/blog/2018/08/auto-arima-time-series-modeling-python-r/

http://www.blackarbs.com/blog/time-series-analysis-in-python-linear-models-to-garch/11/1/2016

https://www.kaggle.com/c/two-sigma-financial-news/overview

https://www.kaggle.com/jagangupta/time-series-basics-exploring-traditional-ts

https://mc-stan.org/users/documentation/tutorials.html

https://github.com/firmai/machine-learning-asset-management

https://github.com/timeseriesAI/timeseriesAI

https://github.com/alan-turing-institute/sktime