Analysis Statement
Understanding rent trends helps real estate investors identify which markets are worth investing in, and helps tenants choose the best season to sign a lease. Our data comes from Zillow and contains monthly rent prices and rent per square foot from 2010-01 to 2019-06. During data preparation, we found missing values from 2010 through 2012. Because those years are also too far in the past to be a good indicator of near-future rent prices, we decided to use only the years with complete monthly data, 2013-01 to 2019-06. We then selected the top 20 metro areas by population rank and divided the data into a training set (2013-01 to 2018-12) and a test set (2019-01 to 2019-06). We fit time series models (ARIMA or SARIMA) to the training set in R and compare their forecasts against the test set, using RMSE as the metric for model performance.
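The split and the RMSE metric described above can be sketched in base R. This is a minimal illustration, not the analysis itself: the series `rent` is synthetic (it is not the Zillow data), the helper `rmse()` is a hypothetical name introduced here, and the "same month last year" forecast is only a simple baseline standing in for an ARIMA or SARIMA forecast.

```r
# Synthetic monthly series standing in for one metro's rent (NOT the Zillow data).
# 78 months cover 2013-01 through 2019-06.
set.seed(1)
n <- 78
rent <- ts(1500 + 5 * (1:n) + 20 * sin(2 * pi * (1:n) / 12) + rnorm(n, sd = 5),
           start = c(2013, 1), frequency = 12)

# Training set: 2013-01 to 2018-12. Test set: 2019-01 to 2019-06.
train <- window(rent, end = c(2018, 12))
test  <- window(rent, start = c(2019, 1))

# Root mean squared error between observed and predicted values (hypothetical helper).
rmse <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

# Baseline forecast: reuse the value from the same month of the previous year.
baseline_fc <- window(train, start = c(2018, 1), end = c(2018, 6))
baseline_rmse <- rmse(as.numeric(test), as.numeric(baseline_fc))
baseline_rmse
```

Scoring a simple baseline like this alongside the fitted models gives a reference point for judging whether a given RMSE is good for a particular metro.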
Time Series Analysis For Top 20 MSAs Rental Price (By Population Rank)
The rental price data we used for time series analysis is available here
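# Time series for New York
library(forecast)  # auto.arima(), forecast(), accuracy(), checkresiduals(), autoplot()
library(tseries)
library(TSA)       # eacf()
# Monthly training series, 2013-01 to 2018-12
NY_train <- ts(traindata[, 2], start = c(2013, 1), end = c(2018, 12), frequency = 12)
# Monthly test series; the test set ends in 2019-06
NY_test <- ts(testdata[, 2], start = c(2019, 1), end = c(2019, 6), frequency = 12)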
#plot(NY_train)
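# Let auto.arima() select the model orders
fit1 <- auto.arima(NY_train)
fit1  # ARIMA(1,1,0)(1,1,0)[12]
checkresiduals(fit1)
# Forecast 12 months ahead (through 2019-12)
NY_predict <- forecast(fit1, h = 12)
accuracy(NY_predict, NY_test)  # RMSE 6.3
# Plot the forecast with the test data overlaid
autoplot(NY_predict, ylab = "NY rental", main = "Forecasts from ARIMA(1,1,0)(1,1,0)[12] For New York, NY") +
  autolayer(NY_test, series = "Test data")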
Time Series Analysis For Top 20 MSAs Rental Price Per Square Foot (By Population Rank)
The rental per square foot data for time series analysis is available here
# Time series for Boston
Boston_train <- ts(traindata[, 4], start = c(2013, 1), end = c(2018, 12), frequency = 12)
Boston_test <- ts(testdata[, 4], start = c(2019, 1), end = c(2019, 6), frequency = 12)
plot(Boston_train, type = "l", main = "Rental Per Sqft in Boston")
acf(Boston_train)   # ACF decays slowly
pacf(Boston_train)  # ACF and PACF resemble a random walk
# Take the first difference (named Boston_diff to avoid shadowing base::diff)
Boston_diff <- diff(Boston_train)
acf(Boston_diff)
pacf(Boston_diff)
library(TSA)  # provides eacf()
eacf(Boston_diff)
# Fit ARIMA(2,1,5)
fit4 <- stats::arima(Boston_train, order = c(2, 1, 5))
checkresiduals(fit4)  # Ljung-Box p-value is 0.3 > 0.05, so the residuals are consistent with white noise
# Forecast 12 months ahead with 80% and 95% prediction intervals
Boston_predict <- forecast(fit4, h = 12, level = c(80, 95))
accuracy(Boston_predict, Boston_test)  # RMSE 0.01
# Plot the forecast with the test data overlaid
autoplot(Boston_predict, ylab = "Boston Rental Per Sqft", main = "Forecasts from ARIMA(2,1,5) For Boston, MA") +
  autolayer(Boston_test, series = "Test data")
Conclusion:
In the plots above, the red line shows the actual observations and the blue shaded bands show the prediction intervals. Most observations fall within the 95% prediction interval. The combined average RMSE of our models is 11.5.
Based on these plots, we found that rents in most metropolitan areas rise year over year and follow a seasonal pattern. For instance, Boston’s rental price is highest in August and September, then drops in winter and early spring. Our research suggests this is because Boston has many universities and the school year starts in September: tens of thousands of young people move into the city every September, which drives up rental prices. Therefore, tenants who find a rental and sign the lease between December and March can save on rent.
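One way to check a seasonal pattern like this numerically, rather than reading it off a plot, is to remove the trend and average what remains by calendar month. The sketch below uses base R only. The series is synthetic and was deliberately constructed with a late-summer peak, so it illustrates the technique and says nothing about Boston's actual rents. The names `monthly_profile`, `peak_month`, and `trough_month` are introduced here for illustration.

```r
# Synthetic monthly series with an upward trend and a late-summer peak
# (NOT the Zillow data; the seasonal shape is an assumption built into the example).
set.seed(2)
m <- 1:72
rent <- ts(2000 + 4 * m + 30 * cos(2 * pi * (m - 8.5) / 12) + rnorm(72, sd = 1),
           start = c(2013, 1), frequency = 12)

# Remove a linear trend so that the yearly increase does not mask the seasonality.
t_idx <- as.numeric(time(rent))
y <- as.numeric(rent)
detrended <- residuals(lm(y ~ t_idx))

# Average the detrended values by calendar month (1 = January, ..., 12 = December).
monthly_profile <- tapply(detrended, as.integer(cycle(rent)), mean)
names(monthly_profile) <- month.abb

peak_month <- names(which.max(monthly_profile))
trough_month <- names(which.min(monthly_profile))
round(monthly_profile, 1)
```

Months with a positive average are more expensive than the trend would suggest, and months with a negative average are cheaper, which is the kind of information a tenant timing a lease would look for.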
Forecast 2019 Year-Over-Year Changes in Rent Price For Top 20 Metro Areas (By Population Rank)
Based on our time series models, we compare the forecast for 2019-12 with the observed value for 2018-12 to estimate the year-over-year change in rental price for the top 20 metro areas. Rent is expected to increase the most in Miami-Fort Lauderdale, FL (4.32%), followed by Atlanta and Dallas-Fort Worth. Rent in Houston is expected to decline by 0.43%.
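The year-over-year comparison is a percentage change between the December 2019 forecast and the December 2018 observation. A minimal sketch follows, assuming the two values are available per metro. The function name `yoy_change()`, the metro labels, and all numbers are hypothetical and are not results from our models.

```r
# Percentage change from a base value to a forecast value (hypothetical helper).
yoy_change <- function(forecast_value, base_value) {
  100 * (forecast_value - base_value) / base_value
}

# Hypothetical rents for two made-up metros (NOT results from the analysis).
dec_2018 <- c(MetroA = 2000, MetroB = 1500)  # observed 2018-12
dec_2019 <- c(MetroA = 2080, MetroB = 1494)  # forecast 2019-12

changes <- yoy_change(dec_2019, dec_2018)

# Rank metros from the largest expected increase to the largest expected decline.
ranked <- sort(changes, decreasing = TRUE)
ranked
```

With fitted models, the forecast value for a metro would be the twelfth element of the forecast's point predictions, and the base value would be the last observation of the training series.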