HKUST-DIDI Lab - supply-demand forecasting

spatio-temporal supply-demand forecasting

Introduction

On-demand ride service platforms, e.g., Uber, Lyft, and DiDi Chuxing, are an emerging technology that has grown with the boom of the mobile internet. Ride-sourcing or transportation network companies (TNCs) refer to an emerging urban mobility service mode in which private car owners drive their own vehicles to provide for-hire rides (Chen et al., 2017). On-demand ride-sourcing services are requested and completed via smartphone applications. The platform serves as a coordinator that matches requesting orders from passengers (demand) with vacant registered cars (supply). The platform has many levers for influencing drivers’ and passengers’ preferences and behavior, and thus both demand and supply, in order to maximize its profits or achieve maximum social welfare. A better understanding of the short-term passenger demand over different spatial zones is of great importance to the platform or the operator, who can incentivize drivers to move to the zones with more potential passenger demand and improve the utilization rate of the registered cars.

However, short-term forecasting of passenger demand for on-demand ride services in each region is very challenging, mainly due to three kinds of dependences (Zhang and Zheng et al.):

(1) Time dependences: passenger demand has a strong periodicity (for example, the passenger demand is expected to be high during morning and evening peaks and to be low during sleeping hours); furthermore, the short-term passenger demand is dependent on the trend of the nearest historical passenger demand.

(2) Spatial dependences: Yang and Leung et al. (2011) revealed that the passenger demand in one specific zone was not merely determined by the variables of this zone, but endogenously dependent on all the zonal variables in the whole network. Generally, the variables of nearby zones have stronger influences than those of distant zones, which motivates an advanced model that can capture local spatial dependences.

(3) Exogenous dependences: some exogenous variables, such as the travel time rate and weather conditions, may have strong influences on the short-term passenger demand. The exogenous variables also demonstrate time dependences and spatial dependences.
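The time dependences in (1) are often turned into model inputs by pairing each target hour with its most recent observations (trend) and with the value one period earlier (periodicity). The following is a minimal sketch under stated assumptions: the function name make_lagged_samples, the look-back length of 3 hours, and the 24-hour period are illustrative choices, not settings taken from the paper.

```python
def make_lagged_samples(series, look_back=3, period=24):
    """Build (features, target) pairs from an hourly demand series of one zone.

    Features for hour t are the `look_back` most recent values (trend)
    followed by the value `period` hours earlier (periodicity).
    All parameter defaults are illustrative assumptions.
    """
    samples = []
    start = max(look_back, period)  # first hour with a full history
    for t in range(start, len(series)):
        recent = list(series[t - look_back:t])  # nearest historical demand
        seasonal = series[t - period]           # same hour, one period earlier
        samples.append((recent + [seasonal], series[t]))
    return samples


# Synthetic three-day series, used only to show the shape of the output.
demand = [hour % 24 for hour in range(72)]
features, target = make_lagged_samples(demand)[0]
print(features, target)
```

Exogenous variables such as weather could be appended to each feature list in the same way, since they are also indexed by hour.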

Although there is little direct experience in addressing these three dependences in short-term passenger demand forecasting, studies on traffic speed/volume prediction and rainfall nowcasting provide valuable insights (Ghosh, Bidisha et al., 2009; Huang, Shan et al., 2009; Guo, Jianhua et al., 2014; Wang, Jian et al., 2014). Recently, deep learning (DL) approaches have been successfully used for traffic flow prediction. For example, Ma et al. employed the long short-term memory (LSTM) neural network to capture long-term dependences and nonlinear traffic dynamics for short-term traffic speed prediction (Ma and Tao et al., 2015). Wu et al. combined a one-dimensional convolutional neural network (CNN) with LSTM for short-term traffic flow forecasting in order to capture spatio-temporal correlations (Wu and Tan). Zhang et al. presented a deep spatio-temporal residual network to predict the inflow and outflow in each region of a city simultaneously (Zhang and Zheng et al.). Shi and Chen et al. (2015) integrated CNN and LSTM in one end-to-end DL structure, named the convolutional LSTM (conv-LSTM), which provided a new approach to spatio-temporal sequence forecasting problems. In that research, numerical experiments showed that the conv-LSTM outperformed the fully connected LSTM on two datasets.

In this paper, we propose a novel DL structure, named the fusion convolutional LSTM network (FCL-Net), that considers the three dependences simultaneously in short-term passenger demand forecasting for the on-demand ride service platform. Unlike the aforementioned studies, this structure coordinates the spatio-temporal variables and non-spatial time-series variables in one end-to-end trainable model. Before these explanatory variables are fed into the DL structure, a tailored spatial aggregated random forest is used to evaluate feature importance across different categories, look-back time intervals, and spatial locations.

Methodology

We propose a novel fusion convolutional LSTM network (FCL-Net), which integrates spatio-temporal variables and non-spatial time-series variables into one DL architecture for short-term passenger demand forecasting under the on-demand ride service platform. Conv-LSTM layers and convolutional operators are employed to capture the characteristics of spatio-temporal variables, while LSTM layers are used for non-spatial time-series variables. To fuse these two categories of variables, techniques including repeating and transformation functions are utilized in the structure. The framework of our model is illustrated in Fig. 1. For more details of the model, please refer to our paper (Ke et al., 2017).
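The idea of fusing a spatial map with a non-spatial value by repeating can be illustrated with a small sketch. This is not the FCL-Net implementation: the helper names repeat_over_grid and fuse are hypothetical, and the scalar weights w_spatial and w_non_spatial are an assumption standing in for whatever learned transformation the model actually applies (see Ke et al., 2017 for the real formulation).

```python
def repeat_over_grid(value, rows, cols):
    """Repeat one non-spatial scalar over every cell of a rows x cols grid."""
    return [[value] * cols for _ in range(rows)]


def fuse(spatial_map, non_spatial_value, w_spatial=1.0, w_non_spatial=1.0):
    """Weighted element-wise sum of a spatial map and a repeated scalar.

    The fixed scalar weights are an illustrative assumption; a trained
    model would learn its own transformation.
    """
    rows, cols = len(spatial_map), len(spatial_map[0])
    repeated = repeat_over_grid(non_spatial_value, rows, cols)
    return [
        [w_spatial * spatial_map[i][j] + w_non_spatial * repeated[i][j]
         for j in range(cols)]
        for i in range(rows)
    ]


# Toy values: a 2 x 2 spatial output and a single non-spatial output.
spatial_output = [[1.0, 2.0], [3.0, 4.0]]
non_spatial_output = 10.0
print(fuse(spatial_output, non_spatial_output))  # [[11.0, 12.0], [13.0, 14.0]]
```

Repeating gives the non-spatial output the same shape as the spatial one, so the two can be combined cell by cell.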

Fig. 1. The deep learning architecture of our model

Experiment

The datasets utilized in this paper are extracted from DiDi Chuxing, the largest on-demand ride service platform in China, over the one-year period between November 1, 2015 and November 1, 2016. We randomly sample 1,000,000 requesting orders from the platform, each of which consists of the requesting time, travel distance, travel time, longitude and latitude. The study site is located in Hangzhou, China, spanning from 120.00 to 120.35 in longitude and from 30.15 to 30.45 in latitude. The dataset is partitioned into 1-hour time intervals, and the investigated region is partitioned into grids, as shown in figure X. The one-hour aggregated weather variables, including temperature, humidity, weather state, wind speed, and visibility, are obtained for the same period.
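The partition into 1-hour intervals and grid cells can be sketched as a simple counting step. The longitude and latitude bounds below come from the text; the 7 x 7 grid size, the function names cell_of and count_orders, and the (timestamp, longitude, latitude) order format are assumptions made for illustration, since the grid resolution is not stated here.

```python
from collections import Counter
from datetime import datetime

LON_MIN, LON_MAX = 120.00, 120.35  # study area bounds from the text
LAT_MIN, LAT_MAX = 30.15, 30.45
N_ROWS, N_COLS = 7, 7              # assumed grid size, not from the text


def cell_of(lon, lat):
    """Return the (row, col) grid cell of a point, or None if outside the area."""
    if not (LON_MIN <= lon < LON_MAX and LAT_MIN <= lat < LAT_MAX):
        return None
    col = min(int((lon - LON_MIN) / (LON_MAX - LON_MIN) * N_COLS), N_COLS - 1)
    row = min(int((lat - LAT_MIN) / (LAT_MAX - LAT_MIN) * N_ROWS), N_ROWS - 1)
    return row, col


def count_orders(orders):
    """Count orders per (hour, row, col); orders are (timestamp, lon, lat)."""
    counts = Counter()
    for timestamp, lon, lat in orders:
        cell = cell_of(lon, lat)
        if cell is None:
            continue  # drop orders outside the study area
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        counts[(hour, cell[0], cell[1])] += 1
    return counts


# Made-up orders, used only to exercise the functions.
orders = [
    (datetime(2015, 11, 1, 8, 5), 120.01, 30.16),
    (datetime(2015, 11, 1, 8, 55), 120.02, 30.17),
    (datetime(2015, 11, 1, 9, 0), 120.34, 30.44),
]
print(count_orders(orders))
```

The resulting counts form the demand intensity of each cell in each hour; cells with no orders are simply absent from the counter and can be treated as zero.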

To avoid using future information, the dataset is divided chronologically into a training set (70%), comprising observations between November 1, 2015 and July 14, 2016, and a test set (30%), consisting of the remaining observations between July 15, 2016 and November 1, 2016. The performance of our model and the benchmarks is shown in Fig. 2.
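A chronological split of this kind can be sketched as follows. The cut-off date comes from the text; the function name chronological_split and the (timestamp, value) record format are assumptions, and the daily placeholder records exist only to check that the stated dates give roughly a 70/30 split.

```python
from datetime import datetime, timedelta

SPLIT_TIME = datetime(2016, 7, 15)  # first day of the test period, from the text


def chronological_split(observations, split_time=SPLIT_TIME):
    """Split (timestamp, value) records so that no test record precedes training."""
    train = [obs for obs in observations if obs[0] < split_time]
    test = [obs for obs in observations if obs[0] >= split_time]
    return train, test


# One placeholder record per day from November 1, 2015 to October 31, 2016.
start = datetime(2015, 11, 1)
days = [(start + timedelta(days=d), 0) for d in range(366)]
train, test = chronological_split(days)
print(len(train), len(test), round(len(train) / len(days), 3))  # 257 109 0.702
```

Splitting by time instead of shuffling keeps every training observation earlier than every test observation, which is what prevents the model from seeing future information.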

The proposed FCL-Net outperforms the other methods. Both FCL-Net variants achieve an RMSE about 50.9% lower than that of the Conv-LSTM that uses only historical demand intensity, which indicates that the exogenous variables contribute greatly to short-term passenger demand forecasting. As mentioned above, the proposed spatial aggregated random forest reduces the computational complexity of FCL-Net by 29.5% (the number of variables in each observation drops from 840 to 592). Meanwhile, Table X shows that after feature selection the predictive performance of FCL-Net degrades by only 0.6% measured by RMSE, 0.1% by R^2, and 1.1% by MAE. The results indicate that the feature selection process is valuable to FCL-Net, since it balances computational complexity against predictive performance.
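For reference, the three metrics named above and the relative reduction behind the 29.5% figure can be computed as in the sketch below. The formulas are the standard definitions of RMSE, MAE, and R^2; the function names and the toy numbers are illustrative, and only the variable counts 840 and 592 are taken from the text.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def relative_reduction(before, after):
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


# Toy values, used only to exercise the functions.
observed = [1, 2, 3, 4]
predicted = [1, 2, 3, 6]
print(rmse(observed, predicted), mae(observed, predicted), r2(observed, predicted))

# Variable counts from the text: 840 before and 592 after feature selection.
print(round(100 * relative_reduction(840, 592), 1))  # 29.5
```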

Fig. 2. Model comparison

Reference

Ke, J., Zheng, H., Yang, H. and Chen, X.M., 2017. Short-term forecasting of passenger demand under on-demand ride services: A spatio-temporal deep learning approach. Transportation Research Part C: Emerging Technologies, 85, pp.591-608.