Authors
Joon-Seok Kim (George Mason University), Dieter Pfoser (George Mason University), Andreas Züfle (George Mason University)
Abstract
Each year, traffic congestion wastes billions of liters of fuel and is a significant contributor to Green House Gas (GHG) emissions. Although convenient, ride-sharing services such as Uber, Lyft, and Didi are becoming a significant contributor, as drivers not only transport passengers, but they also spend time on the road searching for new passengers. To help alleviate the impact of ride-sharing on fuel waste and GHG emissions, the milestone reached by this work is a novel algorithm to improve the efficiency of drivers searching for customers. Our algorithm directs unassigned drivers to locations where we predict new customer to appear. We use a non-negative matrix factorization approach to model time and location of resources given historical training data. Given the resulting prediction of future resources, we employ a probabilistic search strategy to guide drivers to nearby locations where resources are predicted within a short time span. To ensure that agents do not oversupply areas for which resources are predicted and undersupply other areas, we randomize the destinations and provide each driver with a home location to return to when unassigned. Our experimental evaluation shows that our approach reduces the search time of agents and the wait time of resources using real-world data from Manhattan, New York, USA compared to baseline solutions.
Learning spatio-temporal resource distribution
Predicting future resources
Avoiding herding (oversupply, undersupply)
Avoiding unnecessary long distance trips
Generalization: Assuming that resources vary between days, the algorithm should adapt to random deviations in resource distributions, rather than overfitting to the resource distribution in the training data set.
Balancing Utility and Distance: Agents should find a balance between exploration (moving to more useful locations) and exploitation (remaining in the current location). The algorithm must find a balance between these two aspects.
Fairness: Without allowing agents to communicate with each other, the algorithm should evade flocking towards the same destination, to avoid oversupplying some regions while causing starvation in others.
Timing: Agents should move towards regions where their is a maximal chance of finding a resource at the time of arrival.
The left figure shows the resulting spatiotemporal resource matrix for the 30 days of June 2016. Each line of this matrix corresponds to an edge in the spatial network G and each column corresponds to one of the 20-minute intervals during these 30 days. The color intensity of each cell indicates the number of resources observed in the New York City Trip Record Data dataset.
The right figure shows the search process for the agent Q located at the center of the concentric circles. Small red dots denote all resources predicted to appear within the time horizon. From all these resources, N=8 samples are chosen uniformly at random, highlighted by larger orange dots. Weighted inversely by distance following the equation in the figure, one of the N samples is chosen as the destination for Q.
The following figure visualizes a factorized matrix , which describes each location by six latent features. Components show the strength of each of these latent features for each edge of the network. An interesting case if Component 2, which has high values only in two locations close to each other: Pennsylvania Train Station and Pennsylvania Metro Station, two major transportation hubs. The matrix factorization has learned that these locations have unique temporal features that are not observed in other places.
Component 1
Component 2
Component 3
Component 4
Component 5
Component 6
Random Walk: requires agents to cruise in random directions. Thus, whenever a random walk agent reaches an intersection, they will continue into a direction chosen uniformly at random.
Random Destination: chooses a random destination uniformly random from all network nodes.
The search time of a driver is the amount of time from when the driver is labeled as empty until it picks up a passenger. A driver may experience multiple search times, each corresponding to one assignment.
The wait time of a passenger is the period of time from its introduction until its pick-up or expiration.
The expiration rate is the percentage of expired resources, meaning the passenger will not wait any longer after expiration limit.
Source code is available at https://github.com/joonseok-kim/CompetitiveSearch
Related Research & Presentation:
J.-S. Kim, D. Pfoser, and A. Züfle, “Distance-Aware Competitive Spatiotemporal Searching Using Spatiotemporal Resource Matrix Factorization (GIS Cup),” In Proceedings of ACM SIGSPATIAL GIS ’19, November 2019, pp. 624-627 (1st-Place Winner)
J.-S. Kim, Distance-Aware Competitive Spatiotemporal Searching Using Spatiotemporal Resource Matrix Factorization, 20th KOCSEA Technical Session, November 16, 2019, Atlanta, Georgia, USA (Best Poster Award)