Bus Routes and Station Placements in a City
posted in 2025
(I) Objective: Design a reinforcement learning (RL) system to optimize bus routes and station placements in a city network. The goal is to minimize costs (bus station construction, bus purchases) while reducing passenger travel times, considering dynamic factors like traffic speeds, one-way streets, military restricted areas, varying passenger volumes/hotspots/destinations by time of day, walking times to stations, and in-bus congestion times.
(II) Problem Modeling and Assumptions:
City Representation: Model the city as a directed graph (NetworkX).
Nodes are potential bus station locations.
Edges are roads with attributes:
Distance.
Average speed by time slot (e.g., rush hour vs. off-peak).
One-way (directed edges).
Restricted areas: Some nodes/edges are military zones (blocked).
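A minimal sketch of the city representation above, using NetworkX as the post suggests. The node names, slot names, attribute keys (distance_km, speed_kmh, restricted, minutes) and all numbers are made-up illustration values, not part of the original design.

```python
import networkx as nx

# Slot names and every number below are hypothetical illustration values.
SLOTS = ("am_peak", "off_peak", "pm_peak")


def build_city():
    """Toy directed city graph; node 'M' stands in for a military zone."""
    g = nx.DiGraph()
    g.add_nodes_from(["A", "B", "C", "D"], restricted=False)
    g.add_node("M", restricted=True)
    # (from, to, distance in km, average speed in km/h for each slot in SLOTS)
    roads = [
        ("A", "B", 2.0, (20.0, 40.0, 25.0)),
        ("B", "C", 3.0, (30.0, 45.0, 30.0)),
        ("A", "M", 1.0, (50.0, 50.0, 50.0)),
        ("M", "C", 1.0, (50.0, 50.0, 50.0)),
        ("C", "D", 1.5, (30.0, 30.0, 30.0)),
        ("D", "A", 4.0, (20.0, 40.0, 20.0)),
    ]
    for u, v, km, speeds in roads:
        # A one-way street is simply an edge with no reverse edge.
        g.add_edge(u, v, distance_km=km, speed_kmh=dict(zip(SLOTS, speeds)))
    return g


def usable_graph(g, slot):
    """Copy of g without restricted nodes, with travel minutes for one slot."""
    allowed = [n for n, data in g.nodes(data=True) if not data["restricted"]]
    h = g.subgraph(allowed).copy()
    for _, _, data in h.edges(data=True):
        data["minutes"] = 60.0 * data["distance_km"] / data["speed_kmh"][slot]
    return h
```

Dropping restricted nodes up front means every later path search sees only legal roads, at the cost of building one graph copy per time slot.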
Passengers:
Origins/destinations: Varying by time slots (e.g., morning: more commuters to work hotspots; evening: to residential).
Passenger count: Varying by time slots.
Time Slots: Divide day into slots with different speeds, demands, hotspots.
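One possible way to hold the time-varying demand described above is a table of origin/destination flows per slot. This is only a sketch: the Demand class, the slot names and the passenger numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Demand:
    origin: str
    destination: str
    passengers: int


# Hypothetical toy data: commuters head to "C" in the morning and back home later.
DEMAND_BY_SLOT = {
    "am_peak": [Demand("A", "C", 120), Demand("D", "C", 80)],
    "off_peak": [Demand("A", "D", 20)],
    "pm_peak": [Demand("C", "A", 110), Demand("C", "D", 70)],
}


def total_passengers(slot):
    """Passenger count for one time slot."""
    return sum(d.passengers for d in DEMAND_BY_SLOT[slot])


def hotspots(slot, top=1):
    """Destinations ranked by arriving passengers in the given slot."""
    arrivals = {}
    for d in DEMAND_BY_SLOT[slot]:
        arrivals[d.destination] = arrivals.get(d.destination, 0) + d.passengers
    return sorted(arrivals, key=arrivals.get, reverse=True)[:top]
```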
Costs:
Station construction: Fixed cost per selected station.
Bus purchase: Cost per bus, with multiple buses allowed on same/different routes.
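The cost model above reduces to a short function. The two unit costs are placeholder assumptions, and buses_per_route is a hypothetical mapping from route id to bus count.

```python
# Placeholder unit costs, chosen only for illustration.
STATION_COST = 50_000.0
BUS_COST = 200_000.0


def capital_cost(stations, buses_per_route):
    """Fixed cost per selected station plus purchase cost per bus."""
    n_buses = sum(buses_per_route.values())
    return STATION_COST * len(stations) + BUS_COST * n_buses
```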
Passenger Times:
Walking time: Distance from the passenger's origin to the nearest station, divided by walking speed.
In-bus time: Travel time on route.
Total time: Walking + waiting + in-bus, where waiting is the time spent at the station until a bus arrives.
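A rough sketch of the passenger-time breakdown. The post does not say how waiting time is computed, so this assumes evenly spaced buses and random passenger arrivals, which gives an average wait of half the headway. The 5 km/h walking speed is also an assumption.

```python
WALK_SPEED_KMH = 5.0  # assumed average walking speed


def walking_minutes(walk_km):
    """Walking distance to the nearest station converted to minutes."""
    return 60.0 * walk_km / WALK_SPEED_KMH


def waiting_minutes(round_trip_minutes, buses):
    """Assumed model: evenly spaced buses, average wait = half the headway."""
    if buses <= 0:
        raise ValueError("a route needs at least one bus")
    headway = round_trip_minutes / buses
    return headway / 2.0


def total_minutes(walk_km, round_trip_minutes, buses, in_bus_minutes):
    """Total time = walking + waiting + in-bus."""
    return (
        walking_minutes(walk_km)
        + waiting_minutes(round_trip_minutes, buses)
        + in_bus_minutes
    )
```

Under this model, adding a bus to a route shortens waiting time but leaves walking and in-bus time unchanged, which is what makes the bus-count action a trade against purchase cost.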
Objectives: Multi-objective reward in RL – minimize passenger time while minimizing costs.
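Reward = - (total passenger time + costs). Time and money are in different units, so each term needs a weight (for example, a cost per passenger-minute) before the two are summed.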
(III) RL Environment Design:
State: Vector representing:
Current selected stations (binary mask).
Allocated buses per route.
Current time slot.
Passenger count for the current time slot.
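The state components listed above can be flattened into one vector, for example as follows. The layout (station mask, then buses per route, then a one-hot time slot, then a normalized passenger count) and the max_passengers normalizer are assumptions made for this sketch.

```python
def encode_state(all_nodes, selected, route_ids, buses_per_route,
                 slots, slot, passengers, max_passengers):
    """Flatten the plan and context into a list of floats."""
    mask = [1.0 if n in selected else 0.0 for n in all_nodes]
    buses = [float(buses_per_route.get(r, 0)) for r in route_ids]
    slot_one_hot = [1.0 if s == slot else 0.0 for s in slots]
    load = [passengers / max_passengers]  # scaled to roughly 0..1
    return mask + buses + slot_one_hot + load
```

For instance, three candidate nodes with "A" and "C" selected, one route with two buses, the second of two slots, and 50 of at most 200 passengers encode to [1.0, 0.0, 1.0, 2.0, 0.0, 1.0, 0.25].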
Actions: Discrete multi-action space:
Select/deselect a station.
Add/remove a bus to a route.
Choose a route path: select start and end nodes, then take the shortest path that avoids restricted nodes and edges.
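The sketch below shows one way the action list could be encoded. decode_action maps a flat discrete index to an action, and route_path finds a shortest path that skips restricted nodes and edges using NetworkX's subgraph_view. The index layout and the attribute names (restricted, minutes) are assumptions, not something the post specifies.

```python
import networkx as nx


def decode_action(index, nodes, routes):
    """Assumed layout: [toggle station per node | add bus per route | remove bus per route]."""
    n, r = len(nodes), len(routes)
    if not 0 <= index < n + 2 * r:
        raise IndexError("action index out of range")
    if index < n:
        return ("toggle_station", nodes[index])
    if index < n + r:
        return ("add_bus", routes[index - n])
    return ("remove_bus", routes[index - n - r])


def route_path(g, start, end, weight="minutes"):
    """Shortest path avoiding restricted nodes/edges, or None if there is none."""
    def node_ok(n):
        return not g.nodes[n].get("restricted", False)

    def edge_ok(u, v):
        return not g.edges[u, v].get("restricted", False)

    view = nx.subgraph_view(g, filter_node=node_ok, filter_edge=edge_ok)
    try:
        return nx.shortest_path(view, start, end, weight=weight)
    except (nx.NetworkXNoPath, nx.NodeNotFound):
        return None
```

Returning None for an unreachable pair lets the environment treat that action as invalid (for example, by applying a penalty) instead of raising an exception mid-episode.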
Reward:
Positive for reducing average passenger time.
Negative for costs (stations + buses).
Penalty for violating restrictions or high times.
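As a sketch, the three reward terms above could be combined into a per-step reward like this. All weights, the penalty sizes and the 60-minute threshold for "high times" are arbitrary placeholders that would need tuning.

```python
def step_reward(prev_avg_minutes, new_avg_minutes, added_cost, violated,
                time_weight=1.0, cost_weight=0.001,
                violation_penalty=100.0,
                max_avg_minutes=60.0, high_time_penalty=10.0):
    """Per-step reward; every default value here is an assumed placeholder."""
    # Positive when the action reduced the average passenger time.
    reward = time_weight * (prev_avg_minutes - new_avg_minutes)
    # Negative for money spent on stations and buses in this step.
    reward -= cost_weight * added_cost
    # Penalty for touching a restricted node or edge.
    if violated:
        reward -= violation_penalty
    # Penalty while the average passenger time is still too high.
    if new_avg_minutes > max_avg_minutes:
        reward -= high_time_penalty
    return reward
```

Rewarding the change in average time, not its absolute level, keeps the positive term tied to what the last action achieved.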
Episode: Starts with an empty plan and ends after a fixed number of steps or when the budget is exhausted. Passenger flows are simulated at each step to compute times and costs.
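The episode logic can be sketched as a small reset/step class. This toy version only toggles stations and tracks spending; the passenger-flow simulation and bus actions are left out, and the class name, method signatures and cost values are hypothetical.

```python
class PlanEpisode:
    """Toy episode: empty plan at reset, ends on step limit or spent budget."""

    def __init__(self, nodes, budget, max_steps, station_cost):
        self.nodes = list(nodes)
        self.budget = budget
        self.max_steps = max_steps
        self.station_cost = station_cost
        self.reset()

    def reset(self):
        self.selected = set()
        self.steps = 0
        return frozenset(self.selected)

    def step(self, node):
        """Toggle one station; returns (state, spent, done)."""
        self.steps += 1
        self.selected ^= {node}  # select if absent, deselect if present
        spent = self.station_cost * len(self.selected)
        done = self.steps >= self.max_steps or spent >= self.budget
        return frozenset(self.selected), spent, done


def run_episode(env, policy):
    """Run one episode with a policy mapping state -> node to toggle."""
    state = env.reset()
    spent, done = 0.0, False
    while not done:
        state, spent, done = env.step(policy(state))
    return state, spent
```

A full environment would also return the reward from each step and include the time slot and bus allocation in the state.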