Data Sources
For our analysis, we employed a mix of demographic and operational data:
AC Transit bus route and bus stop GIS shapefiles from AC Transit's Data API & Resource Center
Demographics and population characteristics data from the U.S. Census and Social Explorer
Boarding/alighting data at each stop from AC Transit
On-time performance data from AC Transit
Real-time delay data from SF Bay 511
Smartphone and Internet access data from U.S. Census
Community anchor institution data from California Public Utilities Commission
Bus shelter guidelines from AC Transit
Methodology
Spatial data
We first filtered the spatial data from AC Transit's website to include only the stops associated with our three lines of interest: AC Transit's 19, 20, and 51A. We did this in QGIS, which also aided our initial understanding of the area.
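The same kind of filter can also be scripted. The following is a minimal Python sketch rather than the QGIS workflow described above. It assumes each stop record carries a comma-separated `routes` attribute; that field name, the `stop_id` field, and the sample records are hypothetical, and the real shapefile attributes may differ.

```python
# Lines of interest named in the text.
LINES_OF_INTEREST = {"19", "20", "51A"}


def serves_line_of_interest(routes_field, wanted=LINES_OF_INTEREST):
    """Return True if any exact route token in routes_field is in wanted."""
    # Split into exact tokens so that "51B" or "219" never match "51A" or "19".
    served = {token.strip() for token in routes_field.split(",") if token.strip()}
    return not served.isdisjoint(wanted)


def filter_stops(stops, wanted=LINES_OF_INTEREST):
    """Keep only the stops served by at least one line of interest."""
    return [stop for stop in stops if serves_line_of_interest(stop["routes"], wanted)]


# Hypothetical stop records, for illustration only.
sample_stops = [
    {"stop_id": "S1", "routes": "19, 51A"},
    {"stop_id": "S2", "routes": "51B"},
    {"stop_id": "S3", "routes": "20"},
    {"stop_id": "S4", "routes": "219, 72"},
]

selected = filter_stops(sample_stops)
print([stop["stop_id"] for stop in selected])  # ['S1', 'S3']
```

Matching on whole route tokens rather than substrings matters here, because a substring test would wrongly keep stops served only by similarly named lines.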
Census data
We determined which Census tracts these three lines run through and conducted a tract-level analysis of various population characteristics, such as race and ethnicity, percent of residents taking transit, and percent of residents with smartphone access. We presented the results primarily as geopandas maps and plotly bar charts.
Delay data
We used historical on-time performance data and current real-time delay data for these lines and stops to evaluate their frequency and reliability. We collected the real-time delay data through the Open 511 SIRI APIs. The data are provided in XML and include fields such as each bus's aimed (scheduled) and expected arrival and departure times. To determine which stops have the greatest average delays, we gathered real-time delay data at different times of day (peak and off-peak, AM and PM) throughout the week. We then created interactive plotly maps showing the stops on each line with average delays over five minutes, a threshold based on AC Transit's definition of "on-time performance."
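The delay calculation described above can be sketched as follows: a delay is the expected arrival time minus the aimed arrival time, averaged per stop and compared against the five-minute threshold. This is a minimal illustration and not our actual collection script. The element names follow the SIRI StopMonitoring schema, but the exact layout of the 511 response, the stop IDs, and the timestamps in the sample are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from datetime import datetime

SIRI_NS = {"siri": "http://www.siri.org.uk/siri"}
ON_TIME_THRESHOLD_MIN = 5  # average delays above this are flagged


def _visit(line, stop, aimed, expected):
    """Build one hypothetical MonitoredStopVisit element for the sample."""
    return (
        "<MonitoredStopVisit><MonitoredVehicleJourney>"
        f"<LineRef>{line}</LineRef><MonitoredCall>"
        f"<StopPointRef>{stop}</StopPointRef>"
        f"<AimedArrivalTime>{aimed}</AimedArrivalTime>"
        f"<ExpectedArrivalTime>{expected}</ExpectedArrivalTime>"
        "</MonitoredCall></MonitoredVehicleJourney></MonitoredStopVisit>"
    )


# Made-up sample standing in for responses gathered at different times of day.
SAMPLE_XML = (
    '<Siri xmlns="http://www.siri.org.uk/siri"><ServiceDelivery>'
    "<StopMonitoringDelivery>"
    + _visit("51A", "STOP_A", "2021-04-05T08:00:00Z", "2021-04-05T08:07:00Z")
    + _visit("51A", "STOP_A", "2021-04-05T17:00:00Z", "2021-04-05T17:05:00Z")
    + _visit("51A", "STOP_B", "2021-04-05T08:10:00Z", "2021-04-05T08:12:00Z")
    + _visit("51A", "STOP_B", "2021-04-05T17:10:00Z", "2021-04-05T17:14:00Z")
    + "</StopMonitoringDelivery></ServiceDelivery></Siri>"
)


def parse_time(text):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def extract_delays(xml_text):
    """Return (line, stop, delay_in_minutes) for every monitored call."""
    root = ET.fromstring(xml_text)
    records = []
    for journey in root.iterfind(".//siri:MonitoredVehicleJourney", SIRI_NS):
        line = journey.findtext("siri:LineRef", namespaces=SIRI_NS)
        call = journey.find("siri:MonitoredCall", SIRI_NS)
        if call is None:
            continue
        stop = call.findtext("siri:StopPointRef", namespaces=SIRI_NS)
        aimed = call.findtext("siri:AimedArrivalTime", namespaces=SIRI_NS)
        expected = call.findtext("siri:ExpectedArrivalTime", namespaces=SIRI_NS)
        if not (stop and aimed and expected):
            continue  # skip calls without both timestamps
        delay = (parse_time(expected) - parse_time(aimed)).total_seconds() / 60
        records.append((line, stop, delay))
    return records


def average_delay_by_stop(records):
    """Average the delay in minutes for each (line, stop) pair."""
    grouped = defaultdict(list)
    for line, stop, delay in records:
        grouped[(line, stop)].append(delay)
    return {key: sum(values) / len(values) for key, values in grouped.items()}


averages = average_delay_by_stop(extract_delays(SAMPLE_XML))
flagged = {key: avg for key, avg in averages.items() if avg > ON_TIME_THRESHOLD_MIN}
print(flagged)  # {('51A', 'STOP_A'): 6.0}
```

In this sample, the first stop averages six minutes of delay and is flagged, while the second averages three minutes and is not. Negative values would indicate buses running ahead of schedule.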
Additional data
To round out our analysis, each of us explored an additional dataset. We looked at levels of smartphone ownership and Internet access in the surrounding tracts, walkability around the highest-ridership bus stops via walkshed maps, and access to the nearest community anchor institutions from all stops. We obtained tract-level smartphone ownership and Internet access data from the U.S. Census and created a choropleth map for each. We also considered the distance between stops and the nearest child care, healthcare, and elderly care centers for each line.
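One way to compute the distance from a stop to its nearest institution of each type is sketched below. It assumes straight-line (great-circle) distance via the haversine formula, which is a simplification; walking or network distance would give different results. The coordinates, institution names, and category labels are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_by_category(stop, institutions):
    """Return {category: (name, distance_km)} for the closest institution of each type."""
    nearest = {}
    for inst in institutions:
        dist = haversine_km(stop["lat"], stop["lon"], inst["lat"], inst["lon"])
        best = nearest.get(inst["category"])
        if best is None or dist < best[1]:
            nearest[inst["category"]] = (inst["name"], dist)
    return nearest


# Hypothetical stop and institutions, for illustration only.
stop = {"stop_id": "S1", "lat": 37.80, "lon": -122.27}
institutions = [
    {"name": "Clinic X", "category": "healthcare", "lat": 37.81, "lon": -122.27},
    {"name": "Clinic Y", "category": "healthcare", "lat": 37.80, "lon": -122.25},
    {"name": "Child Care Z", "category": "child care", "lat": 37.79, "lon": -122.27},
]

for category, (name, dist) in nearest_by_category(stop, institutions).items():
    print(f"{category}: {name} at {dist:.2f} km")
```

Running the function for every stop on a line yields the per-line distance summaries described above.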
We did not develop a quantitative measure to determine which stops and lines may benefit from real-time information. We felt that ranking characteristics such as accessibility, frequency, and amenities is highly subjective, as their importance tends to vary across passengers. Instead, we analyzed the spatial, census, and delay data progressively, with each step of the evaluation building on the findings of the preceding one, to arrive at our final recommendations.
Header image from Flickr: Sullivan, P. AC Transit 1213 HT. CC BY-ND 2.0