Faculty Collaborator: Frank Donnelly
About:
Felicity has been working with Frank Donnelly, head of GeoData@SciLi, to create a value-added dataset of criminal offenses in Providence. The Providence Police Department publishes data on crime incidents from the last 180 days, including general descriptions of the locations where these offenses occurred. The project has three primary goals:
Geocode the data to visualize the crimes as points on a map.
Categorize offenses by type.
Develop a process to seamlessly add new data every 180 days.
Felicity has implemented this project using a combination of QGIS software and Python.
Project Overview
Goal: Create a process for dataset creation and updates.
Original Dataset:
Providence crime logs.
Publicly available from Providence PD.
Inconsistent location formats (blocks, intersections, landmarks).
Last 180 days of data.
New Dataset:
Geocoded version.
Each crime assigned precise (latitude, longitude) coordinates.
Historical data available by year.
Crimes classified by type.
Analysis of where certain types of crime have occurred over time.
Project Timeline
Establish Deliverables
Process the Data
Geocode the Data
Analyze Results
Categorize Data
Create Updating Process and Documentation
Block Locations
Challenge: How to find lat/long for block locations?
Imprecise
Not API-friendly
Don’t correspond to physical blocks
Solution: Use E911 sites
Gather sites with matching street names and address numbers.
Sort sites by address number.
Average coordinates for highest and lowest address numbers.
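The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the record fields (street, number, lat, lon), the sample coordinates, and the assumption that a "block" covers a range of 100 address numbers are all hypothetical, and the real RIGIS E911 schema may differ.

```python
# Hypothetical E911 site records; field names and coordinates are
# illustrative assumptions, not the actual RIGIS schema or real sites.
SITES = [
    {"street": "HOPE ST", "number": 102, "lat": 41.8270, "lon": -71.4010},
    {"street": "HOPE ST", "number": 150, "lat": 41.8280, "lon": -71.4012},
    {"street": "HOPE ST", "number": 198, "lat": 41.8290, "lon": -71.4014},
    {"street": "HOPE ST", "number": 204, "lat": 41.8292, "lon": -71.4015},
    {"street": "BROAD ST", "number": 120, "lat": 41.8100, "lon": -71.4200},
]


def geocode_block(street, block_start, sites, block_size=100):
    """Estimate a point for a block such as '100 block of HOPE ST'."""
    # 1. Gather sites on the street whose number falls inside the block.
    matches = [
        s for s in sites
        if s["street"] == street
        and block_start <= s["number"] < block_start + block_size
    ]
    if not matches:
        return None  # no E911 site on this block; a fallback is needed

    # 2. Sort the matching sites by address number.
    matches.sort(key=lambda s: s["number"])
    lowest, highest = matches[0], matches[-1]

    # 3. Average the coordinates of the lowest and highest addresses.
    lat = (lowest["lat"] + highest["lat"]) / 2
    lon = (lowest["lon"] + highest["lon"]) / 2
    return lat, lon
```

Calling geocode_block("HOPE ST", 100, SITES) averages the sites at numbers 102 and 198 and ignores 204, which belongs to the next block. Returning None for blocks with no matching site keeps those records visible for manual review rather than silently guessing a location.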
E911 Sites
Intersections & Landmarks
Which API to Use?
OpenStreetMap (Nominatim)
RIDOT
Census Geocoder
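One way to compare candidate services is to keep the geocoding call pluggable, so the same crime-log locations can be sent to each service in turn. The sketch below assumes intersections are written with "&" or "/" (a guess about the log format) and uses hypothetical helper names. Only the last function touches a real library: it wraps geopy's Nominatim geocoder in geopy's RateLimiter. How well any of the three services resolves intersections is not something this sketch establishes.

```python
def build_query(location, city="Providence", state="RI"):
    """Turn a crime-log location into a free-text geocoder query."""
    # Assumption: intersections use '&' or '/', e.g. "HOPE ST / THAYER ST".
    cleaned = " ".join(location.replace("/", " & ").split())
    return f"{cleaned}, {city}, {state}"


def geocode_location(location, geocode):
    """Return (lat, lon), or None if the service finds nothing.

    `geocode` is any callable that takes a query string and returns an
    object with .latitude and .longitude attributes, or None.
    """
    result = geocode(build_query(location))
    if result is None:
        return None
    return result.latitude, result.longitude


def make_nominatim_geocoder(user_agent):
    """Build a rate-limited Nominatim geocode function (requires geopy)."""
    from geopy.geocoders import Nominatim
    from geopy.extra.rate_limiter import RateLimiter

    geolocator = Nominatim(user_agent=user_agent)
    # The public Nominatim service asks for at most one request per second.
    return RateLimiter(geolocator.geocode, min_delay_seconds=1)
```

With geopy installed, geocode_location("ROGER WILLIAMS PARK", make_nominatim_geocoder("my-app-name")) would query the public Nominatim service; the user agent string here is a placeholder. Swapping in a different callable lets the same locations be tested against another service without changing the rest of the code.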
Resources
Python Libraries:
pandas
GeoPandas
geopy
Tools:
Jupyter
QGIS
Data Sources:
Providence GIS Hub: Providence GIS base data package.
RIGIS: E911 sites, API for intersections & landmarks.
Classification
Classification Scheme:
Types of violent crime.
Types of property crime.
Application: How will we apply this scheme to our data?
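One possible way to apply such a scheme is a keyword lookup over the offense descriptions. The sketch below is an assumption about the approach, not the project's documented method. The two categories follow the violent/property split named above, but the keywords and sample descriptions are illustrative only.

```python
# Hypothetical scheme: the keywords below are examples, not the
# project's actual classification rules.
SCHEME = {
    "Violent": ["ASSAULT", "ROBBERY", "HOMICIDE"],
    "Property": ["LARCENY", "BURGLARY", "VANDALISM", "MOTOR VEHICLE THEFT"],
}


def classify(description, scheme=SCHEME):
    """Return the first category whose keyword appears in the description."""
    text = description.upper()
    for category, keywords in scheme.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "Unclassified"


def unclassified(descriptions, scheme=SCHEME):
    """List the distinct descriptions that the scheme does not cover yet."""
    return sorted(
        {d for d in descriptions if classify(d, scheme) == "Unclassified"}
    )
```

Running unclassified() on each new batch of records gives a short list of offense descriptions that need a decision, which is one way to keep the scheme current as new descriptions appear in the logs.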
Technical Challenges
Geocoding:
Locations that don’t exist.
Inconsistent location formats.
Varying coordinate systems (NAD83 vs. WGS84).
Incorrect results.
Categorization:
New offense descriptions.
Personal Challenges
Inexperience: Working with projects of this size.
Unfamiliarity: With geocoding and GIS.
Open-Ended Project Nature: Adapting to the undefined scope and requirements.