We later chose 5 new categories of crime to use as classification targets. See the motivation for choosing new categories and for using those specific categories.
The 5 new categories are: Landlord/Tenant Disputes, Verbal Disputes, Prostitution, Aggravated Assault, and Homicide. We filtered our new dataset to include only those crimes:
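A minimal, dependency-free sketch of this filtering step follows. The field name `crime_type`, the exact label strings, and the sample rows are assumptions made for illustration; the real dataset may spell them differently.

```python
# Assumed label strings for the 5 new categories (hypothetical spelling).
NEW_CATEGORIES = {
    "Landlord/Tenant Disputes",
    "Verbal Disputes",
    "Prostitution",
    "Aggravated Assault",
    "Homicide",
}

def filter_crimes(rows, key="crime_type"):
    """Keep only the rows whose crime type is one of the new categories."""
    return [row for row in rows if row[key] in NEW_CATEGORIES]

# Hypothetical sample records, not real data.
sample = [
    {"crime_type": "Homicide", "lat": 42.36, "lon": -71.06},
    {"crime_type": "Larceny", "lat": 42.35, "lon": -71.07},
    {"crime_type": "Verbal Disputes", "lat": 42.34, "lon": -71.08},
]
filtered = filter_crimes(sample)
```

If the data lives in a pandas DataFrame, the same idea is usually written as `df[df["crime_type"].isin(NEW_CATEGORIES)]`.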
We used our haversine function to calculate distances between our predictors and the crimes in our new dataset. See the Distance Calculations page for details.
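For reference, a haversine distance can be sketched as below. This is a generic version of the formula, not necessarily the function used in this project; the argument order, the kilometre units, and the Earth radius constant are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumed constant

def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))
```

Two antipodal points on the equator, for example, come out at half the Earth's circumference, which is a convenient sanity check.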
After calculating all of the relevant distances, we added each predictor to our dataset:
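One way this step could look is sketched here, assuming each predictor is stored as the distance from a crime to the nearest site of some kind. The column name `dist_station`, the `lat`/`lon` keys, and the sample coordinates are hypothetical, and `math.dist` is only a stand-in for a haversine-style distance function.

```python
import math

def add_nearest_distance(rows, sites, column, dist_fn):
    """Add `column` to each row: the distance from the row to its closest site."""
    for row in rows:
        point = (row["lat"], row["lon"])
        row[column] = min(dist_fn(point, site) for site in sites)
    return rows

# Hypothetical data; math.dist (plain Euclidean distance) stands in for haversine.
crimes = [{"lat": 0.0, "lon": 0.0}, {"lat": 3.0, "lon": 4.0}]
stations = [(0.0, 0.0), (6.0, 8.0)]
add_nearest_distance(crimes, stations, "dist_station", math.dist)
```

Passing the distance function as an argument keeps the helper independent of how distances are actually measured.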
We also split our data into training and test sets and encoded our crime types as integers from 0 to 4.
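The encoding and split can be sketched with the standard library alone. The category-to-integer order, the 80/20 ratio, and the fixed seed below are assumptions; the source does not state which values were used.

```python
import random

# Assumed order; the source does not say which integer maps to which category.
CATEGORIES = [
    "Landlord/Tenant Disputes",
    "Verbal Disputes",
    "Prostitution",
    "Aggravated Assault",
    "Homicide",
]
LABELS = {name: index for index, name in enumerate(CATEGORIES)}

def encode(crime_types):
    """Map crime type names to integers 0 to 4."""
    return [LABELS[name] for name in crime_types]

def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and return (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

train, test = split_rows(range(10))
```

Libraries such as scikit-learn provide equivalent helpers (`train_test_split`, `LabelEncoder`), which are the more common choice in practice.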
Data Exploration for Original Categories