Our primary dataset contains roughly 400,000 crime incidents reported in Boston between 2015 and 2019.
We also added several supplementary datasets to use as predictor variables.
We initially built classification models for 5 crime types, then later revisited those categories and chose 5 new ones. We trained and tested every model on both the "Original" 5 categories and the "New" 5 categories.
Before establishing these categories, we filtered our dataset to keep only the columns we used:
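As a rough illustration of this kind of column filtering, here is a minimal pandas sketch. The column names in `COLUMNS_TO_KEEP`, the helper `keep_columns`, and the two-row sample frame are all hypothetical stand-ins chosen for the example; they are not the project's actual column list or data.

```python
import pandas as pd

# Hypothetical column names, assumed for illustration only.
# The columns actually kept in the project may differ.
COLUMNS_TO_KEEP = ["OFFENSE_CODE_GROUP", "DISTRICT", "OCCURRED_ON_DATE", "Lat", "Long"]


def keep_columns(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Return a copy of df restricted to the given columns, in the given order."""
    missing = [c for c in columns if c not in df.columns]
    if missing:
        # Fail loudly rather than silently dropping an expected predictor.
        raise KeyError(f"Columns not found in dataset: {missing}")
    return df.loc[:, columns].copy()


# Tiny made-up frame standing in for the crime reports data.
sample = pd.DataFrame({
    "INCIDENT_NUMBER": ["A1", "A2"],
    "OFFENSE_CODE_GROUP": ["Larceny", "Vandalism"],
    "DISTRICT": ["D4", "B2"],
    "OCCURRED_ON_DATE": ["2016-03-01 14:00:00", "2018-07-15 02:30:00"],
    "Lat": [42.34, 42.32],
    "Long": [-71.08, -71.09],
    "STREET": ["MAIN ST", "ELM ST"],
})

filtered = keep_columns(sample, COLUMNS_TO_KEEP)
print(filtered.columns.tolist())
```

Checking for missing columns up front is a design choice in this sketch: it surfaces a renamed or absent column immediately instead of letting it show up later as a modeling error.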