

The cluster analysis part of the project gave us considerable trouble because we did not have numeric values for the data points in the two datasets. In the end, we settled on latitude and longitude for both datasets: when we ran K-Means on the latitude and longitude of every data point and plotted the result, the plots resembled the city of Chicago and a map of Montgomery County.
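A minimal sketch of this step, assuming each record has been reduced to a (latitude, longitude) pair. The `kmeans` helper, the value of `k`, and the coordinates are illustrative assumptions, not the code or data used in the project; a library implementation would normally be used instead of this hand-written version of Lloyd's algorithm.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Cluster (latitude, longitude) pairs with Lloyd's algorithm.

    Returns (centroids, labels), where labels[i] is the index of the
    centroid nearest to points[i].
    """
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)

    def assign(current):
        # Assignment step: each point joins its nearest centroid.
        return [
            min(range(k), key=lambda c: math.dist(p, current[c]))
            for p in points
        ]

    for _ in range(iterations):
        labels = assign(centroids)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                lat = sum(m[0] for m in members) / len(members)
                lon = sum(m[1] for m in members) / len(members)
                updated.append((lat, lon))
            else:
                # Keep an empty cluster's centroid where it was.
                updated.append(centroids[c])
        if updated == centroids:
            break
        centroids = updated

    # Final assignment so labels match the returned centroids.
    return centroids, assign(centroids)


# Made-up coordinates forming two well-separated groups.
points = [
    (39.00, -77.00), (39.01, -77.02), (39.03, -77.00),
    (39.30, -77.30), (39.32, -77.29), (39.29, -77.33),
]
centroids, labels = kmeans(points, k=2)
```

Plotting longitude on the x-axis and latitude on the y-axis, colored by `labels`, is what makes the scatter resemble a map of the area.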

In particular, we observed that the clusters differ in density.

For the Montgomery crime dataset, the green cluster is denser than the red cluster, from which we can deduce that many crimes happened close to the centroid of the green cluster.

We can read the teal cluster from the plot in the same way: crimes in the teal cluster were spread out.
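The visual comparison of densities could also be checked numerically. The sketch below is one possible measure, not something computed in the project: the mean distance from each cluster's points to its centroid, where a smaller value suggests a denser cluster. The function name, the color labels, and the coordinates are made up for illustration, and distances are taken in raw degrees, which is only a rough approximation over a small area.

```python
import math
from collections import defaultdict


def mean_distance_to_centroid(points, labels):
    """Return {label: mean distance of that cluster's points to its centroid}."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)

    spread = {}
    for label, members in groups.items():
        # The centroid is the mean latitude and mean longitude of the members.
        lat = sum(m[0] for m in members) / len(members)
        lon = sum(m[1] for m in members) / len(members)
        # Distances are in degrees, not kilometers (rough approximation).
        total = sum(math.dist(m, (lat, lon)) for m in members)
        spread[label] = total / len(members)
    return spread


# Hypothetical points: a tight "green" group and a scattered "teal" group.
sample_points = [
    (39.00, -77.00), (39.01, -77.00), (39.00, -77.01),
    (39.20, -77.20), (39.40, -77.10), (39.30, -77.40),
]
sample_labels = ["green", "green", "green", "teal", "teal", "teal"]
spread = mean_distance_to_centroid(sample_points, sample_labels)
```

With these sample points, `spread["green"]` comes out smaller than `spread["teal"]`, matching the intuition that a tight cluster has its points close to the centroid.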