Postal code data for Toronto neighborhoods was scraped from Wikipedia and joined with latitude and longitude coordinates. The coordinates were then used to retrieve venue data for each neighborhood from the Foursquare API. Finally, the neighborhoods were clustered by the most common venue types in each area.
Data for over 100 neighborhoods in Toronto was scraped using BeautifulSoup.
The Foursquare API was used to collect all venues within a specified radius of each neighborhood's location.
K-Means clustering (an unsupervised machine learning method) was run on the final data to assign a cluster ID to each neighborhood based on its most common venues, so that similar localities could be identified.
Five distinct clusters were identified. For example, Cluster 1 contains neighborhoods where most venues are restaurants or cafés.
The clustering results were plotted on a map using Folium to make them easier to interpret.
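The clustering step can be illustrated with a minimal, dependency-free sketch. It is not the project's actual code: the neighborhood names and venue rows are made-up toy data, the helper names (`category_frequencies`, `top_categories`, `kmeans`) are hypothetical, and a plain-Python k-means with a simple farthest-point initialization stands in for a library implementation such as scikit-learn's `KMeans`. The idea is to turn each neighborhood's venues into a vector of category shares and then group neighborhoods with similar vectors.

```python
from collections import Counter, defaultdict

# Toy (neighborhood, venue category) rows standing in for Foursquare results.
# These values are invented for illustration only.
venues = [
    ("Neighborhood A", "Café"), ("Neighborhood A", "Restaurant"),
    ("Neighborhood A", "Restaurant"), ("Neighborhood A", "Café"),
    ("Neighborhood B", "Restaurant"), ("Neighborhood B", "Café"),
    ("Neighborhood B", "Restaurant"),
    ("Neighborhood C", "Park"), ("Neighborhood C", "Park"),
    ("Neighborhood C", "Playground"),
    ("Neighborhood D", "Park"), ("Neighborhood D", "Playground"),
    ("Neighborhood D", "Park"), ("Neighborhood D", "Café"),
]


def category_frequencies(rows):
    """Map each neighborhood to {category: share of its venues}."""
    counts = defaultdict(Counter)
    for hood, category in rows:
        counts[hood][category] += 1
    return {
        hood: {cat: n / sum(cnt.values()) for cat, n in cnt.items()}
        for hood, cnt in counts.items()
    }


def top_categories(shares, n):
    """Return the n most common categories, ties broken alphabetically."""
    ranked = sorted(shares.items(), key=lambda item: (-item[1], item[0]))
    return [cat for cat, _ in ranked[:n]]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(vectors, k, iterations=20):
    """Plain k-means; returns one cluster label per input vector."""
    # Farthest-point initialization: start from the first vector, then keep
    # adding the vector that is farthest from the centroids chosen so far.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors,
            key=lambda v: min(squared_distance(v, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(v, centroids[j]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


freqs = category_frequencies(venues)
names = sorted(freqs)
categories = sorted({cat for shares in freqs.values() for cat in shares})
vectors = [[freqs[name].get(cat, 0.0) for cat in categories] for name in names]
labels = kmeans(vectors, k=2)

for name, label in zip(names, labels):
    print(name, "cluster", label, "top venues:", top_categories(freqs[name], 2))
```

On the real data the same pattern would apply with the number of clusters set to 5, and the resulting labels could then be used to color the neighborhood markers on the Folium map.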
Processed Data
Clustered Neighborhoods: Map