Projects

Segmenting and Clustering Neighborhoods in Toronto

Postcode data for Toronto neighborhoods was scraped from Wikipedia and joined with latitude and longitude data. The coordinates were then used to retrieve venue data for each neighborhood through the Foursquare API. Finally, the neighborhoods were clustered based on the most common venues in each area.
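The join step described above can be sketched roughly as follows. This is a minimal illustration, assuming the scraped postcodes and the coordinates are both available as pandas DataFrames keyed by postcode; the column names and sample rows are hypothetical and are not taken from the project's data.

```python
import pandas as pd

# Illustrative rows only; the real table was scraped from Wikipedia.
postcodes = pd.DataFrame({
    "PostalCode": ["M5A", "M4E", "M6G"],
    "Neighborhood": ["Regent Park", "The Beaches", "Christie"],
})

# Illustrative, approximate coordinates keyed by the same postcode column.
coords = pd.DataFrame({
    "PostalCode": ["M4E", "M5A", "M6G"],
    "Latitude": [43.676, 43.654, 43.669],
    "Longitude": [-79.293, -79.361, -79.422],
})

# A left join keeps every scraped neighborhood, even one without coordinates.
neighborhoods = postcodes.merge(coords, on="PostalCode", how="left")
print(neighborhoods)
```

A left join is used here so that a postcode with no matching coordinates shows up as a row with missing values rather than silently disappearing.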

GitHub

Highlights

Data for over 100 neighborhoods in Toronto was scraped using BeautifulSoup.
The Foursquare API was used to collect the venues within a specified radius of each neighborhood's location.
K-Means clustering (unsupervised ML) was performed on the final data to assign a cluster ID to each neighborhood based on its most common venues, so that similar localities fall into the same cluster.
5 distinct clusters were identified. For example, Cluster 1 contains neighborhoods where most venues are restaurants and cafés.
The clustered neighborhoods were plotted on a map using Folium to interpret the results.
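The clustering step in the highlights above might look something like the sketch below. It assumes the venue data is a table of (neighborhood, venue category) rows, one-hot encodes the categories, averages them per neighborhood, and runs scikit-learn's KMeans on the result. The sample rows, the neighborhood labels, and the choice of 2 clusters are illustrative assumptions; the project itself reports 5 clusters on the real data.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Illustrative venue rows (neighborhood, venue category); not the project's data.
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "B", "B", "C", "C", "C", "D", "D"],
    "Category": ["Café", "Restaurant", "Café", "Restaurant", "Café",
                 "Park", "Park", "Trail", "Park", "Trail"],
})

# One-hot encode the categories, then average per neighborhood to get
# the share of each venue category in that neighborhood.
onehot = pd.get_dummies(venues["Category"], dtype=float)
onehot["Neighborhood"] = venues["Neighborhood"]
profile = onehot.groupby("Neighborhood").mean()

# Cluster neighborhoods on their venue-category profiles.
# n_clusters=2 suits this toy sample only.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(profile)
labels = pd.Series(kmeans.labels_, index=profile.index, name="Cluster")

# Most frequent venue category per neighborhood, shown next to its cluster ID.
top_venue = profile.idxmax(axis=1).rename("TopVenue")
print(pd.concat([labels, top_venue], axis=1))
```

Averaging the one-hot rows makes neighborhoods comparable regardless of how many venues each one returned, which matters because K-Means works on distances between these profiles.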

Gallery

Processed Data

Clustered Neighborhoods: Map
