The Idea
I wanted to see how well the differences between bread recipes could be visualized using data on their ingredients. To do this, I built an interactive search interface for exploring recipes.
The Data
I used the RecipeDB data (https://cosylab.iiitd.edu.in/recipedb). I cleaned the data and limited it to bread recipes. I used manifold learning to reduce my 150 ingredient dimensions to a two-dimensional representation, which is how I created the plot below. I used K-means clustering to organize the recipes into distinct groups.
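As a rough illustration of the pipeline described above, here is a minimal sketch. It is not the original analysis code: the ingredient matrix is synthetic, t-SNE is an assumed stand-in (the write-up only says "manifold learning"), and the number of clusters and the choice to cluster on the 2-D embedding are illustrative guesses.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

# Synthetic stand-in for the recipe-by-ingredient matrix (hypothetical data):
# 60 recipes x 150 ingredients, 1.0 if the recipe uses the ingredient.
rng = np.random.default_rng(0)
X = (rng.random((60, 150)) < 0.1).astype(float)

# Reduce the 150 ingredient dimensions to 2 for plotting.
# t-SNE is an assumption; any scikit-learn manifold learner could go here.
embedding = TSNE(
    n_components=2, perplexity=15, init="random", random_state=0
).fit_transform(X)

# Group the recipes with K-means. n_clusters=5 is illustrative, and
# clustering on the embedding (rather than on X) is also an assumption.
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(embedding)

print(embedding.shape, labels[:10])
```

Each row of `embedding` gives one recipe's x/y position in the plot, and `labels` gives the group used to color it.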
The Build
I did my data analysis in Python with the scikit-learn and pandas libraries. I used HTML, JavaScript, and the D3 library to create the searchable table and graphic.
Takeaways
Creating the many interactions between the search bar, table, and graphic was challenging but very rewarding. It was fun to play with interactive SVG components.
I think my choice of dimensionality reduction worked reasonably well; however, I would have liked to explore what it could have looked like with more data munging.
At the time of the analysis, I did not have experience with other clustering techniques. If I were to revisit this, I would try other algorithms such as Gaussian Mixture Models.
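For reference, a Gaussian Mixture Model in scikit-learn might look like the sketch below. The 2-D points are synthetic stand-ins for embedded recipes, and the component count is an assumption chosen to match the toy data. Unlike K-means, the model also returns a soft membership probability for each point.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical 2-D points standing in for embedded recipes: two blobs.
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(0.0, 0.5, size=(40, 2)),
    rng.normal(5.0, 0.5, size=(40, 2)),
])

# Fit a two-component mixture; n_components=2 matches the toy data only.
gmm = GaussianMixture(n_components=2, random_state=0).fit(points)

hard_labels = gmm.predict(points)       # most likely component per point
soft_labels = gmm.predict_proba(points) # membership probabilities per point

print(hard_labels[:5], soft_labels[:2].round(3))
```

The soft memberships could be useful for recipes that sit between two groups, which a hard K-means assignment cannot express.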