Analyzing the Language of Food on Social Media


This project investigates the data-driven connections between the language of food and community characteristics such as diabetes rate, overweight rate, and the geographic locale of authors. We collected a corpus of over three million food-related posts from Twitter. Using these tweets, we built a system for

  • Prediction of population characteristics, using tweet text:
    • overweight rate
    • diabetes rate
    • political leaning
    • geographical location
  • Visualization of patterns in the language of food, with
    • geographic heatmaps
    • temporal histograms
    • semantic wordclouds

This site provides supplementary tables and results for the prediction tasks, and access to the visualization interfaces.

Further discussion of the results and system implementation is available in the full paper.