A continuous indicator of food environment nutritional quality

Iris C. Liu [1], Kayla de la Haye [1], Andrés Abeliuk [2], Abigail L. Horn [1]

University of Southern California [1], University of Chile [2]

KDD Workshop on Data-driven Humanitarian Mapping (SIGKDD), 2021

[Paper] [Github]

Food environments can profoundly impact diet and related diseases. Effective, robust measures of food environment nutritional quality are required by researchers and policymakers investigating their effects on individual dietary behavior and designing targeted public health interventions. The most commonly used indicators of food environment nutritional quality are limited to measuring the binary presence or absence of entire categories of food outlet type, such as ‘fast-food’ outlets, which can range from burger joints to salad chains. This work introduces a summarizing indicator of restaurant nutritional quality that exists along a continuum, and which can be applied at scale to make distinctions between diverse restaurants within and across categories of food outlets. Verified nutrient data for a set of over 500 chain restaurants is used as ground-truth data to validate the approach. We illustrate the use of the validated indicator to characterize food environments at the scale of an entire jurisdiction, demonstrating how making distinctions between different shades of nutritiousness can help to uncover hidden patterns of disparities in access to high nutritional quality food.

Figure 1 illustrates the pipeline for obtaining a restaurant-level nutritional quality (RNQ) indicator.


Step (1), establishing a target restaurant menu dataset to score, may be done by accessing menu data from software companies, such as Yelp and Foursquare, that maintain extensive databases of metadata on points of interest, including food outlets and their menus.

Step (2) involves estimating the nutritional content of each menu item, i.e., the levels of macro- and micronutrients contributed by all of its constituent food ingredients, given the menus and menu item-level descriptions obtained from Step (1).

Step (3) involves developing an aggregate indicator of the nutritional quality of a restaurant based on its menu offerings, the RNQ.

Restaurant-level Indicator of Nutritional Quality (RNQ)

A restaurant’s nutritional quality (RNQ) score is computed as the median of the RRR-m scores across all menu items within that restaurant.
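As a minimal sketch of this aggregation, assume every menu item already has an RRR-m score. The helper name `rnq_scores`, the restaurant names, and the numbers below are hypothetical and chosen only for illustration; they are not the authors' code or data.

```python
from collections import defaultdict
from statistics import median


def rnq_scores(item_scores):
    """Return {restaurant: median RRR-m score over its menu items}.

    item_scores: iterable of (restaurant, rrr_m_score) pairs, one per menu item.
    """
    by_restaurant = defaultdict(list)
    for restaurant, score in item_scores:
        by_restaurant[restaurant].append(score)
    # The RNQ of a restaurant is the median of its item-level scores.
    return {name: median(scores) for name, scores in by_restaurant.items()}


# Illustrative values only; not taken from the paper's data.
example_items = [
    ("Salad Place", 1.8),
    ("Salad Place", 2.4),
    ("Salad Place", 0.9),
    ("Burger Place", 0.5),
    ("Burger Place", 1.0),
]
example_rnq = rnq_scores(example_items)
```

Using the median rather than the mean means that a single unusually high- or low-scoring item does not shift a restaurant's score much.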

Validation Analysis using Ground-truth Nutrient Data

We collect a ground-truth dataset, Nutritionix, for validation. It consists of 38,275 menu items across 1,436 restaurant chain brands in the United States, organized at the level of menu items sold by each chain.

To create a dataset fit for RNQ scoring, we implement the data post-processing steps applied to matches from the USDA database described above.

Our validation analysis compares, for each restaurant brand in the Nutritionix database, the RNQ score obtained using estimated nutrient values with the RNQ score obtained using the ground-truth nutrient values for each menu item.


Figure 2 shows the correlation between the menu item-level RRR-m scores calculated using ground-truth nutrient values and those calculated using the estimated nutrient values.
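A comparison of this kind can be sketched as follows. The text here does not say which correlation coefficient is used, so the sketch assumes Pearson's r; the function name `pearson_r` and the score values are hypothetical and for illustration only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up RRR-m scores for the same five menu items, paired by position.
ground_truth = [0.4, 0.9, 1.5, 2.1, 2.8]
estimated = [0.6, 0.8, 1.3, 2.4, 2.5]
r = pearson_r(ground_truth, estimated)
```

A value of r close to 1 would indicate that the estimated nutrient values reproduce the item-level scores obtained from the ground-truth values.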

Table 1 shows effective and ineffective item matches between the Nutritionix dataset and the USDA database. Even when an item name has a match, the nutrient profiles in the two databases can differ.

Table 2 shows restaurant brands and their menu items for the three highest-scoring, three lowest-scoring, and three median restaurant brands, ranked by their estimated RNQ scores.
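A selection like this can be sketched as below. How the three median brands are chosen is not specified here, so the sketch assumes they are the three brands centered on the middle of the ranking; `pick_high_low_median` and the brand names and scores are hypothetical.

```python
def pick_high_low_median(rnq_by_brand, k=3):
    """Return the k highest, k lowest, and k middle brands by RNQ score."""
    ranked = sorted(rnq_by_brand, key=rnq_by_brand.get)  # ascending by score
    if len(ranked) < k:
        raise ValueError("need at least k brands")
    # Assumption: the "median" group is the k brands centered on the middle rank.
    mid_start = (len(ranked) - k) // 2
    return {
        "highest": ranked[-k:][::-1],  # best first
        "lowest": ranked[:k],          # worst first
        "median": ranked[mid_start:mid_start + k],
    }


# Made-up RNQ scores for nine placeholder brands.
example_rnq = {
    "A": 0.1, "B": 0.2, "C": 0.3,
    "D": 0.4, "E": 0.5, "F": 0.6,
    "G": 0.7, "H": 0.8, "I": 0.9,
}
groups = pick_high_low_median(example_rnq)
```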