Analyses
Descriptive
Rent and Yelp (ANOVA Tests)
We found that the median rent per square foot differs significantly across DC, VA, and MD at the 5% significance level (reported p-value of 0.00, that is, a value that rounds to zero). The average number of price level 3 (‘$$$’, or ‘pricey’) food-related places also differs significantly across areas with different rent levels at the 5% significance level (p-value = 0.046).
Figure 4. This shows the distribution of food-related places for each of the 4 rent levels. The average number of price level 3 food-related places is significantly different. (https://plot.ly/~ll950/45/price-level-3-food-businesses-by-rent-bin/)
We did not, however, find that the mean price level of food-related businesses varies across areas with different rent levels (p-value = 0.327). The mean ratings do not differ significantly (p-value = 0.224). The average number of food-related businesses with ratings of 3 or higher does not vary across the different rent levels (p-value = 0.501). The average numbers of price level 1 (p-value = 0.879), price level 2 (p-value = 0.697), and price level 4 (p-value = 0.490) food-related places are not statistically different across rent levels either. The average number (p-value = 0.661) and variety (p-value = 0.800) of food-related businesses are not significantly different across the different rent levels.
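To make the test behind these p-values concrete, here is a minimal sketch of the one-way ANOVA F statistic, computed by hand with the Python standard library. The counts are hypothetical placeholders, not our data, and the function name `one_way_anova_f` is our own. The sketch stops at the F statistic and its degrees of freedom; obtaining a p-value additionally requires the F distribution, which a statistics package such as SciPy (`scipy.stats.f_oneway`) provides.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over the groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of each observation around its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical counts of '$$$' places per zip code, one list per rent level.
counts_by_rent_level = [
    [1, 2, 3],  # low
    [2, 3, 4],  # low-medium
    [1, 2, 3],  # medium-high
    [5, 6, 7],  # high
]
f_stat, df_b, df_w = one_way_anova_f(counts_by_rent_level)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F statistic means the group means differ by more than the spread within each group would suggest, which is the pattern reported above for price level 3 places only.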
Findings & Discussion
Descriptive
Rent and Yelp
Figure 9. This shows the median rent per square foot by zip code and also by city and state. (https://public.tableau.com/views/Proj3_3/Proj3?:embed=y&:display_count=yes)
Our data support the conclusion that the median rent per square foot differs significantly among the three jurisdictions of interest: DC, VA, and MD. However, we did not find that the prices or ratings of food-related businesses differed between more and less expensive areas. One would expect places with higher rent to have higher quality or more upscale businesses, but our data do not support this. The total number and variety of food-related businesses did not differ significantly either, so places with higher rent do not have significantly more, or a larger variety of, food-related businesses.
We did find that the number of food places labelled ‘$$$’ (or ‘pricey’) on Yelp varies across places in different rent brackets. Zip codes with high rent appear to have many more food-related places at this price level, followed by zip codes with low-medium rent. Zip codes with low rent and zip codes with medium-high rent appear to have a similar number of pricey food-related places. We did not, however, find statistically significant results for the other price levels, so the numbers of food-related places labelled ‘$’, ‘$$’, and ‘$$$$’ do not differ significantly between high and low rent places.
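The four rent brackets named above (low, low-medium, medium-high, high) could be formed in several ways. The sketch below assumes quartile cut points on median rent per square foot, which is an assumption for illustration and not necessarily how the brackets in this analysis were defined. The zip code keys and rent values are hypothetical placeholders.

```python
from bisect import bisect_right
from statistics import quantiles

LABELS = ["low", "low-medium", "medium-high", "high"]


def rent_bins(rent_by_zip):
    """Assign each zip code to one of four rent brackets using quartile cut points."""
    # quantiles(..., n=4) returns the three cut points separating the quartiles.
    cuts = quantiles(rent_by_zip.values(), n=4)
    # bisect_right counts how many cut points are <= the rent, giving a bracket index 0..3.
    return {z: LABELS[bisect_right(cuts, rent)] for z, rent in rent_by_zip.items()}


# Hypothetical median rent per square foot by (placeholder) zip code.
example_rents = {
    "zip_a": 1.0, "zip_b": 2.0, "zip_c": 3.0, "zip_d": 4.0,
    "zip_e": 5.0, "zip_f": 6.0, "zip_g": 7.0, "zip_h": 8.0,
}
for zip_code, bracket in rent_bins(example_rents).items():
    print(zip_code, bracket)
```

Quartile bins give each bracket roughly the same number of zip codes, which keeps the group sizes balanced when the brackets are later compared.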
Figure 10. This shows median rent per square foot by zip code, average price levels of food-related businesses by zip code and the mean ratings of food-related places by zip code. There does not appear to be any correlation between these variables. (https://public.tableau.com/views/Proj3_3/Proj3?:embed=y&:display_count=yes)
Figure 11. This shows median rent per square foot by zip code, total number of food-related businesses by zip code and variety of food-related places by zip code. There does not appear to be any correlation between these variables. (https://public.tableau.com/views/Proj3_3/Proj3?:embed=y&:display_count=yes)
One of our other findings is that businesses in areas with higher rent per square foot are reviewed more frequently. The scatterplot below shows this.
Figure 12. This shows the correlation between average review count and the median rent per square foot.
The scatterplot above shows that there is a fairly strong positive relationship between average rent price per square foot and average review count for businesses within a given zip code. This positive relationship suggests that Yelp is used more in high rent areas than it is in low rent areas.
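The strength of a relationship like this one is commonly summarized with the Pearson correlation coefficient. The following is a minimal sketch using only the standard library; the per-zip-code values are hypothetical placeholders rather than our data, and `pearson_r` is a helper name of our own.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator) and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-zip-code values: rent per square foot and average review count.
rent_per_sqft = [1.5, 2.0, 2.4, 3.1, 3.8]
avg_review_count = [20, 35, 30, 60, 75]
print(f"r = {pearson_r(rent_per_sqft, avg_review_count):.2f}")
```

Values of r near 1 indicate a strong positive linear relationship, values near 0 indicate little or no linear relationship, and values near -1 indicate a strong negative one.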
One possible explanation is that residents of high rent areas have more options to choose from among coffee shops, bakeries, restaurants, bars, and similar businesses, although our ANOVA results did not show a significant difference in the total number of food-related businesses across rent levels. With more options, residents of these areas might use Yelp more frequently to choose a specific business to visit. Yelp users who frequently view reviews of other businesses probably also leave many reviews of their own.
The fact that Yelp is used more frequently in high rent areas, which already have many high quality coffee shops, restaurants, and bars, may have a reinforcing effect that helps these businesses by making them easy to distinguish from lower quality competitors. For example, if a popular coffee shop has over 100 very positive reviews on Yelp, this will likely increase the number of customers it gets, expanding its business and creating further pressure to keep earning positive reviews. If a less popular coffee shop in the same vicinity has only a small number of mostly negative reviews, it would likely go out of business fairly quickly, because a quick comparison of the two Yelp pages would make clear which is the superior coffee shop.
Thus, in high rent areas, Yelp may have the effect of widening the gap between high quality and lower quality businesses, forcing the lower quality businesses to shut down fairly quickly because they cannot afford the high rent in these areas.
There is no significant relationship between average rent per square foot in a given zip code and average restaurant rating (shown below). This could be because people who review a restaurant on Yelp probably judge it against other similar restaurants in the same zip code. This makes it difficult to use Yelp’s data to compare restaurants in different parts of the city.
Figure 13. Scatterplot of average rent per square foot and average food-related business rating.
Figure 14. Scatterplot of average rent per square foot and average food-related business price level.
There is also no significant relationship between average rent per square foot and average business price level. Again, this could be because there is a range of differently priced restaurants within each zip code, and reviewers on Yelp compare each business to the surrounding businesses when rating the price level.
Figure 15. On the left, business types in high rent areas, and on the right, business types in low rent areas.
These two word clouds show that there is little difference between the business types in high rent areas and the business types in low rent areas. This is addressed in further detail below.
Table 3. Food-related business types found in higher rent areas.
Table 4. Food-related business types found in lower rent areas.
The first table shows the frequency of food-related business types in high rent areas, and the second table shows the frequency of types in low rent areas. There are some notable differences, but the two tables generally look very similar. This is likely because many businesses in low rent areas had missing values (and were consequently eliminated) or were simply never added to Yelp. In other words, the fact that convenience stores, grocery stores, and coffee shops are the three most frequent business types may only mean that these are the businesses that are best represented on Yelp.
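Frequency tables like Tables 3 and 4 can be produced by tallying the category labels attached to each business. The sketch below shows one way to do this with the standard library; the records, the `categories` field name, and the `type_frequencies` helper are hypothetical and do not reflect the exact structure of our Yelp data.

```python
from collections import Counter


def type_frequencies(businesses, top_n=3):
    """Return the top_n most common business types as (type, count) pairs."""
    counts = Counter()
    for business in businesses:
        # A business may carry several category labels; each one is counted once.
        counts.update(business["categories"])
    return counts.most_common(top_n)


# Hypothetical records for one rent group.
example_businesses = [
    {"name": "A", "categories": ["coffee shop", "bakery"]},
    {"name": "B", "categories": ["coffee shop"]},
    {"name": "C", "categories": ["grocery store", "convenience store"]},
    {"name": "D", "categories": ["grocery store", "coffee shop"]},
]
for business_type, count in type_frequencies(example_businesses):
    print(business_type, count)
```

Running the same tally separately on the high rent and low rent groups gives two tables that can be compared side by side.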
Limitations
There are several major limitations with the Yelp data. The primary problem is that it may not be representative of the businesses in the DC Metro area: certain businesses are much more likely to appear and be reviewed on Yelp than others. We only have information about the food-related businesses listed on Yelp, which do not include every food-related business in the DC Metro area.
We also have median rent per square foot data for only a small portion of the DC Metro area zip codes, so our conclusions apply only to that portion and may not be fully representative of the DC Metro area. We might have found more significant differences if we had expanded the area of interest, since we would then have been able to compare the DC area to less expensive cities or towns further away.
It is also important to note that many other variables bear on housing prices and rent that we did not examine, such as education, employment, demographics, and crime.