Background:
Urška Sršen and Sando Mur founded Bellabeat, a high-tech company that manufactures health-focused smart products. Sršen used her background as an artist to develop beautifully designed technology that informs and inspires women around the world. Collecting data on activity, sleep, stress, and reproductive health has allowed Bellabeat to empower women with knowledge about their own health and habits. Since it was founded in 2013, Bellabeat has grown rapidly and quickly positioned itself as a tech-driven wellness company for women.
Business Task:
Analyze FitBit fitness tracker data to find out how customers use the FitBit app, and give recommendations for the Bellabeat marketing strategy.
1. Ask Phase:
First, we need to identify the stakeholders and take their requirements into account.
Urška Sršen: Bellabeat’s cofounder and Chief Creative Officer.
Sando Mur: Mathematician and Bellabeat’s cofounder; key member of the Bellabeat executive team.
Bellabeat marketing analytics team: A team of data analysts responsible for collecting, analyzing, and reporting data that helps guide Bellabeat’s marketing strategy.
Bellabeat has four products:
Bellabeat app: The Bellabeat app provides users with health data related to their activity, sleep, stress, menstrual cycle, and mindfulness habits.
Leaf: Bellabeat’s classic wellness tracker can be worn as a bracelet, necklace, or clip.
Time: This wellness watch combines the timeless look of a classic timepiece with smart technology to track user activity, sleep, and stress.
Spring: This is a water bottle that tracks daily water intake using smart technology to ensure that you are appropriately hydrated throughout the day.
2. Prepare Phase:
The data source is the FitBit Fitness Tracker Data. The dataset was downloaded from Kaggle, where it was uploaded by Möbius.
Data Limitations:
a. The data was collected between 3/12/2016 and 5/12/2016, so it covers only two months and is already about seven years old. It is not timely, and usage patterns may have changed since then.
b. The dataset includes only 30 FitBit users, so the sample is too small.
Data ROCCC:
A good data source should have five characteristics: Reliable, Original, Comprehensive, Current, and Cited.
Reliable: Low (small sample size)
Original: Low (third party provider)
Comprehensive: High (matches data from Bellabeat products)
Current: Low (7 years ago)
Cited: Low (third party data)
3. Process Phase:
I used RStudio for data cleaning, data transformation, data analysis, and visualization.
The same steps could also be done in SQL, with Tableau for the visualization.
RStudio needs the following packages installed: 'tidyverse', 'dplyr', 'ggplot2', 'lubridate', 'RColorBrewer', 'plotrix'
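A minimal sketch of checking which of the packages listed above are still missing before installing them. The helper name missing_packages is my own and only for illustration; the install.packages() call is left commented out so that running the sketch does not change the R library.

```r
# Packages named in this case study.
required <- c("tidyverse", "dplyr", "ggplot2", "lubridate", "RColorBrewer", "plotrix")

# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(required)
print(to_install)

# Uncomment to install whatever is missing:
# install.packages(to_install)
```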
Importing the data into R and renaming it
Checking the data
Cleaning the data
Running str(Daily_Activity) and str(Sleep_Day) shows that Daily_Activity has 15 columns and Sleep_Day has 5 columns; both contain two types of data, numeric and character.
Checking for missing values and unique IDs in the two data sets
Data Summary
The data has no missing values.
There are 33 unique IDs we can use.
The Daily_Activity data set has 940 records.
The Sleep_Day data set has 413 records.
The ActivityDate column is stored as character and needs to be converted to a date type.
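The checks above can be sketched in base R as follows. This is only an illustration on a tiny made-up table: the Id values and the TotalSteps column are assumptions, and the month/day/year date format is assumed from the dates quoted in the Prepare phase.

```r
# Made-up stand-in for Daily_Activity (the real table has 15 columns and 940 records).
daily_activity <- data.frame(
  Id = c(1, 1, 2),
  ActivityDate = c("4/12/2016", "4/13/2016", "4/12/2016"),
  TotalSteps = c(9000, 7500, 12000),
  stringsAsFactors = FALSE
)

# Count missing values across the whole table.
n_missing <- sum(is.na(daily_activity))

# Count distinct users.
n_users <- length(unique(daily_activity$Id))

# Convert the character date (assumed month/day/year) to a Date.
daily_activity$ActivityDate <- as.Date(daily_activity$ActivityDate, format = "%m/%d/%Y")

# Confirm the new column types.
str(daily_activity)
```

With the lubridate package, mdy(ActivityDate) should do the same conversion.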
4. Analyze Phase:
According to the summary:
Users took a minimum of 4676 steps per day and a maximum of 36019. The average is 8582 steps, a little lower than the 10,000 steps per day that the CDC recommends.
Users burned a minimum of 1783 calories and a maximum of 4900 calories per day. The average is 2329 calories.
Sedentary time ranged from a minimum of 660 minutes to a maximum of 1440 minutes per day. The average is 738 minutes of inactive time, around 12.3 hours, which is not good for people's health.
People sleep at least once per day. The minimum sleep time is 58 minutes and the maximum is 796 minutes.
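The unit conversion and the gap to the step target can be checked with a few lines of R. The numbers are the averages reported in the summary; the variable names are only for illustration.

```r
# Averages taken from the summary above.
mean_sedentary_min <- 738
mean_steps <- 8582
step_target <- 10000

# Convert average sedentary minutes to hours.
mean_sedentary_hours <- mean_sedentary_min / 60
print(round(mean_sedentary_hours, 1))

# How many steps the average user is short of the daily target.
step_gap <- step_target - mean_steps
print(step_gap)
```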
5. Share Phase:
Active minutes Pie:
Explore the relationship between Steps and Calories
There is a positive correlation between calories and total steps. Increasing daily steps could help people burn more calories.
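A minimal sketch of how such a relationship can be measured, using made-up numbers rather than the FitBit data. The vectors steps and calories are hypothetical; with the real data they would be the matching columns of Daily_Activity.

```r
# Made-up daily values for illustration only.
steps <- c(2000, 5000, 8000, 11000, 14000)
calories <- c(1800, 2100, 2300, 2700, 3000)

# Pearson correlation: a value close to 1 means a strong positive relationship.
r <- cor(steps, calories)

# Simple linear model: the slope estimates extra calories burned per extra step.
fit <- lm(calories ~ steps)
slope <- coef(fit)[["steps"]]

print(r)
print(slope)
```

A correlation like this describes an association in the sample; it does not by itself prove that more steps cause the extra calories.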
Explore the relationship between Weekday and Active times
Usage is higher on Monday, Tuesday, and Sunday than on other days. In my opinion, users may be busy in the middle of the week and have other activities on Friday and Saturday, so they exercise more on only three days of the week.
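A comparison by weekday can be sketched in base R like this. The table is made up, and the column name VeryActiveMinutes is an assumption for illustration. The weekday is taken from the numeric day index so that the result does not depend on the system language.

```r
# Made-up activity records for illustration only.
activity <- data.frame(
  ActivityDate = as.Date(c("2016-04-11", "2016-04-12", "2016-04-12", "2016-04-13")),
  VeryActiveMinutes = c(30, 20, 40, 10)
)

# as.POSIXlt()$wday is 0 for Sunday through 6 for Saturday.
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")
activity$Weekday <- factor(
  day_names[as.POSIXlt(activity$ActivityDate)$wday + 1],
  levels = day_names
)

# Average active minutes per weekday.
by_day <- aggregate(VeryActiveMinutes ~ Weekday, data = activity, FUN = mean)
print(by_day)
```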
Time in Bed and Sleep time
Relationship between Sedentary and Steps
Percentage of overweight vs. healthy-weight people
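library(readr)    # read_csv()
library(plotrix)  # pie3D(), used for the chart further down
weightLogInfo_merged <- read_csv("Case study2/Fitabase Data 4.12.16-5.12.16/weightLogInfo_merged.csv")
View(weightLogInfo_merged)
## The CDC's healthy-weight BMI range is 18.5 to 24.9, so flag records with a BMI of 24.9 or more as overweight
weightLogInfo_merged$overweight <- weightLogInfo_merged$BMI >= 24.9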
summary(weightLogInfo_merged)
Id Date WeightKg WeightPounds
Min. :1.504e+09 Length:67 Min. : 52.60 Min. :116.0
1st Qu.:6.962e+09 Class :character 1st Qu.: 61.40 1st Qu.:135.4
Median :6.962e+09 Mode :character Median : 62.50 Median :137.8
Mean :7.009e+09 Mean : 72.04 Mean :158.8
3rd Qu.:8.878e+09 3rd Qu.: 85.05 3rd Qu.:187.5
Max. :8.878e+09 Max. :133.50 Max. :294.3
Fat BMI IsManualReport LogId
Min. :22.00 Min. :21.45 Mode :logical Min. :1.460e+12
1st Qu.:22.75 1st Qu.:23.96 FALSE:26 1st Qu.:1.461e+12
Median :23.50 Median :24.39 TRUE :41 Median :1.462e+12
Mean :23.50 Mean :25.19 Mean :1.462e+12
3rd Qu.:24.25 3rd Qu.:25.56 3rd Qu.:1.462e+12
Max. :25.00 Max. :47.54 Max. :1.463e+12
NA's :65
overweight
Mode :logical
FALSE:34
TRUE :33
table(weightLogInfo_merged$overweight)
FALSE TRUE
34 33
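## Label each slice with its share of all weight-log records, rounded to 2 decimals
weight_counts <- table(weightLogInfo_merged$overweight)
over <- paste0(c("Healthy weight", "Overweight"), " ", round(100 * weight_counts / sum(weight_counts), 2), "%")
pie3D(weight_counts, labels = over, explode = 0.3, main = "Overweight vs. Healthy Weight")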
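The 3D pie chart shows that 49.25% of the weight-log records are overweight and 50.75% are at a healthy weight, almost the same. Note that these are shares of the 67 log records, not of distinct users, because the same Id appears in many records. This is good news, and we should encourage people to exercise more to stay healthy.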
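Because one user can log their weight many times, a share counted per record can differ from a share counted per user. A minimal sketch of the difference on made-up data, assuming each user is classified by their average BMI (my own choice for illustration, not something taken from the dataset):

```r
# Made-up weight log: user 1 logs three times, users 2 and 3 once each.
weight_log <- data.frame(
  Id = c(1, 1, 1, 2, 3),
  BMI = c(26.0, 26.2, 25.8, 22.0, 23.5)
)

# Share of records flagged as overweight (same 24.9 threshold as above).
weight_log$overweight <- weight_log$BMI >= 24.9
record_share <- mean(weight_log$overweight) * 100

# Share of users flagged as overweight, using one average BMI per user.
per_user <- aggregate(BMI ~ Id, data = weight_log, FUN = mean)
per_user$overweight <- per_user$BMI >= 24.9
user_share <- mean(per_user$overweight) * 100

print(record_share)
print(round(user_share, 2))
```

In this toy example the frequent logger pushes the per-record share well above the per-user share.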
6. Act Phase:
Based on my analysis, I have the following recommendations:
1. Enhance the app's features to encourage people to use it more frequently, for example with bonus points or alerts.
2. Encourage people to reach 10,000 steps per day, especially in the middle of the week. This is a very useful way to keep them healthy and should increase the number of healthy-weight people. In reality, overweight people probably outnumber healthy-weight people.
3. Encourage more people to use the app to track their activity. More data could help the company identify trends and make the right decisions.