Part One: Work with me through the setup of a few different data sets.
Using R as a calculator
Variables
Candy example
Get info from the class:
planned credits this semester
Get info from right here:
56 56 57 58 59 60 60 60 61 62 63 64 64 64 64
Import data from Excel to .csv to R
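The Part One steps can be sketched roughly as follows. This is only an illustration: the name values and the file name class_data.csv are placeholders I made up, and the numbers are the fifteen values listed above.

```r
# The fifteen values listed above, typed in by hand with c()
values <- c(56, 56, 57, 58, 59, 60, 60, 60, 61, 62, 63, 64, 64, 64, 64)

# R as a calculator: arithmetic works on plain numbers and on variables
total <- sum(values)                # add up all of the values
average <- total / length(values)   # divide by how many values there are

# Importing a .csv file saved from Excel.
# "class_data.csv" is a placeholder name, so this line is left commented out.
# class_data <- read.csv("class_data.csv")
```

Typing a variable's name by itself (for example, average) prints its value.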
Part Two: Some questions to work through.
As you work through these, save your answers and the code you used to get each answer in an e-mail or a file that you can then e-mail to me. This will be your work for each lab.
Question One:
The data for gas prices over the past fifteen years in America can be found here: http://www.eia.gov/petroleum/gasdiesel/ .
In the section "U.S. Regular Prices (dollars per gallon)" there is a link that says "Full History". This should allow you to download an Excel file with all of the gas prices.
Save this file as gas_data.
Once you have that file downloaded, opened, and saved correctly, focus on the tab "Data 1".
Create a list in R called 'gas' with all of the gas prices from August 20, 1990 to the current date.
a) What is the length of this list?
b) What are the highest and lowest gas prices in this list?
c) At what index did the maximum gas price occur? (You don't need the date, just where in the list that gas price is.)
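The functions that parts a) through c) call for can be tried out first on a small made-up list. The values below are invented for illustration and are NOT the real gas data; example_prices and the other names are hypothetical.

```r
# Made-up prices for practice only (not the real gas data)
example_prices <- c(1.19, 1.25, 1.31, 1.28, 1.40, 1.22)

n_prices <- length(example_prices)       # how many values are in the list
highest <- max(example_prices)           # largest value
lowest <- min(example_prices)            # smallest value
max_index <- which.max(example_prices)   # position of the (first) largest value
```

The same functions should work the same way on your own gas list.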
d) In R, type:
plot(gas)
Explain to me what this graph shows. Also try:
plot(gas, pch=16, col="lightblue")
Explain to me the differences between these two graphs. For additional fun, you can change the color. Type:
colors()
to get a comprehensive list of the different choices. Use these color names to make a plot that you think looks appropriate. For super-duper additional fun, try:
plot(gas, pch=16, col="lightblue", main="Title Goes here", ylab="What Label Belongs Here?", xlab="I don't know What's Happening But I Like It")
That should satiate you for a while.
e)
We now want to compare the gas prices on the east coast to those on the west coast for the years 2000-2010.
Open a new Excel file. Call column A "date", column B "east_coast" and column C "west_coast".
Be specific about this, as we will be calling these by the EXACT names later on. Remember that R is case sensitive.
Using the information from the gas_data Excel file, fill in the appropriate values.
Save this Excel file, and then save it as a .csv file as well. Call them "east_west.xlsx" and "east_west.csv" respectively.
Import "east_west.csv" into R and save it as ewgas.
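The import step can be sketched like this. So that the example runs by itself, it first writes a tiny made-up .csv file (three invented rows, not real prices) to a temporary location; the names example_file and example_ewgas are hypothetical. For your real file, I am assuming it sits in R's working directory.

```r
# Write a tiny made-up stand-in for east_west.csv so this sketch runs on its own
example_file <- tempfile(fileext = ".csv")
writeLines(c("date,east_coast,west_coast",
             "2000-01-03,1.27,1.31",
             "2000-01-10,1.26,1.30",
             "2000-01-17,1.28,1.33"),
           example_file)

# read.csv() treats the first row as the column names by default
example_ewgas <- read.csv(example_file)

names(example_ewgas)        # column names, exactly as typed in the file
example_ewgas$east_coast    # one column, pulled out with $

# For the real file (assuming it is in the working directory):
# ewgas <- read.csv("east_west.csv")
```

If R cannot find your file, getwd() shows which folder R is currently looking in.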
f) What are the maximum and minimum prices for each set?
g) What is the 50th entry in each of the lists?
h) Let's make a plot. This is going to call for two different commands (type/copy the first one, hit enter, and then type/copy the second one).
plot(ewgas$east_coast, ylim=c(0,5))
points(ewgas$west_coast)
Work with these two commands (start including pieces we've seen before, like in part d above). Make a graph that you think is appropriate.
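As one possible starting point, here is a sketch that combines plot() and points() with the options from part d. The values are made up (not real prices), the names example_east and example_west are hypothetical, and the colors and symbols are just one choice among many.

```r
# Made-up values for illustration only
example_east <- c(1.27, 1.26, 1.28, 1.35, 1.41)
example_west <- c(1.31, 1.30, 1.33, 1.45, 1.52)

# Pick y-axis limits that fit both sets, so no points get cut off
y_limits <- range(c(example_east, example_west))

# First command draws the plot; second command adds points to it
plot(example_east, pch = 16, col = "steelblue", ylim = y_limits,
     main = "East vs. West (made-up values)", xlab = "Index", ylab = "Price")
points(example_west, pch = 17, col = "tomato")

# A legend tells the reader which symbol belongs to which set
legend("topleft", legend = c("east_coast", "west_coast"),
       pch = c(16, 17), col = c("steelblue", "tomato"))
```

Giving each set its own color and symbol is what makes the two sets distinguishable once they share a plot.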
i) Compare and contrast the data from the east coast and the west coast. Work with stats terms you know, and explain some differences between the two.
Question Two:
The data below is a partial list. An experiment was conducted in which people were given a loan of $50 and then asked to pay the money back over time. This list shows how many days it took each person to pay back all of the money (reminders were sent to the subjects, though there were no threats of kneecap breakings or any other such thing). Only some of the values are shown below. We have also excluded all of the people who either gave the money back immediately or never paid back the money:
43, 45, 53, 56, 56, 57, 58, 66, 67, 73, 74, 79, 80, 80, 81, 81, 81, 82, 83, 83, 84, 84, 88, 89, 91, 91, 92, 92, 97, 99, 99, 100, 101, 102, 102, 102, 103, 104, 107, 108, 109
a) Enter this data into R. Label the list money_return .
b) Create a new list containing all the values below 90. Do not remove these from the original data set when you do this. Call the new list under3mo .
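Picking out values that meet a condition can be sketched on a small made-up list. The names example_days and below_90 are hypothetical, and the numbers are invented for illustration.

```r
# Made-up values for illustration only
example_days <- c(43, 88, 91, 102, 57, 90)

# example_days < 90 gives TRUE/FALSE for each value;
# putting that inside [ ] keeps only the TRUE ones
below_90 <- example_days[example_days < 90]

# The original list is unchanged; only the copy is shorter
length(example_days)
length(below_90)
```

Note that 90 itself is not kept, since 90 is not below 90.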
c) Create a plot of the original data. Explain what R is trying to do and why this plot does not make any sense (or give us new information).
d) Now try:
stem(money_return)
What did R create? Explain possible uses for this.
e) Type:
?stem()
This should have opened the help page in R. Try to figure out how to split the stems on the plot. You may also use the internet to help you with this one.
f) Finally, try:
hist(money_return)
Explain what R appears to have made.
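If you want to look under the hood of hist(), here is a small sketch on made-up values (example_days and h are hypothetical names). Besides drawing the picture, hist() hands back the group boundaries and the counts it used.

```r
# Made-up values for illustration only
example_days <- c(43, 45, 57, 66, 73, 80, 81, 84, 91, 99, 102, 108)

# Saving the result of hist() keeps the numbers behind the picture
h <- hist(example_days, main = "Made-up example", xlab = "Days")

h$breaks   # the boundaries R chose for the groups
h$counts   # how many values fell into each group
```

The counts always add up to the number of values in the list, which is a handy check.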
Extension: (to be done on your own time, turned in before Monday)
I would like you to compare the gas prices during Obama's time in office to those during the Bush administration. Use methods similar to what we did as part of Question One, in a way you think best shows a specific similarity or difference. Some options might include looking at highest and lowest values, a comparison of the first week in office, differences in summer vs. winter pricing, or something else that might suit your fancy.
When you e-mail this to me, put "President Comparison" in the subject line. Be sure to include:
Any .csv file that you used in your analysis
All relevant code (i.e., don't give me the parts of the code that didn't work, only the ones you used to make your graph)
A paragraph or so of analysis explaining what you are attempting to show in the graph.
We will look at these on Monday.