In a recent project you saw that Excel can make trendlines. These are not only useful for utility bills, but they can also be used to analyze and project your desktop publishing company sales, expenses, customer base, paper and toner use, and so on. You will also learn about linear regression in math class in the near future (if not already). It is one method used to produce Excel's trendlines. Rather than have Excel create the trendline as it does without showing any of its work, you will make the trendline by combining a number of calculations made on your data and then deriving the slope and y-intercept of the trendline. You can use these to make any interpolation or extrapolation of the line that you might need.
This project is based on the excellent tutorial page at http://phoenix.phys.clemson.edu/tutorials/excel/regression.html. It has been modified to accommodate all the data in an entire column so that values can be added without modification of formulas. The x and y values you use should be taken from the tutorial page so that you can double check your answer. Look to the page for the formulas that are involved in the calculation.
Create a new spreadsheet file called LinearRegression. No macros are required and it will occupy just a single worksheet which needs no special name.
Label A1 x and B1 yData for the x and y data. Label C1 yTrend for the calculated trend values.
In cells E1 through N1 add labels for the calculations that will appear immediately below the labels. They are n, Sum(x), Sum(y), Sum(xy), Sum(x2), Sum(y2), Sum(x)2, Sum(y)2, m, and finally b.
Add the data points into the x and y columns from the web page referenced above. The points are (1.0, 2.6), (2.3, 2.8), (3.1, 3.1), (4.8, 4.7), (5.6, 5.1), (6.3, 5.3).
Now begin adding the calculations that will in the end produce the linear regression line. Under the label n add the formul =COUNT(A:A). This will count all the values in column A.
Under Sum(x) use the formula =SUM(A:A), which adds all the x values. Likewise, under Sum(y), use the formula =SUM(B:B) to add all the y values.
Similarly, the product of each x and y must be summed. It is possible to add a column in which every cell contains a product and sum those. However, Excel has a handy formula called SumProduct which will perform this calculation without the additional column. Under Sum(xy) enter the formula =SUMPRODUCT(A:A,B:B).
Similarly, a column can be added to calculate x2 for each x. These can then be summed. Instead, use the SumSq formula. Under Sum(x2) enter =SUMSQ(A:A). Of course the next formula is =SUMSQ(B:B).
If you have already calculated Sum(x) at F2, then Sum(x)2 is just the square of that or =F2*F2. Sum(y)2 is =G2*G2.
Now comes the slope calculation of your line. Look at the formula in the reference and see how it matches the Excel version. In M2 enter =(E2*H2-F2*G2)/(E2*I2-K2).
The itercept at N2 becomes =(G2-M2*F2)/E2.
Check the values you get against the referenced page and correct any discrepencies you find.
Give the value for m, the slope, the actual name m and the value for b, the y-intercept, the name b so that you can use the values in equations. Right click on the cells and add the names. The scope can be of the entire worksheet since there are no conflicting names.
In the yTrend column write the equation for each y value based on the x value in column A. You should begin with =m*A2+b.
Copy the formula down through the end of the y values. Your answers should be similar to the y values of the data.
The next step is to graph your results and compare them to what Excel calculates. Select the 3 x 7 block of labels, data, and trend and insert a scatter plot based on them. If you make the right selection, the data and labels will appear nicely in the chart without special configuration.
Now it's time to add Excel's trendline and see if it matches yours. Right click on the curved data line and choose Add Trendline... Make it linear, but don't close the dialog yet.
Change the Line Color to something that you can see on top of the straight, homemade trendline. Pick Solid line and then specify a bright color. When the settings are applied, you should see a brightly colored line on top of your trendline.
Right click the trendline and select Format Trendline... Just to be sure of the match check "Display Equation on chart." The slope and intercept that are displayed should match your spreadsheet values.
When finished, submit LinearRegression via email or use a USB drive to transfer it. The due date will be announced in class and published on the projects page as well as the calendar.
Points are granted in the categories below.
Spreadsheet file called LinearRegression
Label x, yData, yTrend
Labels n, Sum(x), Sum(y), Sum(xy) through b
Add the data points
=COUNT(A:A)
=SUM(A:A) and =SUM(B:B)
=SUMPRODUCT(A:A,B:B)
=SUMSQ(A:A), =SUMSQ(B:B)
=F2*F2, =G2*G2
=(E2*H2-F2*G2)/(E2*I2-K2)
=(G2-M2*F2)/E2
Check the values you get
Assign the names m and b
=m*A2+b
Copy the formula
Graph your results
Add Trendline...
Change the Line Color
Display Equation on chart