The chart tool has its own linear regression routine
separate from both LINEST in Excel and from the Analysis ToolPak (ATP)
regression tool that calls LINEST. LINEST has been improved for Microsoft
Office Excel 2003 (and the ATP regression tool has also been improved with the
improvements to LINEST). Earlier versions of both LINEST and the ATP regression
tool computed incorrect R-squared values in the case where the intercept was
set to zero. For example, the R-squared values are not correct if one of the
following cases is true:
- If the third argument to LINEST is set to FALSE rather than
set to TRUE (or omitted).
-or- - If the Constant is Zero check box is
selected for the ATP regression tool.
R-squared values in other cases do not exhibit this
problem.
The chart tool has not been improved for Office Excel 2003.
Therefore, if you want a correct R-squared value for the trend line on a chart
for any version of Excel, you must use the following workaround.
Example Usage
To illustrate the R-squared value problem, follow these steps:
- Create a blank worksheet in Excel.
- Copy the following table:
- Select cell A1 in your blank Excel worksheet, and then
click Paste on the Edit menu so that the
entries in the table below fill cells A1:B4 in your worksheet.
- Select the cell range A1:B4.
- On the Insert menu, click
Chart.
- On the Standard Types tab, click
XY (Scatter), and then click Next.
- On the Data Range tab, click
Columns, and then click Next.
- Click Finish to complete the
chart.
- With the chart selected, click Add
Trendline on the Chart menu.
- On the Type tab, click
Linear.
- On the Options tab, follow these steps:
- Click to select the Set intercept =
check box, and then verify that the Set intercept = box is set
to 0.
- Click to select the Display R-squared value on
chart check box.
- Click OK to close the Add
Trendline dialog box.
The R-squared value of minus 20.429 appears, but it is not the
correct square of a number. This problem arises from an incorrect formula for
the total sum of squares.
Use one of the following methods to find the
correct R-squared value in Office Excel 2003:
- Call LINEST on the same data (with the third argument equal
to FALSE and the fourth argument equal to TRUE). Use the R-squared value that
is in the third row, first column of the output table.
-or- - Use the ATP regression tool (with the Constant is
Zero check box selected). Use the R-squared value that is clearly
shown in it�s output.
Unfortunately, for earlier versions of Excel, the methods that
are mentioned to find the correct R-squared value will not work without you
having to revise the results. If you have an earlier version of Excel, either
call LINEST or use the ATP regression tool as suggested in the previous
paragraph, and then observe the following results:
- If you called LINEST, the Residual sum of squares appears
in the fifth row, second column of LINEST output. If you used the ATP
regression tool, the Residual sum of squares is clearly labeled in the ANOVA
table portion of the output.
- Compute total sum of squares as SUMSQ(known_y's). In the
example, that's SUMSQ(B2:B4).
- Correct R-squared = (Total sum of squares � Residual sum of
squares) / Total Sum of Squares.
Conclusions
An R-squared value that is inserted on a chart with a linear trend
line is always incorrect in the case where the
Set Intercept =
0 box is selected. (Cases with non-linear trend lines or values other
than zero for setting the intercept, or both, have not been
investigated.)
A work around has been provided to find the correct
R-squared value. If the
Set Intercept = 0 box is left
unchecked (by far the more typical case in practice), there is no problem with
the chart tool's value of R-squared.