VARP returns the population variance for a population whose
values are contained in an Excel worksheet and are specified by the argument or
arguments to VARP.
Syntax
VARP(value1, value2, value3, ...)
where value1, value2, ..., up to 30 value arguments.
The
most common usage of VARP includes only 1 value argument that specifies a range
of cells that contains the population, for example, VARP(A1:B100).
Example of usage
Create a blank Excel worksheet, and copy the following table. Select cell
A1 in your blank Excel worksheet, and then paste the entries so that the table fills cells
A1:D17 in your worksheet.
Collapse this tableExpand this table
Data | | | |
| | | |
6 | | sample mean: | =AVERAGE(A3:A8) |
4 | | sample size: | =COUNT(A3:A8) |
2 | | VARP: | =VARP(A3:A8) |
1 | | pre-Excel 2003 VARP
v1: | =(D4*SUMSQ(A3:A8) - (SUM(A3:A8)^2))/(D4^2) |
3 | | pre-Excel 2003 VARP
v2: | =(SUMSQ(A3:A8) - (SUM(A3:A8)^2)/D4)/D4 |
5 | | VARP in Excel 2003 and in later versions of Excel: | =DEVSQ(A3:A8)/D4 |
| | | |
Modified Data | | Power of 10 to add to
data: | 1 |
| | | |
=A3 + 10^$D$10 | | sample
mean: | =AVERAGE(A12:A17) |
=A4 + 10^$D$10 | | sample
size: | =COUNT(A12:A17) |
=A5 +
10^$D$10 | | VARP: | =VARP(A12:A17) |
=A6 + 10^$D$10 | | pre-Excel 2003 VARP
v1: | =(D13*SUMSQ(A12:A17) - (SUM(A12:A17)^2))/(D13^2) |
=A7 + 10^$D$10 | | pre-Excel 2003 VARP
v2: | =(SUMSQ(A12:A17) - (SUM(A12:A17)^2)/D13)/D13 |
=A8 + 10^$D$10 | | VARP in Excel 2003 and in later versions of Excel: | =DEVSQ(A12:A17)/D13 |
After you paste this table into your new Excel worksheet, click
the
Paste Options button, and then click
Match
Destination Formatting. With the pasted range still selected, use one of the following procedures, as appropriate for the version of Excel that you are running:
- In Microsoft Office Excel 2007, click the Home tab, click Format in the Cells group, and then click AutoFit Column Width.
- In Excel 2003, point to
Column on the Format menu, and then click
AutoFit Selection.
Cell D5 contains the value of VARP
for your current version of Excel. If you use Excel 2003 or a later version of Excel, this value should
agree with the value in cell D8. Cell D8 shows the value of VARP in Excel 2003 and in later versions of Excel
(regardless of the version of Excel that you are using). Cells D6 and D7 show
two approximations to the value of VARP that was computed by earlier versions
of Excel. The formula in cell D6 is the formula that is shown in the Help file
for Microsoft Excel 2002 and earlier versions of Excel.
In this example, all versions return the
value 2.9166667. There are no computational problems here that cause
differences in VARP among the versions of Excel.
You can use rows 10 to
17 to experiment with modified data by adding a constant (in this case a power
of 10 is added) to each data point. It is well-known that adding a constant to
each data point will have no affect on the value of population
variance.
If you change the value in cell D10 (for example: to 1, 2,
3, 4, 5, 6, or 7), you can see the revised data values in cells A12:A17, and
you can also see that all versions of VARP are well-behaved in these 7
cases.
However, if you continue the experiment to try the values 8, 9, and
10 in cell D10, you will notice that the value for Excel 2003 and for later versions of Excel remains at 2.9166667
(as it should), while the values for Excel 2002 and for earlier versions of Excel change
(even though they should remain constant at 2.9166667). This would not occur if
computations could be done with infinite precision.
Earlier versions
of Excel exhibit wrong answers in these cases because the effects of round-off
errors are more profound with the computational formula that is used by these versions of Excel.
Still, the cases that are used in this experiment could be viewed as rather extreme.
Results in earlier versions of Excel
In extreme cases where there are many significant digits in the
data but a small variance, the old computational formula
leads to inaccurate results. Earlier versions of Excel used a single pass
through the data to compute the sum of squares of the data values, the sum of
the data values, and the count of the data values (sample size). These
quantities were then combined into the computational formula that is specified in the
Help file in earlier versions of Excel.
Results in Excel 2003 and in later versions of Excel
The procedure that is used in Excel 2003 and in later versions of Excel uses a two-pass process
through the data. First, the sum and count of the data values are computed and
from these the sample mean (average) can be computed. Then, on the second pass,
the squared difference between each data point and the sample mean is found and
these squared differences are summed.
In the numeric examples, even with a high
power of 10 in cell D10, these squared differences are not affected and the
results of the second pass are independent of the entry in cell D10. Therefore,
the results in Excel 2003 and in later versions of Excel are more stable numerically.
Conclusions
Replacing a one-pass approach by a two-pass approach guarantees
better numeric performance of VARP in Excel 2003 and in later versions of Excel. Results in Excel 2003 and in later versions of Excel will never
be less accurate than results in earlier versions of Excel.
In most practical
examples, however, you are not likely to see a difference between the results in later versions of Excel and the results in earlier versions of Excel. This is
because typical data is unlikely to exhibit the kind of unusual behavior that
this experiment illustrates.
Numeric instability is most likely to appear in
earlier versions of Excel when data contains a high number of significant
digits combined with relatively little variation between data
values.
If you use an earlier version of Excel, and if you want to determine whether
switching to Excel 2003 or to a later version of Excel will make a difference, compare the results of
with the results of
DEVSQ(values)/COUNT(values)
If the results are consistent with the accuracy that you want, then
switching to Excel 2003 or a later version of Excel will not affect the value of VAR.
If you use
Excel 2003 or a later version of Excel, and if you want to determine whether the computed value of VARP(values) has changed
from the value that would have been found when you used an earlier version of
Excel, compare
with
(SUMSQ(values) - (SUM(values)^2)/COUNT(values))/COUNT(values)
This comparison gives at least a good approximation of the value
of VARP as found by earlier versions of Excel.
The procedure of
finding the sum of squared deviations about a mean (average) by
finding the mean, by computing each squared deviation, and by summing the squared deviations is more accurate than the alternative procedure. The alternative procedure is frequently
named the "calculator formula" because it was suitable for the use of a
calculator on a small number of data points. The calculator formula uses the following procedures:
- Find the sum of squares of all observations, the number of
observations, and the sum of all observations.
- Compute the sum of squares of all observations minus ((sum
of all observations)^2)/number of observations).
There are many other functions that have been improved for Excel 2003 and for later versions of Excel. These functions were improved by replacing the one-pass procedure with the two-pass procedure that
finds the mean on the first pass and then computes the sum of squared deviations
about the mean on the second pass.
A short list of such functions includes
VAR, VARP, STDEV, STDEVP, DVAR, DVARP, DSTDEV, DSTDEVP, FORECAST, SLOPE,
INTERCEPT, PEARSON, RSQ, and STEYX. Similar improvements were made in each of
the three Analysis of Variance tools in the Analysis
ToolPak.