
Variance measures the spread of data points around their mean, quantifying how much values in a distribution deviate from the expected value.

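For a random variable $X$ with mean $\mu = E[X]$, variance is defined as:

$$\operatorname{Var}(X) = E\left[(X - \mu)^2\right]$$

This represents the expected squared deviation from the mean. For discrete random variables with $n$ equally likely outcomes $x_1, \dots, x_n$:

$$\operatorname{Var}(X) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2$$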
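As a quick numerical check of the definition, the sketch below averages the squared deviations from the mean for a fair six-sided die and compares the result with NumPy's `np.var`. The die is an illustrative choice, not an example from the text, and the helper name `variance_from_definition` is hypothetical.

```python
import numpy as np

def variance_from_definition(values):
    """Population variance of equally likely outcomes:
    the mean squared deviation from the mean."""
    x = np.asarray(values, dtype=float)
    mu = x.mean()                    # expected value for equally likely outcomes
    return np.mean((x - mu) ** 2)    # average of the squared deviations

die = [1, 2, 3, 4, 5, 6]             # faces of a fair die (assumed example)
die_var = variance_from_definition(die)

print(die_var)                            # 35/12, about 2.9167
print(np.isclose(die_var, np.var(die)))   # agrees with np.var (ddof=0)
```

Writing the formula out by hand like this is only for illustration; in practice `np.var` does the same computation.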

Key properties:

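  • Always non-negative ($\operatorname{Var}(X) \ge 0$)
  • $\operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X)$ for constants $a$ and $b$
  • For independent variables $X$ and $Y$, $\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y)$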
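These properties can be checked numerically. The sketch below assumes the standard identities Var(X) ≥ 0, Var(aX + b) = a² Var(X), and Var(X + Y) = Var(X) + Var(Y) for independent X and Y. The die outcomes and the constants `a = 3`, `b = 7` are arbitrary illustrative choices.

```python
import numpy as np
from itertools import product

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # outcomes of one fair die
a, b = 3.0, 7.0                                 # arbitrary constants

# Non-negativity: a mean of squares cannot be negative.
var_x = np.var(x)

# Scaling and shifting: the shift b drops out, the scale a enters squared.
var_affine = np.var(a * x + b)

# Additivity under independence: enumerate all 36 equally likely pairs
# of two independent dice and take the variance of their sum.
sums = np.array([u + v for u, v in product(x, x)])
var_sum = np.var(sums)

print(var_x >= 0)
print(np.isclose(var_affine, a**2 * var_x))
print(np.isclose(var_sum, 2 * var_x))
```

Enumerating every pair makes the two dice independent by construction, so the additivity check is exact rather than a simulation.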

Sample Variance

In statistics, sample variance estimates population variance from a finite sample. The unbiased estimator uses Bessel’s correction:

$$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

Where $\bar{x}$ is the sample mean. The denominator $n - 1$ corrects the bias that dividing by $n$ would introduce.
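The effect of the correction can be seen in a small simulation. The sketch below is an illustration under assumed settings (a normal population with variance 4, samples of size 5, and a fixed seed), none of which come from the text above. It averages the two estimators over many samples: dividing by n tends to underestimate the population variance by a factor of (n - 1)/n, while dividing by n - 1 does not.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
true_var = 4.0                   # population variance (standard deviation 2)
n = 5                            # a small sample size makes the bias visible

# 100,000 independent samples of size n, one per row.
samples = rng.normal(loc=0.0, scale=2.0, size=(100_000, n))

biased = samples.var(axis=1, ddof=0).mean()     # divides by n
unbiased = samples.var(axis=1, ddof=1).mean()   # divides by n - 1

print(biased)     # expected to be close to true_var * (n - 1) / n = 3.2
print(unbiased)   # expected to be close to true_var = 4.0
```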

In Python’s NumPy:

  • np.var() computes population variance (by default)
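  • Set ddof=1 for sample variance (the default is ddof=0)

import numpy as np

# Two identical rows, six columns
M = np.array([
    [1, 2, 3, 4, 5, 6],
    [1, 2, 3, 4, 5, 6]])

# Sample variance (ddof=1) down each column, then along each row
col_var = np.var(M, ddof=1, axis=0)
row_var = np.var(M, ddof=1, axis=1)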
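To make the effect of `ddof` concrete, the following sketch reuses the same matrix and compares the two settings along the rows. The variable names `row_var_sample` and `row_var_population` are introduced here for illustration only.

```python
import numpy as np

M = np.array([[1, 2, 3, 4, 5, 6],
              [1, 2, 3, 4, 5, 6]])

# Each column holds two identical values, so its variance is 0.
col_var = np.var(M, ddof=1, axis=0)

# Each row is 1..6: the squared deviations from the mean 3.5 sum to 17.5.
row_var_sample = np.var(M, ddof=1, axis=1)       # 17.5 / 5 = 3.5
row_var_population = np.var(M, ddof=0, axis=1)   # 17.5 / 6, about 2.9167

print(col_var)
print(row_var_sample)
print(row_var_population)
```

With only six values per row the two results differ noticeably; the gap shrinks as the number of observations grows.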

Note

The choice between population and sample variance depends on whether you’re describing the entire population or estimating from a sample.