Variance measures the spread of data points around their mean. If we are given a vector and asked to find the variance, we can do the following:
from numpy import array, var
M = array([1,2,2,2,5,6])
n = len(M)
M_centered = M - M.mean()
dot_prod = M_centered @ M_centered
variance = dot_prod / (n - 1)
print(variance) # 4.0
# confirm against NumPy's sample variance (ddof=1 divides by n - 1)
print(var(M, ddof=1)) # 4.0
In short:
- center the vector around its mean to get $M_c = M - \bar{M}$
- calculate the inner product: $M_c \cdot M_c$
- variance: $\dfrac{M_c \cdot M_c}{n - 1}$
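As a worked check of these steps on the vector from the code above, the arithmetic can be written out by hand. The symbols $\bar{M}$ (the mean) and $M_c$ (the centered vector) are labels chosen for this illustration.

$$
\bar{M} = \frac{1 + 2 + 2 + 2 + 5 + 6}{6} = 3, \qquad M_c = (-2, -1, -1, -1, 2, 3)
$$

$$
M_c \cdot M_c = 4 + 1 + 1 + 1 + 4 + 9 = 20, \qquad \frac{M_c \cdot M_c}{n - 1} = \frac{20}{5} = 4
$$

This matches the 4.0 printed by both the manual calculation and `var(M, ddof=1)`.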
Similarly, the covariance between two vectors is found in the same way:
- center both vectors around their respective means
- calculate the inner product between the two centered vectors
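The covariance steps can be sketched in the same style as the variance code. This is a minimal illustration, not a definitive implementation: the vectors `x` and `y` are made-up example data, and dividing by `n - 1` is an assumption carried over from the sample variance calculation above.

```python
from numpy import array, cov

# Hypothetical example vectors of equal length (not from the original text)
x = array([1, 2, 2, 2, 5, 6])
y = array([1, 3, 2, 4, 6, 8])
n = len(x)

# Center each vector around its own mean
x_centered = x - x.mean()
y_centered = y - y.mean()

# Inner product of the two centered vectors
dot_prod = x_centered @ y_centered

# Assumed normalization: divide by n - 1, as in the sample variance
covariance = dot_prod / (n - 1)
print(covariance)  # 5.0

# cov returns a 2x2 covariance matrix; the off-diagonal entry [0, 1]
# is the covariance between x and y, and should agree with the value above
print(cov(x, y, ddof=1)[0, 1])
```

Passing the same vector for both `x` and `y` reduces this to the variance calculation, since the inner product of a centered vector with itself is the sum of squared deviations.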