# Machine Learning: Variance, Covariance, and the Pearson Coefficient

## Definition of Variance

$Var(X) = E\{[X-E(X)]^2\}$

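For a sample of $N$ observations with mean $\bar{X}$, the unbiased sample estimate divides by $N-1$:

$Var(X) = \frac{\sum_{i=1}^N (X_i - \bar{X})^2}{N-1}$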
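As a quick check of the sample formula, here is a minimal sketch in plain Python. The function name `sample_variance` and the data values are made up for illustration; the result is compared against `statistics.variance` from the standard library, which also divides by $N-1$.

```python
import statistics


def sample_variance(xs):
    """Sample variance: sum of squared deviations from the mean, divided by N - 1."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Made-up sample, used only for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(sample_variance(data))
print(statistics.variance(data))  # should agree up to floating-point rounding
```

Dividing by $N$ instead of $N-1$ gives the population variance, which corresponds to `statistics.pvariance`.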

## Properties of Variance

$D(CX)=C^2D(X)$

$D(X+C) = D(X)$

$D(X) = E(X^2) - [E(X)]^2$

$$
\begin{aligned}
D(X) &= E\{[X-E(X)]^2\} \\
&= E\{X^2 - 2XE(X) + [E(X)]^2\} \\
&= E(X^2) - 2E(X)E(X) + [E(X)]^2 \\
&= E(X^2) - [E(X)]^2
\end{aligned}
$$
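The three properties above can be checked numerically. The sketch below assumes the population form of the variance (averaging over $N$, so that $E$ is a plain mean); the helper names `mean` and `pvariance`, the data, and the constant `c` are illustrative choices, not part of the original notes.

```python
import math


def mean(xs):
    """E(X) for a finite list of equally weighted values."""
    return sum(xs) / len(xs)


def pvariance(xs):
    """Population variance D(X) = E{[X - E(X)]^2}, averaging over N."""
    m = mean(xs)
    return mean([(x - m) ** 2 for x in xs])


xs = [1.0, 2.0, 4.0, 7.0]  # made-up data
c = 3.0                    # arbitrary constant

# D(CX) = C^2 D(X)
scaled_ok = math.isclose(pvariance([c * x for x in xs]), c ** 2 * pvariance(xs))

# D(X + C) = D(X)
shifted_ok = math.isclose(pvariance([x + c for x in xs]), pvariance(xs))

# D(X) = E(X^2) - [E(X)]^2
shortcut = mean([x * x for x in xs]) - mean(xs) ** 2
shortcut_ok = math.isclose(shortcut, pvariance(xs))

print(scaled_ok, shifted_ok, shortcut_ok)
```

Note that the shortcut $E(X^2) - [E(X)]^2$ can lose precision in floating point when the mean is large relative to the spread, so the two-pass form in `pvariance` is usually the safer choice in practice.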

## Variance and Covariance

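Here the expectation is taken as the average over $N$ paired samples, so $E(X) = \bar{X}$, $E(Y) = \bar{Y}$, and $E(X+Y) = \overline{(X+Y)} = \bar{X} + \bar{Y}$.

$D(X+Y) = \frac{1}{N}\sum_{i=1}^N [(X_i+Y_i) - E(X+Y)]^2$

$$
\begin{aligned}
D(X+Y) &= \frac{1}{N}\sum_{i=1}^N \left(X_i^2+2X_iY_i+Y_i^2-2(X_i+Y_i)E(X+Y) + [E(X+Y)]^2\right) \\
&= \frac{1}{N}\sum_{i=1}^N \left(X_i^2+2X_iY_i+Y_i^2-2(X_i+Y_i)\overline{(X+Y)} + \overline{(X+Y)}^2\right) \\
&= \frac{1}{N}\sum_{i=1}^N \left(X_i^2 + 2X_iY_i + Y_i^2 - 2(X_i\bar{X}+X_i\bar{Y}+\bar{X}Y_i+Y_i\bar{Y})+ \bar{X}^2 + 2\bar{X}\bar{Y} + \bar{Y}^2\right) \\
&= \frac{1}{N}\sum_{i=1}^N \left((X_i-\bar{X})^2 + (Y_i-\bar{Y})^2+2(X_iY_i + \bar{X}\bar{Y} - X_i\bar{Y} - \bar{X}Y_i)\right) \\
&= \frac{1}{N}\sum_{i=1}^N \left((X_i-E(X))^2 + (Y_i - E(Y))^2 + 2(X_i-E(X))(Y_i-E(Y))\right) \\
&= D(X) + D(Y) + 2E((X - E(X))(Y-E(Y)))
\end{aligned}
$$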

$D(X+Y) = D(X) + D(Y) + 2E((X - E(X))(Y-E(Y)))$

$Cov(X, Y) = E((X - E(X))(Y-E(Y)))$
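The decomposition $D(X+Y) = D(X) + D(Y) + 2Cov(X, Y)$ can be verified on a small example. This is a sketch that assumes the population (divide by $N$) form of variance and covariance; the helper names and the paired data are made up for illustration.

```python
import math


def mean(xs):
    """E(X) for a finite list of equally weighted values."""
    return sum(xs) / len(xs)


def pvariance(xs):
    """Population variance D(X) = E{[X - E(X)]^2}."""
    m = mean(xs)
    return mean([(x - m) ** 2 for x in xs])


def covariance(xs, ys):
    """Population covariance Cov(X, Y) = E[(X - E(X))(Y - E(Y))]."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


# Made-up paired observations.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 1.0, 4.0, 5.0]

lhs = pvariance([x + y for x, y in zip(xs, ys)])
rhs = pvariance(xs) + pvariance(ys) + 2 * covariance(xs, ys)

print(lhs, rhs, math.isclose(lhs, rhs))
```

When $X$ and $Y$ are uncorrelated the covariance term vanishes, and the variance of the sum reduces to $D(X) + D(Y)$.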

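The Pearson correlation coefficient normalizes the covariance by the two standard deviations. In the sample form, the $\frac{1}{N}$ factors in the numerator and denominator cancel:

$\rho = \frac{E((X-\bar{X})(Y-\bar{Y}))}{\sqrt{D(X)}\sqrt{D(Y)}}= \frac{\sum (X-\bar{X})(Y-\bar{Y})}{\sqrt{\sum (X-\bar{X})^2}\sqrt{\sum (Y-\bar{Y})^2}}$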
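A minimal sketch of the Pearson coefficient in plain Python follows. It works directly with sums of deviations, since any common $\frac{1}{N}$ factor in the covariance and the two standard deviations cancels. The function name `pearson` and the data are illustrative assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Made-up paired observations.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 1.0, 4.0, 5.0]

print(pearson(xs, ys))
print(pearson(xs, [2 * x + 1 for x in xs]))  # exact increasing linear relation
print(pearson(xs, [-x for x in xs]))         # exact decreasing linear relation
```

The coefficient always lies in $[-1, 1]$: an exact increasing linear relation gives $1$, an exact decreasing one gives $-1$, and values near $0$ indicate little linear association.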