Standard Deviation Formulas


What is Standard Deviation?


Random variables have a few common numerical characteristics: the average value (mean), the variance, and the standard deviation. Here we consider the standard deviation of a random variable X, which is the square root of its variance. The formulas for calculating the variance are as follows:

1. Population variance for a population given by the values \(x_{i}\):

\(\sigma^2 = \frac{1}{n}\sum_{i=1}^n (x_{i}-\overline{x})^2\)

or, for a discrete random variable:

\(\sigma^2 = \sum_{i=1}^n (x_{i}-\overline{x})^2 p_{i}\),

or, for a continuous random variable:

\(\sigma^2 = \int_{-\infty}^{\infty} (x-\overline{x})^2 f(x)\,dx\)

where \(\overline{x}\) is the average value of the population, n is the size of the population (for a discrete random variable, the number of its possible values), \(p_{i} = P(X=x_{i})\) for a discrete random variable, and f(x) is the density of a continuous random variable.

2. Sample variance (the formula most commonly used in statistics):

\(s^2 = \frac{1}{n-1}\sum_{i=1}^n (x_{i}-\overline{x})^2\)

Taking the square root of each variance gives two definitions of the standard deviation:
σ is the population standard deviation (in some statistical studies it is assumed to be known), and s is the sample (estimated) standard deviation.

Consider a few examples of calculating the standard deviation.
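
The difference between the two definitions is only the divisor, n or n - 1. The following is a minimal Python sketch of both formulas; the function names and the data set are illustrative assumptions, not part of the original notes.

```python
import math

def population_variance(data):
    """Variance with divisor n (the sigma^2 formula)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / n

def sample_variance(data):
    """Variance with divisor n - 1 (the s^2 formula)."""
    n = len(data)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

# Illustrative data set (an assumption made for this sketch).
data = [2, 4, 4, 4, 5, 5, 7, 9]

sigma = math.sqrt(population_variance(data))  # population standard deviation
s = math.sqrt(sample_variance(data))          # sample standard deviation
print(sigma, s)
```

For the same data, s is always slightly larger than σ, because the sum of squared deviations is divided by a smaller number.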

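Example 1. Consider the random variable X that takes the values 1, 2, 3, 4 with probabilities 0.1, 0.2, 0.3, 0.4 respectively.

The average value of the random variable X is:

\(\overline{x}= 1\cdot 0.1+2\cdot 0.2+3\cdot 0.3+4\cdot 0.4=3\)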

Using this value, we can calculate the variance of the random variable X with the formula for a discrete random variable:

\(\sigma^2 = \sum_{i=1}^4 (x_{i}-\overline{x})^2 p_{i}\)

\(= (1-3)^2\cdot 0.1+(2-3)^2\cdot 0.2+(3-3)^2\cdot 0.3+(4-3)^2\cdot 0.4=1\)

From this result, the standard deviation of the random variable X is \(\sigma = \sqrt{1} = 1\).
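
The calculation in Example 1 can be checked with a short Python sketch. The values and probabilities are the ones that appear in the mean calculation above; the helper names are assumptions made for illustration.

```python
import math

# Values and probabilities taken from the mean calculation in Example 1.
values = [1, 2, 3, 4]
probabilities = [0.1, 0.2, 0.3, 0.4]

def discrete_mean(xs, ps):
    """Average value: sum of x_i * p_i."""
    return sum(x * p for x, p in zip(xs, ps))

def discrete_variance(xs, ps):
    """Variance: sum of (x_i - mean)^2 * p_i."""
    mean = discrete_mean(xs, ps)
    return sum((x - mean) ** 2 * p for x, p in zip(xs, ps))

mean = discrete_mean(values, probabilities)
variance = discrete_variance(values, probabilities)
std_dev = math.sqrt(variance)
print(mean, variance, std_dev)  # approximately 3, 1 and 1, up to rounding
```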

Example 2. Consider the random variable X with density

\(f(x)=\begin{cases}\frac{3}{x^4}, & x\geq 1\\ 0, & \text{otherwise}\end{cases}\)

As in Example 1, we start by calculating the average value:

\(\overline{x}=\int_{-\infty}^{\infty}xf(x)\,dx=\int_{1}^{\infty} x\cdot \frac{3}{x^4}\,dx=-\frac{3}{2x^2}\Big|_1^\infty=\frac{3}{2}= 1.5\)

To calculate the variance, we use the equivalent formula:

\(\sigma^2=\int_{-\infty}^{\infty}x^2f(x)\,dx - \overline{x}^2\)

The first term is:

\(\int_{-\infty}^{\infty}x^2f(x)\,dx=\int_{1}^{\infty} x^2\cdot \frac{3}{x^4}\,dx=-\frac{3}{x}\Big|_1^\infty= 3\).

Hence, we conclude that:

\(\sigma^2 = 3 - 1.5^2 = 0.75\)

And the standard deviation of this random variable is:

\(\sigma=\sqrt{0.75} = 0.5\sqrt{3}\approx 0.8660254\)
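
The integrals in Example 2 can also be approximated numerically. The sketch below is one possible approach, not the method used in these notes: it assumes the substitution x = 1/u to map the infinite range [1, ∞) onto (0, 1] and then applies a simple midpoint rule. All function names are hypothetical.

```python
import math

def density(x):
    """Density from Example 2: 3 / x^4 for x >= 1, otherwise 0."""
    return 3.0 / x**4 if x >= 1 else 0.0

def midpoint_integral(func, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

def expectation(g):
    """Approximate the integral of g(x) * f(x) over [1, infinity).

    With x = 1/u we have dx = -du / u^2, so the integral becomes
    the integral of g(1/u) * f(1/u) / u^2 over (0, 1).
    The midpoint rule never evaluates the integrand at u = 0.
    """
    return midpoint_integral(lambda u: g(1 / u) * density(1 / u) / u**2, 0.0, 1.0)

mean = expectation(lambda x: x)              # close to 1.5
second_moment = expectation(lambda x: x**2)  # close to 3
variance = second_moment - mean**2           # close to 0.75
std_dev = math.sqrt(variance)                # close to 0.866
print(mean, second_moment, variance, std_dev)
```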

Example 3. Now consider calculating the standard deviation of a sample:

1, 3, 7, 8, 9, 10, 4, 9, 0, -1

In this case, we must use the formula for a sample. First calculate the average value of the sample:

\(\overline{x} = \frac{1}{10}(1 + 3 + 7 + 8 + 9 + 10 + 4 + 9 + 0 - 1) = 5\)

Then the sample standard deviation has the value:

\(s=\sqrt{\frac{1}{9}\sum_{i=1}^{10}(x_{i}-\overline{x})^2}=\sqrt{\frac{152}{9}}\approx 4.109609\)
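
The same result can be reproduced with Python's standard `statistics` module, where `stdev` uses the divisor n - 1 and `pstdev` uses the divisor n. This is a brief check of Example 3; the variable names are illustrative.

```python
import statistics

# The sample from Example 3.
sample = [1, 3, 7, 8, 9, 10, 4, 9, 0, -1]

mean = statistics.mean(sample)                  # average value of the sample
sample_sd = statistics.stdev(sample)            # divisor n - 1, about 4.1096
population_sd = statistics.pstdev(sample)       # divisor n, shown for comparison
print(mean, sample_sd, population_sd)
```

Treating the same ten numbers as a whole population instead of a sample gives a smaller value, because the divisor is 10 rather than 9.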

Now consider the main properties of the standard deviation:
1. The standard deviation is nonnegative:

σ ≥ 0,

and it equals 0 if and only if the corresponding random variable is constant (with probability 1).

2. Let X be a random variable with standard deviation \(\sigma_{X}\) and let a be a constant. Then the standard deviation of the random variable aX is:

\(\sigma_{aX} = |a|\sigma_{X}\).

3. Let X and Y be two independent random variables with standard deviations \(\sigma_{X}\) and \(\sigma_{Y}\) respectively. Then the variances of their sum and of their difference are the same:

\(\sigma_{X+Y}^2=\sigma_{X-Y}^2=\sigma_X^2+\sigma_Y^2\),

so that \(\sigma_{X+Y}=\sigma_{X-Y}=\sqrt{\sigma_X^2+\sigma_Y^2}\).

4. From properties 2 and 3 we can deduce a very useful property that is widely used in statistics. Let \(X_1\), \(X_2\), …, \(X_N\) be independent identically distributed random variables with the same standard deviation σ. Then the standard deviation of their average value is:

\(\sigma_{\overline{X}}=\frac{\sigma}{\sqrt{N}}\),

where \(\overline{X}=\frac{1}{N}\sum_{i=1}^N X_{i}\).
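
Properties 2 to 4 can be verified exactly for small discrete distributions by enumerating all outcomes. The sketch below assumes the distribution from Example 1 for X and a hypothetical second variable Y; the helper functions `std_dev` and `transform` are illustrative names, not part of any library.

```python
import itertools
import math

def std_dev(dist):
    """Standard deviation of a discrete distribution {value: probability}."""
    mean = sum(x * p for x, p in dist.items())
    return math.sqrt(sum((x - mean) ** 2 * p for x, p in dist.items()))

def transform(dists, func):
    """Exact distribution of func(X1, ..., Xk) for independent X1, ..., Xk."""
    result = {}
    for combo in itertools.product(*(d.items() for d in dists)):
        value = func(*(x for x, _ in combo))
        prob = math.prod(p for _, p in combo)  # independence: multiply probabilities
        result[value] = result.get(value, 0.0) + prob
    return result

X = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4}  # distribution from Example 1, sd = 1
Y = {0: 0.5, 2: 0.5}                  # hypothetical second variable, sd = 1

sd_x, sd_y = std_dev(X), std_dev(Y)

# Property 2: sd(aX) = |a| * sd(X), here with a = -3.
sd_scaled = std_dev(transform([X], lambda x: -3 * x))

# Property 3: the sum and the difference have the same variance.
sd_sum = std_dev(transform([X, Y], lambda x, y: x + y))
sd_diff = std_dev(transform([X, Y], lambda x, y: x - y))

# Property 4: the average of N independent copies of X has sd = sd(X) / sqrt(N).
N = 4
sd_mean = std_dev(transform([X] * N, lambda *xs: sum(xs) / N))

print(sd_scaled, sd_sum, sd_diff, sd_mean)
```

Because the enumeration is exact rather than simulated, the results agree with the formulas up to floating-point rounding.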

Consider one widely used application of the standard deviation: the normal distribution. Take the normal distribution with parameters (μ, \(\sigma^2\)), where μ is the mean of this distribution and \(\sigma^2\) is its variance. The density of this distribution has the form:

\(f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\)

For this distribution we have the 68-95-99.7% rule. In mathematical notation the rule reads:

P(-σ ≤ X - μ ≤ σ) ≈ 68%,
P(-2σ ≤ X - μ ≤ 2σ) ≈ 95%,
P(-3σ ≤ X - μ ≤ 3σ) ≈ 99.7%.
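
The three percentages can be reproduced from the error function: for a normal random variable, P(-kσ ≤ X - μ ≤ kσ) = erf(k/√2), which does not depend on μ or σ. A short Python check follows; the function name `prob_within` is an assumption made for this sketch.

```python
import math

def prob_within(k):
    """P(|X - mu| <= k * sigma) for a normally distributed X."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {prob_within(k):.4f}")
```

The printed values are roughly 0.6827, 0.9545 and 0.9973, which round to the 68%, 95% and 99.7% of the rule.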

Consider the example μ = 3, σ = 2. In this case the region -2σ ≤ X - μ ≤ 2σ, that is -1 ≤ X ≤ 7, for a normally distributed random variable with density f(x; 3, 2) is shown in the following figure:

So, by the 68-95-99.7% rule, the blue area (-2σ ≤ X - μ ≤ 2σ) has a measure of approximately 0.95 = 95%.
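
The area for this example can be computed directly with `statistics.NormalDist` from the Python standard library (available from Python 3.8). This is a small sketch for μ = 3, σ = 2; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 3, 2
dist = NormalDist(mu=mu, sigma=sigma)

# Bounds of the region -2*sigma <= X - mu <= 2*sigma, i.e. -1 <= X <= 7.
lower, upper = mu - 2 * sigma, mu + 2 * sigma

# Area under the density between the bounds.
area = dist.cdf(upper) - dist.cdf(lower)
print(lower, upper, round(area, 4))
```

The computed area is about 0.9545, in agreement with the approximate 95% given by the rule.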