# Law of Large Numbers

The Law of Large Numbers states that the average of a large number of independent samples of a random quantity tends toward that quantity's expected value (its mean) as the number of samples increases.

The Law of Large Numbers is sometimes presented as a counterexample to the theory of evolution, on the argument that populations should not drift away from their "mean" or normal characteristics over time. Strictly speaking, however, the law applies to averages of independent, identically distributed samples drawn from a fixed distribution; it says nothing about genetic drift, in which the underlying distribution of traits in a finite population itself changes from one generation to the next.

## Mathematical formulation

In mathematical terms, the Law of Large Numbers establishes that the mean of n independent and identically distributed random variables approaches the expected value as n increases without bound.

More specifically, there are two laws of large numbers, the weak law of large numbers and the strong law of large numbers.

## Weak law of large numbers

The weak law of large numbers states that if $X_1, X_2, \ldots$ are independent and identically distributed random variables from a distribution with mean μ and finite variance, then for each ε > 0,

$\lim_{n \to \infty} P\left[\left|\frac{X_1 + \cdots + X_n}{n} - \mu\right| > \epsilon \right] = 0$

(see [1]). In other words, to make the mean of an independent sample fall within ε of the true population mean with probability arbitrarily close to (but less than) 1, it suffices to choose a large enough sample size.
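The shrinking of this probability can be observed directly by simulation. The sketch below (the function name and trial counts are my own choices, not from the source) estimates, for a fair coin, how often the sample mean lands within ε = 0.05 of μ = 0.5 for several sample sizes n:

```python
import random

def within_eps_prob(n, eps=0.05, p=0.5, trials=1000, seed=0):
    """Monte Carlo estimate of P(|sample mean - p| <= eps) for n coin flips."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        mean = sum(rng.random() < p for _ in range(n)) / n
        if abs(mean - p) <= eps:
            hits += 1
    return hits / trials

# The probability of being within eps of the mean rises toward 1 as n grows.
for n in (10, 100, 1000):
    print(n, within_eps_prob(n))
```

Printing the results for n = 10, 100, 1000 shows the probability climbing toward 1, exactly as the weak law predicts.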

### Example

Consider flipping a fair coin n times in order to empirically estimate the probability that the coin comes up heads on a given flip. Because coins have no memory, the random variables $X_i$ (defined to be 1 if the $i$th flip is heads and 0 if it is tails) are independent and identically distributed. Suppose we want our estimate to be within 0.05 of the true probability of flipping heads.

If we flip the coin four times, we will get 0 heads 1/16 of the time, 1 head 1/4 of the time, 2 heads 3/8 of the time, 3 heads 1/4 of the time, and 4 heads 1/16 of the time. The only case where our estimated probability of getting heads is within 0.05 of the true value, 1/2, is when we get 2 heads. Therefore, we will get a "close enough" estimate 3/8 (37.5%) of the time.
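The four-flip case can be checked exactly by summing binomial probabilities (the helper function here is illustrative, not from the source):

```python
from math import comb

def prob_close(n, eps=0.05):
    """Exact P(|H/n - 1/2| <= eps) when H ~ Binomial(n, 1/2)."""
    total = 0
    for h in range(n + 1):
        if abs(h / n - 0.5) <= eps:
            total += comb(n, h)
    return total / 2 ** n

# For n = 4, only H = 2 qualifies: C(4,2) / 2^4 = 6/16 = 3/8.
print(prob_close(4))  # 0.375
```

This confirms the 3/8 figure: with only four flips, the only "close enough" outcome is exactly two heads.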

Now consider doing 100 flips. 100 flips is large enough to use the normal approximation to the binomial distribution. The mean number of heads is 0.5 times 100, or 50. The standard deviation of the number of heads is $\sqrt{np(1-p)} = \sqrt{100(0.5)(0.5)} = 5$. Therefore, the estimated probability of flipping a head has mean 0.5 and standard deviation 0.05. Since 68% of the area of a normal distribution lies within 1 standard deviation of the mean, we have that our estimate of the probability of getting heads will be "close enough" 68% of the time.
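The 68% figure for 100 flips follows from the standard normal distribution, since 0.05 is exactly one standard deviation of the estimated proportion. A quick check using the error function (this calculation is a sketch of the normal approximation described above):

```python
from math import erf, sqrt

# Normal approximation for n = 100 fair-coin flips: the head count is
# roughly N(50, 5^2), so the proportion of heads is roughly N(0.5, 0.05^2).
n, p, eps = 100, 0.5, 0.05
sigma = sqrt(p * (1 - p) / n)   # std. dev. of the proportion: 0.05
z = eps / sigma                 # 0.05 is exactly 1 standard deviation
prob = erf(z / sqrt(2))         # P(|Z| <= z) for a standard normal Z
print(round(prob, 4))           # ~0.6827
```

The exact normal-tail value is about 0.6827, which the text rounds to 68%.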

Finally, consider doing 1 million flips. Using the normal approximation to the binomial distribution, we have that the number of heads is approximately normally distributed with mean 500,000 and standard deviation $\sqrt{1000000(0.5)(0.5)} = 500$. Thus, the estimated probability of heads is also approximately normally distributed, with mean 0.5 and standard deviation $5 \times 10^{-4}$. The probability that our estimate of the probability of getting heads will be "close enough" is now almost 100%.
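For 1 million flips, the same normal-approximation calculation shows just how close to 1 the probability gets: being off by 0.05 would now mean being 100 standard deviations from the mean.

```python
from math import erf, sqrt

# For n = 1,000,000 flips the proportion of heads is roughly
# N(0.5, 0.0005^2), so an error of 0.05 is 100 standard deviations out.
n, p, eps = 10**6, 0.5, 0.05
sigma = sqrt(p * (1 - p) / n)   # 0.0005
prob = erf(eps / sigma / sqrt(2))
print(prob)  # indistinguishable from 1.0 in floating point
```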

## Strong law of large numbers

The strong law of large numbers states that if $X_1, X_2, \ldots$ are independent and identically distributed random variables from a distribution with mean μ, then

$P\left[\lim_{n \to \infty} \frac{X_1 + \cdots +X_n}{n} = \mu\right] = 1$

(see [2]).
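The strong law is a statement about individual sample paths: a single infinite sequence of flips has, with probability 1, a running average that converges to μ. A minimal simulation of one such path (seed and checkpoints chosen arbitrarily for illustration):

```python
import random

# One sample path of the running average of fair-coin flips. The strong
# law says this path converges to mu = 0.5 with probability 1.
rng = random.Random(1)
total = 0
for n in range(1, 1_000_001):
    total += rng.random() < 0.5
    if n in (10, 1_000, 100_000, 1_000_000):
        print(n, total / n)
```

The printed running averages wander early on but settle ever closer to 0.5 as n grows.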

The earliest version of this rule, a weak law for coin flips, is due to the 17th-century mathematician Jacob Bernoulli, whose proof was published posthumously in 1713; the strong law in the general form above was proved by Andrey Kolmogorov in the 20th century.

## References

1. Ghahramani, Saeed. Fundamentals of Probability: With Stochastic Processes. 3rd ed. Upper Saddle River (NJ): Pearson Prentice Hall, 2005. p. 487
2. Ghahramani, p. 489.