# The central limit theorem and the gaussian/normal distribution

## Central limit theorem

### Central limit theorem

#### Characteristic function of summed IID events

$$Z=\sum_{i=1}^nY_i$$

$$\phi_Z(t)=E[e^{itZ}]$$

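$$\phi_Z(t)=E[e^{it\sum_{i=1}^nY_i}]$$

$$\phi_Z(t)=E[\prod_{i=1}^ne^{itY_i}]$$

Because the $$Y_i$$ are independent, the expectation of the product is the product of the expectations. Because they are identically distributed, every factor is the same:

$$\phi_Z(t)=E[e^{itY}]^n$$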

$$\phi_Z(t)=\phi_Y(t)^n$$
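The identity can be checked numerically. The sketch below assumes Bernoulli variables, chosen only for illustration because the sum of $$n$$ IID Bernoulli($$p$$) variables is Binomial($$n$$, $$p$$), so both sides can be computed exactly. The function names are hypothetical.

```python
import cmath
from math import comb

def phi_bernoulli(t, p):
    """Characteristic function E[exp(itY)] of a Bernoulli(p) variable Y."""
    return (1 - p) + p * cmath.exp(1j * t)

def phi_binomial_direct(t, n, p):
    """E[exp(itZ)] computed directly from the Binomial(n, p) pmf of Z = Y_1 + ... + Y_n."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) * cmath.exp(1j * t * k)
        for k in range(n + 1)
    )

# Illustrative values (assumptions, not from the text).
n, p, t = 12, 0.3, 0.7
direct = phi_binomial_direct(t, n, p)
via_power = phi_bernoulli(t, p) ** n

# The two agree up to floating-point rounding.
print(abs(direct - via_power))
```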

#### Taylor series: first moments dominate with means

$$Z=\sum_{i=1}^nY_i$$

$$Y=\dfrac{X}{n}$$

$$\phi_Z(t)=\phi_Y(t)^n$$

$$\phi_Z(t)=\phi_{\dfrac{X}{n}}(t)^n$$

$$\phi_Z(t)=\phi_X(\dfrac{t}{n})^n$$

$$\phi_X(t)=1+it\mu_X -\dfrac{(\mu_X^2 +\sigma_X^2 )t^2}{2} +\sum_{j=3}^{\infty }\dfrac{E[X^j](it)^j}{j!}$$

$$\phi_X(\dfrac{t}{n})=1+i\dfrac{t\mu_X }{n}-\dfrac{(\mu_X^2 +\sigma_X^2 )(\dfrac{t}{n})^2}{2} +\sum_{j=3}^{\infty }\dfrac{E[X^j](i\dfrac{t}{n})^j}{j!}$$

$$\phi_X(\dfrac{t}{n})=1+i\dfrac{t\mu_X }{n}-\dfrac{(\mu_X^2 +\sigma_X^2 )t^2}{2n^2} +\sum_{j=3}^{\infty }\dfrac{E[X^j](i\dfrac{t}{n})^j}{j!}$$

#### Eliminating the imaginary term

We want $$\mu$$ to be $$0$$.

$$Z=\sum_{i=1}^nY_i$$

$$Y=\dfrac{X-\mu_X }{n}$$

$$\phi_Y(t)=1+it\mu_Y -\dfrac{(\mu_Y^2 +\sigma_Y^2 )t^2}{2} +\sum_{j=3}^{\infty }\dfrac{E[Y^j](it)^j}{j!}$$

$$\mu_Y =E[\dfrac{X-\mu_X }{n}] =\dfrac{\mu_X -\mu_X }{n}=0$$

$$\phi_Y(t)=1-\dfrac{\sigma_Y^2t^2}{2} +\sum_{j=3}^{\infty }\dfrac{E[Y^j](it)^j}{j!}$$

$$\sigma^2_Y =E[(\dfrac{X-\mu_X }{n})^2]$$

$$\sigma^2_Y =E[\dfrac{X^2+\mu^2_X-2X\mu_X }{n^2}]$$

$$\sigma^2_Y =\dfrac{E[X^2]+\mu^2_X-2E[X]\mu_X }{n^2}$$

$$\sigma^2_Y =\dfrac{E[X^2]-\mu^2_X}{n^2}$$

$$\sigma^2_Y =\dfrac{\sigma^2_X}{n^2}$$

$$\phi_Y(t)=1-\dfrac{\sigma_X^2t^2}{2n^2} +\sum_{j=3}^{\infty }\dfrac{E[(\dfrac{X-\mu}{n})^j](it)^j}{j!}$$

$$\phi_Z(t)=\phi_Y(t)^n$$

$$\phi_Z(t)=[1-\dfrac{\sigma_X^2t^2}{2n^2} +\sum_{j=3}^{\infty }\dfrac{E[(\dfrac{X-\mu}{n})^j](it)^j}{j!}]^n$$

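For large $$n$$ the terms of order $$j\ge 3$$ shrink faster than the quadratic term, so we drop them:

$$\phi_Z(t)\approx [1-\dfrac{\sigma_X^2t^2}{2n^2}]^n$$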

#### Eliminating $$\sigma^2$$

$$Z=\sum_{i=1}^nY_i$$

$$Y=\dfrac{X-\mu_X }{\sigma n}$$

$$\phi_Y(t)=1+it\mu_Y -\dfrac{(\mu_Y^2 +\sigma_Y^2 )t^2}{2} +\sum_{j=3}^{\infty }\dfrac{E[Y^j](it)^j}{j!}$$

$$\mu_Y =E[\dfrac{X-\mu_X }{\sigma_X n}] =\dfrac{\mu_X -\mu_X }{\sigma_X n}=0$$

$$\phi_Y(t)=1-\dfrac{\sigma_Y^2t^2}{2} +\sum_{j=3}^{\infty }\dfrac{E[Y^j](it)^j}{j!}$$

$$\sigma^2_Y =E[(\dfrac{X-\mu_X }{\sigma n})^2]$$

$$\sigma^2_Y =E[\dfrac{X^2+\mu^2_X-2X\mu_X }{\sigma^2 n^2}]$$

$$\sigma^2_Y =\dfrac{E[X^2]+\mu^2_X-2E[X]\mu_X }{\sigma^2 n^2}$$

$$\sigma^2_Y =\dfrac{E[X^2]-\mu^2_X}{\sigma^2 n^2}$$

$$\sigma^2_Y =\dfrac{\sigma^2_X}{\sigma^2 n^2}$$

$$\sigma^2_Y =\dfrac{1}{n^2}$$

$$\phi_Y(t)=1-\dfrac{t^2}{2n^2} +\sum_{j=3}^{\infty }\dfrac{E[(\dfrac{X-\mu}{\sigma n})^j](it)^j}{j!}$$

$$\phi_Z(t)=\phi_Y(t)^n$$

$$\phi_Z(t)=[1-\dfrac{t^2}{2n^2} +\sum_{j=3}^{\infty }\dfrac{E[(\dfrac{X-\mu}{\sigma n})^j](it)^j}{j!}]^n$$

Dropping the terms of order $$j\ge 3$$, which vanish faster as $$n$$ grows:

$$\phi_Z(t)\approx [1-\dfrac{t^2}{2n^2}]^n$$

#### Preparing for exponential expansion

We know that, as $$n \rightarrow \infty$$:

$$[1+\dfrac{x}{n}]^n\rightarrow e^x$$

With:

$$Z=\sum_{i=1}^nY_i$$

$$Y=\dfrac{X-\mu_X }{\sigma n}$$

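We have:

$$\phi_Z(t)=[1-\dfrac{t^2}{2n^2}]^n$$

This is not of the form $$[1+\dfrac{x}{n}]^n$$: the bracketed term shrinks as $$\dfrac{1}{n^2}$$, so the expression tends to $$1$$. Scaling by $$\sqrt n$$ instead of $$n$$ gives the right form. With: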

$$Z=\sum_{i=1}^nY_i$$

$$Y=\dfrac{X-\mu_X }{\sigma \sqrt n}$$

We have:

$$\phi_Z(t)=[1-\dfrac{t^2}{2n}]^n$$

Which tends towards

$$\phi_Z(t)=e^{-\dfrac{1}{2}t^2}$$
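The limit can be seen numerically. This is a minimal sketch with hypothetical function names and an arbitrarily chosen $$t$$. It evaluates both scalings for growing $$n$$: the $$\sqrt n$$ scaling approaches $$e^{-\dfrac{1}{2}t^2}$$, while the $$n$$ scaling drifts to $$1$$.

```python
import math

def phi_z_sqrt_scaling(t, n):
    """[1 - t^2 / (2n)]^n, from dividing each term by sigma * sqrt(n)."""
    return (1 - t**2 / (2 * n)) ** n

def phi_z_n_scaling(t, n):
    """[1 - t^2 / (2n^2)]^n, from dividing each term by sigma * n."""
    return (1 - t**2 / (2 * n**2)) ** n

t = 1.5  # arbitrary illustrative value
limit = math.exp(-0.5 * t**2)

for n in (10, 100, 10_000, 1_000_000):
    print(n, phi_z_sqrt_scaling(t, n), phi_z_n_scaling(t, n))
print("limit of the sqrt(n) scaling:", limit)
```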

#### Rescaling

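The sum of the random variables, each less its mean and divided by its standard deviation multiplied by the square root of the sample size, tends to a standard normal distribution as $$n$$ increases. Equivalently, the sample average, less the mean, multiplied by $$\sqrt n$$ and divided by the standard deviation, tends to a standard normal distribution.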

What does this say about the actual distribution of sample averages?

$$Z=\sum_{i=1}^nY_i$$

$$Y_i=\dfrac{X_i-\mu_X }{\sigma_X \sqrt n}$$

Let's create $$Q$$.

$$Q=\dfrac{Z\sigma_X }{\sqrt n}+\mu_X$$

$$Q=\dfrac{(\sum_{i=1}^nY_i)\sigma_X }{\sqrt n}+\mu_X$$

$$Q=\dfrac{(\sum_{i=1}^n(\dfrac{X_i-\mu_X }{\sigma_X \sqrt n}))\sigma_X }{\sqrt n}+\mu_X$$

$$Q=\sum_{i=1}^n(\dfrac{X_i-\mu_X }{n})+\mu_X$$

$$Q=\sum_{i=1}^n(\dfrac{X_i-\mu_X }{n}+\dfrac{\mu_X}{n})$$

$$Q=\sum_{i=1}^n(\dfrac{X_i}{n})$$

This is the sample average.
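The algebra can be confirmed on concrete numbers. The sketch below uses made-up data and an assumed $$\mu_X$$ and $$\sigma_X$$. It builds $$Z$$ from the standardised terms and maps it back to $$Q$$. The function name is hypothetical.

```python
import math

def sample_average_via_z(xs, mu, sigma):
    """Build Z = sum of (x - mu) / (sigma * sqrt(n)), then return Q = Z * sigma / sqrt(n) + mu."""
    n = len(xs)
    ys = [(x - mu) / (sigma * math.sqrt(n)) for x in xs]
    z = sum(ys)
    return z * sigma / math.sqrt(n) + mu

xs = [2.0, 3.5, 1.0, 4.5, 6.0]  # arbitrary illustrative data
mu, sigma = 3.0, 2.0            # assumed population mean and standard deviation

q = sample_average_via_z(xs, mu, sigma)
# Q matches the plain sample average up to rounding, whatever mu and sigma are.
print(q, sum(xs) / len(xs))
```

The result does not depend on the values chosen for `mu` and `sigma`, since they cancel in the mapping from $$Z$$ back to $$Q$$.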

$$\phi_Q(t)=\phi_{\dfrac{Z\sigma_X }{\sqrt n}+\mu_X}(t)$$

$$\phi_Q(t)=\phi_Z(\dfrac{t\sigma_X }{\sqrt n})e^{it\mu_X}$$

$$\phi_Z(\dfrac{t\sigma_X }{\sqrt n})=e^{-\dfrac{1}{2}(\dfrac{t\sigma_X }{\sqrt n})^2}$$

$$\phi_Z(\dfrac{t\sigma_X }{\sqrt n})=e^{-\dfrac{1}{2}\dfrac{t^2\sigma^2_X }{n}}$$

$$\phi_Q(t)=e^{-\dfrac{1}{2}\dfrac{t^2\sigma^2_X }{n}}e^{it\mu_X}$$

#### Normal distribution

We call the distribution with this characteristic function at $$n=1$$ the normal distribution, $$N(\mu_X, \sigma^2_X)$$. For general $$n$$ the sample average is therefore approximately $$N(\mu_X, \dfrac{\sigma^2_X}{n})$$.

$$\phi_{N(\mu_X, \sigma^2_X)}(t)=e^{-\dfrac{1}{2}t^2\sigma^2_X }e^{it\mu_X}$$

#### Getting the probability distribution function

$$\phi_X(t)=e^{-\dfrac{1}{2}t^2\sigma^2_X} e^{it\mu_X}$$

$$\phi_X(t)=e^{-\dfrac{1}{2}t^2\sigma^2_X}[\cos (t\mu_X )+i\sin (t\mu_X)]$$
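One way to recover the density from the characteristic function is the standard inversion integral $$f(x)=\dfrac{1}{2\pi }\int_{-\infty }^{\infty }e^{-itx}\phi_X(t)dt$$. The sketch below approximates it with a trapezoid rule and compares the result with the familiar Gaussian density. The function names, the truncation range `t_max` and the step count are assumptions chosen for illustration.

```python
import cmath
import math

def phi_normal(t, mu, sigma):
    """Characteristic function exp(-sigma^2 t^2 / 2) * exp(i t mu)."""
    return cmath.exp(-0.5 * sigma**2 * t**2 + 1j * t * mu)

def density_from_phi(x, mu, sigma, t_max=10.0, steps=2000):
    """Trapezoid-rule approximation of (1 / 2pi) * integral of exp(-itx) phi(t) dt.

    t_max must be large enough that phi is negligible beyond it
    (fine for sigma around 1; a smaller sigma needs a larger t_max).
    """
    h = 2 * t_max / steps
    total = 0.0
    for k in range(steps + 1):
        t = -t_max + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        # Only the real part is summed: the imaginary part integrates to zero by symmetry.
        total += weight * (cmath.exp(-1j * t * x) * phi_normal(t, mu, sigma)).real
    return total * h / (2 * math.pi)

def gaussian_pdf(x, mu, sigma):
    """Closed-form Gaussian density, for comparison."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / math.sqrt(2 * math.pi * sigma**2)

mu, sigma = 1.0, 1.5  # illustrative values
for x in (-1.0, 0.0, 1.0, 2.5):
    print(x, density_from_phi(x, mu, sigma), gaussian_pdf(x, mu, sigma))
```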

## Convergence

### Convergence in probability and o-notation

#### Introduction

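$$X_n$$ converges in probability to $$X$$ if:

$$P(d(X_n, X)>\epsilon )\rightarrow 0$$

as $$n \rightarrow \infty$$, for all $$\epsilon >0$$, where $$d(X_n, X)$$ is a distance metric. We write this as:

$$X_n \rightarrow^P X$$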
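As a concrete illustration, take $$X_n$$ to be the average of $$n$$ fair coin flips, $$X=\dfrac{1}{2}$$, and the absolute difference as the distance. These choices are assumptions made for the example. The probability of being more than $$\epsilon$$ away can then be computed exactly from binomial coefficients, and it falls towards $$0$$ as $$n$$ grows.

```python
from fractions import Fraction
from math import comb

def prob_far_from_half(n, eps):
    """Exact P(|average of n fair coin flips - 1/2| > eps)."""
    half = Fraction(1, 2)
    # Count the outcomes whose average lies strictly more than eps from 1/2.
    count = sum(
        comb(n, k) for k in range(n + 1) if abs(Fraction(k, n) - half) > eps
    )
    return count / 2**n

eps = Fraction(1, 10)
probs = [prob_far_from_half(n, eps) for n in (10, 100, 1000)]
print(probs)  # decreasing towards 0
```

Exact fractions are used for the comparison so that boundary cases such as $$k/n=0.6$$ are not misclassified by floating-point rounding.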

#### Little o notation

Little o notation is used to describe convergence in probability.

$$X_n=o_p(a_n)$$

means that

$$\dfrac{X_n}{a_n}$$

converges in probability to $$0$$ as $$n \rightarrow \infty$$.

This can be written:

$$\dfrac{X_n}{a_n}=o_p(1)$$

#### Big O notation

Big O notation is used to describe boundedness.

$$X_n=O_p(a_n)$$

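means that $$\dfrac{X_n}{a_n}$$ is bounded in probability: for every $$\epsilon >0$$ there exist a finite $$M$$ and $$N$$ such that

$$P(|\dfrac{X_n}{a_n}|>M)<\epsilon$$

for all $$n>N$$.

If a sequence is $$o_p(a_n)$$, it is also $$O_p(a_n)$$.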

### Almost sure convergence

$$X_n$$ converges almost surely to $$X$$ if:

$$P(\lim_{n\rightarrow \infty }d(X_n, X)=0)=1$$

Where $$d(X_n, X)$$ is a distance metric.

$$X_n\rightarrow^{as} X$$

## Gaussian distributions

### Gaussian

$$f_x=\dfrac{1}{\sqrt {2\pi \sigma^2 }} e^{-\dfrac{(x-\mu)^2}{2\sigma^2 }}$$
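A quick numerical check of the Gaussian density, with $$\sigma^2$$ in the denominator of the exponent: it should integrate to $$1$$ and have mean $$\mu$$ and variance $$\sigma^2$$. The sketch below uses a midpoint rule. The function names, integration width and step count are assumptions for illustration.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / math.sqrt(2 * math.pi * sigma**2)

def moments(mu, sigma, width=10.0, steps=20_000):
    """Midpoint-rule estimates of total mass, mean and variance over mu +/- width * sigma."""
    lo = mu - width * sigma
    h = 2 * width * sigma / steps
    mass = mean = second = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        w = gaussian_pdf(x, mu, sigma) * h
        mass += w
        mean += x * w
        second += x * x * w
    # Variance as E[X^2] - E[X]^2 (valid because the total mass is 1 up to rounding).
    return mass, mean, second - mean**2

mass, mean, var = moments(mu=1.0, sigma=2.0)
print(mass, mean, var)  # close to 1, 1 and 4
```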

### Multivariable Gaussian distribution

#### Definition

For univariate:

$$x \sim N(\mu, \sigma^2 )$$

We define the multivariate Gaussian distribution as the distribution where every linear combination of the components is Gaussian.

For multivariate:

$$X \sim N(\mu, \Sigma )$$

Where $$\mu$$ is now a vector, and $$\Sigma$$ is the covariance matrix.

The density function, where $$n$$ is the number of components of $$x$$, is:

$$f_x=\dfrac{1}{\sqrt {(2\pi )^n|\Sigma |}} e^{-\dfrac{1}{2}(x-\mu )^T\Sigma^{-1}(x-\mu)}$$

For the univariate Gaussian it is:

$$f_x=\dfrac{1}{\sqrt {2\pi \sigma^2}} e^{-\dfrac{1}{2\sigma^2}(x-\mu )^2}$$

This is the same as the multivariate density when $$n=1$$.
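The multivariate formula can be sketched for the two-dimensional case, where the determinant and inverse of $$\Sigma$$ have simple closed forms. With a diagonal covariance matrix the components are independent, so the joint density should equal the product of two univariate densities. The function names and test values are assumptions for illustration.

```python
import math

def gaussian_pdf(x, mu, var):
    """Univariate density of N(mu, var) at x."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def bivariate_gaussian_pdf(x, mu, cov):
    """Density of N(mu, cov) at the point x for dimension n = 2.

    cov is a symmetric 2x2 matrix given as nested lists; it must be non-degenerate.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be non-degenerate")
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    # (x - mu)^T cov^{-1} (x - mu), using the 2x2 inverse (1/det) * [[d, -b], [-c, a]].
    quad = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    return math.exp(-0.5 * quad) / math.sqrt((2 * math.pi) ** 2 * det)

x, mu = (0.5, 1.0), (1.0, -2.0)
joint = bivariate_gaussian_pdf(x, mu, [[4.0, 0.0], [0.0, 9.0]])
product = gaussian_pdf(x[0], mu[0], 4.0) * gaussian_pdf(x[1], mu[1], 9.0)
print(joint, product)  # equal up to rounding
```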

#### Singular Gaussians

The density function needs the determinant $$|\Sigma |$$ and the inverse $$\Sigma^{-1}$$. These rely on the covariance matrix not being degenerate.

If the covariance matrix is degenerate, we can instead use the pseudo-inverse and the pseudo-determinant.