Independent and identically distributed variables

Independent and Identically Distributed (IID) variables

Convergence

IID

Identically distributed

\(X\) is identically distributed to \(Y\) if both have the same distribution function:

\(\forall a \; P(X\le a)=P(Y\le a)\)

Covariance matrix of IID variables

For IID variables with common variance \(\sigma^2 \), the covariance matrix is:

\(\Sigma = \sigma^2 I\)
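As a quick numerical illustration, a minimal sketch using NumPy (the dimension, sample size, and normal distribution are arbitrary assumptions): the empirical covariance matrix of IID draws should be close to \(\sigma^2 I\).

```python
# Minimal sketch: the empirical covariance matrix of IID draws should be
# close to sigma^2 * I. Dimension, sample size, and the normal distribution
# are arbitrary assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)
sigma, d, n = 2.0, 3, 100_000

# n IID observations of a d-dimensional vector with independent coordinates
X = rng.normal(loc=1.0, scale=sigma, size=(n, d))

emp_cov = np.cov(X, rowvar=False)  # d x d empirical covariance matrix
print(np.round(emp_cov, 2))        # approximately sigma^2 * I = 4 * I
```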

Lévy’s continuity theorem

If the characteristic functions of a sequence of random variables converge pointwise to the characteristic function of \(X\), then the sequence converges in distribution to \(X\). This theorem lets us prove the results below by working with characteristic functions.

Weak law of large numbers

The sample mean is:

\(\bar X_n=\dfrac{1}{n}\sum_{i=1}^nX_i\)

The variance of this is:

\(Var[\bar X_n]=Var[\dfrac{1}{n}\sum_{i=1}^nX_i]\)

Since the \(X_i\) are independent, the variance of the sum is the sum of the variances:

\(Var[\bar X_n]=\dfrac{1}{n^2}nVar[X]\)

\(Var[\bar X_n]=\dfrac{\sigma^2}{n} \)

We know from Chebyshev’s inequality:

\(P(|X-\mu | \ge k\sigma )\le \dfrac{1}{k^2}\)

Use \(\bar X_n\) as \(X\); its standard deviation is \(\dfrac{\sigma }{\sqrt n}\):

\(P(|\bar X_n-\mu | \ge \dfrac{k\sigma }{\sqrt n})\le \dfrac{1}{k^2}\)

Rescale \(k\), substituting \(k:=\dfrac{k\sqrt n}{\sigma}\):

\(P(|\bar X_n-\mu | \ge k)\le \dfrac{\sigma^2}{nk^2}\)

As \(n\) increases, the chance that the sample mean lies outside a given distance from the population mean approaches \(0\).
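We can check this by simulation. A minimal sketch, assuming exponentially distributed samples (so \(\mu =\sigma^2 =1\)) and an arbitrary threshold \(k\); both the simulated probability and the Chebyshev bound should fall as \(n\) grows.

```python
# Minimal sketch: compare the simulated probability that the sample mean
# deviates from mu by at least k with the Chebyshev bound sigma^2/(n k^2).
# Exponential(1) samples (mu = sigma^2 = 1) and the value of k are assumptions.
import numpy as np

rng = np.random.default_rng(1)
mu, sigma2, k, trials = 1.0, 1.0, 0.1, 20_000

for n in (10, 100, 1000):
    means = rng.exponential(scale=1.0, size=(trials, n)).mean(axis=1)
    p_emp = np.mean(np.abs(means - mu) >= k)
    bound = sigma2 / (n * k**2)  # may exceed 1, in which case it is vacuous
    print(f"n={n:5d}  simulated {p_emp:.4f}  Chebyshev bound {bound:.4f}")
```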

Central limit theorem

This generalises the weak law of large numbers: it describes not just that the sample mean converges to \(\mu \), but the distribution of its fluctuations around \(\mu \).

Characteristic function of summed IID events

\(Z=\sum_{i=1}^nY_i\)

\(\phi_Z(t)=E[e^{itZ}]\)

\(\phi_Z(t)=E[e^{it\sum_{i=1}^nY_i}]\)

\(\phi_Z(t)=E[\prod_{i=1}^ne^{itY_i}]\)

Since the \(Y_i\) are independent, the expectation of the product factorises; since they are identically distributed, each factor is the same:

\(\phi_Z(t)=E[e^{itY}]^n\)

\(\phi_Z(t)=\phi_Y(t)^n\)
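A Monte Carlo sketch of this identity (the uniform distribution, the value of \(n\), and the evaluation point \(t\) are arbitrary assumptions):

```python
# Minimal sketch: Monte Carlo check that phi_Z(t) = phi_Y(t)^n for a sum of
# n IID variables. The uniform distribution, n, and t are assumptions.
import numpy as np

rng = np.random.default_rng(2)
n, samples, t = 5, 200_000, 0.7

Y = rng.uniform(-1.0, 1.0, size=(samples, n))  # n IID copies per row
Z = Y.sum(axis=1)

phi_Z = np.mean(np.exp(1j * t * Z))        # estimate of E[e^{itZ}]
phi_Y = np.mean(np.exp(1j * t * Y[:, 0]))  # estimate of E[e^{itY}]
print(np.round(phi_Z, 4), np.round(phi_Y**n, 4))  # should be close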

Taylor series: the first moments dominate

\(Z=\sum_{i=1}^nY_i\)

\(Y=\dfrac{X}{n}\)

\(\phi_Z(t)=\phi_Y(t)^n\)

\(\phi_Z(t)=\phi_{\dfrac{X}{n}}(t)^n\)

Using \(\phi_{aX}(t)=\phi_X(at)\):

\(\phi_Z(t)=\phi_X(\dfrac{t}{n})^n\)

The Taylor expansion of \(\phi_X(t)=E[e^{itX}]\) around \(t=0\), using \(E[X]=\mu_X \) and \(E[X^2]=\mu_X^2 +\sigma_X^2 \), is:

\(\phi_X(t)=1+it\mu_X -\dfrac{(\mu_X^2 +\sigma_X^2 )t^2}{2} +\sum_{j=3}^{\infty }\dfrac{E[X^j](it)^j}{j!}\)

\(\phi_X(\dfrac{t}{n})=1+i\dfrac{t\mu_X }{n}-\dfrac{(\mu_X^2 +\sigma_X^2 )(\dfrac{t}{n})^2}{2} +\sum_{j=3}^{\infty }\dfrac{E[X^j](i\dfrac{t}{n})^j}{j!}\)

\(\phi_X(\dfrac{t}{n})=1+i\dfrac{t\mu_X }{n}-\dfrac{(\mu_X^2 +\sigma_X^2 )t^2}{2n^2} +\sum_{j=3}^{\infty }\dfrac{E[X^j](i\dfrac{t}{n})^j}{j!}\)
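A quick check of the truncated expansion, a sketch where \(X\sim N(1, 4)\) and the grid of \(t\) values are assumptions: for small \(t\), a Monte Carlo estimate of \(E[e^{itX}]\) should match \(1+it\mu_X -\dfrac{(\mu_X^2+\sigma_X^2)t^2}{2}\).

```python
# Minimal sketch: for small t, a Monte Carlo estimate of E[e^{itX}] should
# match the truncated expansion 1 + it*mu - (mu^2 + sigma^2) t^2 / 2.
# X ~ Normal(1, 4) and the t values are assumptions.
import numpy as np

rng = np.random.default_rng(3)
mu, sigma = 1.0, 2.0
X = rng.normal(mu, sigma, size=500_000)

for t in (0.01, 0.1, 0.3):
    mc = np.mean(np.exp(1j * t * X))
    taylor = 1 + 1j * t * mu - (mu**2 + sigma**2) * t**2 / 2
    print(t, np.round(mc, 5), np.round(taylor, 5))
```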

Eliminating the imaginary term

We want the mean of \(Y\) to be \(0\), so that the imaginary first-order term vanishes.

\(Z=\sum_{i=1}^nY_i\)

\(Y=\dfrac{X-\mu_X }{n}\)

\(\phi_Y(t)=1+it\mu_Y -\dfrac{(\mu_Y^2 +\sigma_Y^2 )t^2}{2} +\sum_{j=3}^{\infty }\dfrac{E[Y^j](it)^j}{j!}\)

\(\mu_Y =E[\dfrac{X-\mu_X }{n}] =\dfrac{\mu_X -\mu_X }{n}=0\)

\(\phi_Y(t)=1-\dfrac{\sigma_Y^2t^2}{2} +\sum_{j=3}^{\infty }\dfrac{E[Y^j](it)^j}{j!}\)

\(\sigma^2_Y =E[(\dfrac{X-\mu_X }{n})^2]\)

\(\sigma^2_Y =E[\dfrac{X^2+\mu^2_X-2X\mu_X }{n^2}]\)

\(\sigma^2_Y =\dfrac{E[X^2]+E[\mu^2_X]-E[2X\mu_X] }{n^2}\)

\(\sigma^2_Y =\dfrac{E[X^2]-\mu^2_X}{n^2}\)

\(\sigma^2_Y =\dfrac{\sigma^2_X}{n^2}\)

\(\phi_Y(t)=1-\dfrac{\sigma_X^2t^2}{2n^2} +\sum_{j=3}^{\infty }\dfrac{E[(\dfrac{X-\mu}{n})^j](it)^j}{j!}\)

\(\phi_Z(t)=\phi_Y(t)^n\)

\(\phi_Z(t)=[1-\dfrac{\sigma_X^2t^2}{2n^2} +\sum_{j=3}^{\infty }\dfrac{E[(\dfrac{X-\mu}{n})^j](it)^j}{j!}]^n\)

The terms in the sum scale as \(n^{-j}\) for \(j\ge 3\), so they vanish faster than the quadratic term and can be dropped for large \(n\):

\(\phi_Z(t)=[1-\dfrac{\sigma_X^2t^2}{2n^2}]^n\)
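A sketch showing the effect of centering (the exponential distribution, \(n\), and \(t\) are assumptions): the imaginary part of the empirical characteristic function is much smaller once the mean is subtracted.

```python
# Minimal sketch: centering X removes the first-order imaginary term, so the
# imaginary part of the empirical characteristic function of Y shrinks.
# Exponential(1) samples (mu_X = 1), n, and t are assumptions.
import numpy as np

rng = np.random.default_rng(4)
n, t = 10, 0.5
X = rng.exponential(scale=1.0, size=500_000)  # mu_X = 1

phi_raw = np.mean(np.exp(1j * t * X / n))          # Y = X / n
phi_ctr = np.mean(np.exp(1j * t * (X - 1.0) / n))  # Y = (X - mu_X) / n
print("Im phi, uncentered:", round(phi_raw.imag, 5))
print("Im phi, centered:  ", round(phi_ctr.imag, 5))
```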

Eliminating \(\sigma^2 \)

\(Z=\sum_{i=1}^nY_i\)

\(Y=\dfrac{X-\mu_X }{\sigma n}\)

\(\phi_Y(t)=1+it\mu_Y -\dfrac{(\mu_Y^2 +\sigma_Y^2 )t^2}{2} +\sum_{j=3}^{\infty }\dfrac{E[Y^j](it)^j}{j!}\)

\(\mu_Y =E[\dfrac{X-\mu_X }{\sigma_X n}] =\dfrac{\mu_X -\mu_X }{\sigma_X n}=0\)

\(\phi_Y(t)=1-\dfrac{\sigma_Y^2t^2}{2} +\sum_{j=3}^{\infty }\dfrac{E[Y^j](it)^j}{j!}\)

\(\sigma^2_Y =E[(\dfrac{X-\mu_X }{\sigma n})^2]\)

\(\sigma^2_Y =E[\dfrac{X^2+\mu^2_X-2X\mu_X }{\sigma^2 n^2}]\)

\(\sigma^2_Y =\dfrac{E[X^2]+\mu^2_X-2E[X]\mu_X }{\sigma^2 n^2}\)

\(\sigma^2_Y =\dfrac{E[X^2]-\mu^2_X}{\sigma^2 n^2}\)

\(\sigma^2_Y =\dfrac{\sigma^2_X}{\sigma^2 n^2}\)

\(\sigma^2_Y =\dfrac{1}{n^2}\)

\(\phi_Y(t)=1-\dfrac{t^2}{2n^2} +\sum_{j=3}^{\infty }\dfrac{E[(\dfrac{X-\mu}{\sigma n})^j](it)^j}{j!}\)

\(\phi_Z(t)=\phi_Y(t)^n\)

\(\phi_Z(t)=[1-\dfrac{t^2}{2n^2} +\sum_{j=3}^{\infty }\dfrac{E[(\dfrac{X-\mu}{\sigma n})^j](it)^j}{j!}]^n\)

Dropping the higher-order terms as before:

\(\phi_Z(t)=[1-\dfrac{t^2}{2n^2}]^n\)

Preparing for exponential expansion

We know that:

\((1+\dfrac{x}{n})^n\rightarrow e^x\)

as \(n \rightarrow \infty\).

With:

\(Z=\sum_{i=1}^nY_i\)

\(Y=\dfrac{X-\mu_X }{\sigma n}\)

We have:

\(\phi_Z(t)=[1-\dfrac{t^2}{2n^2}]^n\)

As \(n\rightarrow \infty \) this tends to \(1\), a degenerate limit: dividing by \(n\) shrinks the sum too much, so we rescale by \(\sqrt n\) instead.

With:

\(Z=\sum_{i=1}^nY_i\)

\(Y=\dfrac{X-\mu_X }{\sigma \sqrt n}\)

We have:

\(\phi_Z(t)=[1-\dfrac{t^2}{2n}]^n\)

Which, by the exponential limit above, tends towards:

\(\phi_Z(t)=e^{-\dfrac{1}{2}t^2}\)
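A numerical sketch of the two scalings (the value of \(t\) is arbitrary): the \(\sqrt n\) scaling approaches \(e^{-t^2/2}\), while the \(n\) scaling degenerates to \(1\).

```python
# Minimal sketch: (1 - t^2/(2n))^n approaches e^{-t^2/2}, while the earlier
# (1 - t^2/(2n^2))^n scaling degenerates to 1. The value of t is arbitrary.
import numpy as np

t = 1.5
for n in (10, 100, 10_000):
    sqrt_n_scaling = (1 - t**2 / (2 * n)) ** n  # Y = (X - mu)/(sigma sqrt(n))
    n_scaling = (1 - t**2 / (2 * n**2)) ** n    # Y = (X - mu)/(sigma n)
    print(f"n={n:6d}  {sqrt_n_scaling:.5f}  {n_scaling:.5f}")
print("limit exp(-t^2/2) =", round(np.exp(-t**2 / 2), 5))
```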

Rescaling

The sum of the variables, less their means and divided by \(\sigma \sqrt n\) (equivalently, \(\dfrac{\sqrt n (\bar X_n - \mu )}{\sigma }\)), follows a standard normal distribution as \(n\) increases.

What does this say about the actual distribution of sample averages?

\(Z=\sum_{i=1}^nY_i\)

\(Y_i=\dfrac{X_i-\mu_X }{\sigma_X \sqrt n}\)


Let’s create \(Q\), which undoes this standardisation.

\(Q=\dfrac{Z\sigma_X }{\sqrt n}+\mu_X\)

\(Q=\dfrac{(\sum_{i=1}^nY_i)\sigma_X }{\sqrt n}+\mu_X\)

\(Q=\dfrac{(\sum_{i=1}^n(\dfrac{X_i-\mu_X }{\sigma_X \sqrt n}))\sigma_X }{\sqrt n}+\mu_X\)

\(Q=\sum_{i=1}^n(\dfrac{X_i-\mu_X }{n})+\mu_X\)

\(Q=\sum_{i=1}^n(\dfrac{X_i-\mu_X }{n}+\dfrac{\mu_X}{n})\)

\(Q=\sum_{i=1}^n(\dfrac{X_i}{n})\)

This is the sample average.

\(\phi_Q(t)=\phi_{\dfrac{Z\sigma_X }{\sqrt n}+\mu_X}(t)\)

Using the identity \(\phi_{aZ+b}(t)=e^{itb}\phi_Z(at)\):

\(\phi_Q(t)=\phi_Z(\dfrac{t\sigma_X }{\sqrt n})e^{it\mu_X}\)

\(\phi_Z(\dfrac{t\sigma_X }{\sqrt n})=e^{-\dfrac{1}{2}(\dfrac{t\sigma_X }{\sqrt n})^2}\)

\(\phi_Z(\dfrac{t\sigma_X }{\sqrt n})=e^{-\dfrac{1}{2}\dfrac{t^2\sigma^2_X }{n}}\)

\(\phi_Q(t)=e^{-\dfrac{1}{2}\dfrac{t^2\sigma^2_X }{n}}e^{it\mu_X}\)
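A Monte Carlo sketch, where the exponential distribution (so \(\mu =\sigma^2 =1\)) and the values of \(n\) and \(t\) are assumptions: the empirical characteristic function of the sample mean should be close to \(e^{-\frac{\sigma^2_Xt^2}{2n}}e^{it\mu_X}\).

```python
# Minimal sketch: the empirical characteristic function of the sample mean Q
# should be close to exp(-sigma^2 t^2 / (2n)) * exp(i t mu).
# Exponential(1) samples (mu = sigma^2 = 1), n, and t are assumptions.
import numpy as np

rng = np.random.default_rng(5)
mu, sigma2, n, trials, t = 1.0, 1.0, 50, 200_000, 2.0

Q = rng.exponential(scale=1.0, size=(trials, n)).mean(axis=1)

phi_emp = np.mean(np.exp(1j * t * Q))
phi_clt = np.exp(-sigma2 * t**2 / (2 * n)) * np.exp(1j * t * mu)
print(np.round(phi_emp, 4), np.round(phi_clt, 4))  # should be close
```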

Normal distribution

We name the distribution whose characteristic function is this function with \(n=1\) the normal distribution \(N(\mu_X, \sigma^2_X)\):

\(\phi(t)=e^{-\dfrac{1}{2}t^2\sigma^2_X }e^{it\mu_X}\)

Equivalently, the sample mean above approximately follows \(N(\mu_X, \dfrac{\sigma^2_X}{n})\).

Getting the probability distribution function

\(\phi_X(t)=e^{-\dfrac{1}{2}t^2\sigma^2_X} e^{it\mu_X}\)

Expanding with Euler’s formula:

\(\phi_X(t)=e^{-\dfrac{1}{2}t^2\sigma^2_X}[\cos (t\mu_X )+i\sin (t\mu_X)]\)

Inverting this characteristic function (an inverse Fourier transform) gives the probability density function:

\(f(x)=\dfrac{1}{\sigma_X \sqrt{2\pi }}e^{-\dfrac{(x-\mu_X)^2}{2\sigma_X^2}}\)
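A Monte Carlo sketch of this characteristic function, where \(\mu_X=1\), \(\sigma_X=2\) and the \(t\) values are assumptions:

```python
# Minimal sketch: Monte Carlo check that E[e^{itX}] matches
# exp(-t^2 sigma^2 / 2) * (cos(t mu) + i sin(t mu)) for X ~ Normal(mu, sigma^2).
# mu = 1, sigma = 2 and the t values are assumptions.
import numpy as np

rng = np.random.default_rng(6)
mu, sigma = 1.0, 2.0
X = rng.normal(mu, sigma, size=1_000_000)

for t in (0.2, 0.5, 1.0):
    mc = np.mean(np.exp(1j * t * X))
    formula = np.exp(-0.5 * t**2 * sigma**2) * (np.cos(t * mu) + 1j * np.sin(t * mu))
    print(t, np.round(mc, 4), np.round(formula, 4))
```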

Convergence in distribution (weak convergence)

\(X_n\) converges in distribution to \(X\) if \(F_{X_n}(x)\rightarrow F_X(x)\) at every point where \(F_X\) is continuous. This is the sense of convergence in the central limit theorem.

Convergence in probability and o-notation

Introduction

\(X_n\) converges in probability to \(X\) if, for all \(\epsilon >0\):

\(P(|X_n-X|>\epsilon )\rightarrow 0\)

This is written:

\(X_n \rightarrow^P X\)

Little o notation

Little o notation is used to describe convergence in probability.

\(X_n=o_p(a_n)\)

means that

\(\dfrac{X_n}{a_n}\)

converges in probability to \(0\) as \(n\rightarrow \infty \). This can be written:

\(\dfrac{X_n}{a_n}=o_p(1)\)

Big O notation

Big O notation is used to describe boundedness in probability.

\(X_n=O_p(a_n)\)

means that \(\dfrac{X_n}{a_n}\) is bounded in probability: for every \(\epsilon >0\) there exist \(M\) and \(N\) such that \(P(|\dfrac{X_n}{a_n}|>M)<\epsilon \) for all \(n>N\).

If something is little o, it is big O.
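A simulation sketch of both notations (standard normal samples, with \(\mu =0\), are an assumption): \(\bar X_n-\mu \) is \(o_p(1)\), its typical size shrinks, while \(\sqrt n(\bar X_n-\mu )\) stays bounded, i.e. \(\bar X_n-\mu =O_p(1/\sqrt n)\).

```python
# Minimal sketch: the deviation of the sample mean is o_p(1) (its typical
# size shrinks to 0) but O_p(1/sqrt(n)) (rescaled by sqrt(n) it stays
# bounded). Standard normal samples (mu = 0) are an assumption.
import numpy as np

rng = np.random.default_rng(7)
for n in (100, 1_000, 10_000):
    dev = rng.normal(0.0, 1.0, size=(2000, n)).mean(axis=1)  # Xbar_n - mu
    q = np.quantile(np.abs(dev), 0.95)
    print(f"n={n:6d}  95th pct |dev|: {q:.5f}  sqrt(n) * |dev|: {np.sqrt(n) * q:.3f}")
```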

Almost sure convergence

\(X_n\) converges almost surely to \(X\) if:

\(P(\lim_{n\rightarrow \infty }d(X_n, X)=0)=1\)

Where \(d(X_n, X)\) is a distance metric; for real random variables, \(d(X_n, X)=|X_n-X|\).

\(X_n\rightarrow^{as} X\)
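A sketch of almost sure convergence via the running mean of a single IID sequence (the exponential distribution, with \(\mu =1\), is an assumption): the whole path settles at \(\mu \), illustrating the strong law of large numbers.

```python
# Minimal sketch of almost sure convergence: the running mean of one IID
# sequence settles at mu along the whole path. Exponential(1) samples
# (mu = 1) are an assumption.
import numpy as np

rng = np.random.default_rng(8)
x = rng.exponential(scale=1.0, size=100_000)
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)
for n in (10, 100, 10_000, 100_000):
    print(f"n={n:6d}  running mean {running_mean[n - 1]:.4f}")
```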