This is the size of the sample.

This is the smallest value in the sample.

This is the largest value in the sample.

This is the difference between the maximum and minimum.
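The size, minimum, maximum and range described above can be read directly off a sample. The snippet below is a minimal sketch using Python built-ins; the data and variable names are illustrative assumptions, not part of the original notes.

```python
# Hypothetical sample, used only for illustration.
sample = [7, 2, 9, 4, 4, 12, 5, 10]

n = len(sample)                   # size of the sample
minimum = min(sample)             # smallest value in the sample
maximum = max(sample)             # largest value in the sample
value_range = maximum - minimum   # difference between maximum and minimum

print(n, minimum, maximum, value_range)  # prints: 8 2 12 10
```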

This is the value below which 50% of the sample lies.

The \(x\)th percentile is the value below which \(x\%\) of the values lie.

This is the difference between the \(75\)th percentile and the \(25\)th percentile.
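The median, percentile and interquartile range can be sketched as follows. The helper `percentile` is a hypothetical name, and it uses linear interpolation between the two nearest sorted values, which is only one of several common conventions; other conventions can give slightly different answers on small samples.

```python
import math

def percentile(values, p):
    """Return the p-th percentile (0 <= p <= 100) of a non-empty sample.

    Assumption: linear interpolation between the closest sorted values.
    """
    ordered = sorted(values)
    position = (p / 100) * (len(ordered) - 1)  # fractional index into sorted data
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

sample = [7, 2, 9, 4, 4, 12, 5, 10]  # hypothetical data

median = percentile(sample, 50)                        # 50th percentile
iqr = percentile(sample, 75) - percentile(sample, 25)  # interquartile range
```

On this sample the median falls halfway between the two middle values, 5 and 7.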

This is the most common value in the sample.
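A sample can have more than one most common value, so a sketch of the mode is safest if it returns every value that attains the highest count. The data below are illustrative.

```python
from collections import Counter

sample = [7, 2, 9, 4, 4, 12, 5, 10]  # hypothetical data

counts = Counter(sample)
highest = max(counts.values())
# Keep every value that reaches the highest count, since ties are possible.
modes = sorted(value for value, count in counts.items() if count == highest)

print(modes)  # prints: [4]
```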

We previously defined the population mean as \(\mu=E[X]\).

The sample mean is defined as \(\bar x = \dfrac{1}{n}\sum_i x_i\).

We can subtract the mean from each entry in the sample. This will leave a new mean of \(0\). This is convenient for many calculations.
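As a small illustration of the sample mean and of centring, the sketch below subtracts the mean from each entry and checks that the new mean is zero up to floating-point rounding. The data are an assumption chosen so the arithmetic is easy to follow.

```python
sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

n = len(sample)
mean = sum(sample) / n                 # sample mean: (1/n) * sum of x_i
centred = [x - mean for x in sample]   # subtract the mean from each entry
centred_mean = sum(centred) / n        # mean of the centred sample

print(mean)  # prints: 5.0
```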

We previously defined the population variance as \(\sigma^2=E[(X-\mu)^2]\).

We define the sample variance as \(\sigma^2=\dfrac{1}{n}\sum_i(x_i-\bar x)^2\).

We can calculate this using matrices:

\(M=X-\bar x\)

\(\sigma^2=\dfrac{1}{n}M^TM\).

If \(\bar x =0\) then:

\(\sigma^2=\dfrac{1}{n}X^TX\).
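The sample variance above can be sketched in plain Python, with a list standing in for the column vector \(X\), so that \(M^TM\) becomes a sum of squared deviations. The cross-check uses `statistics.pvariance`, which also divides by \(n\); the data are illustrative.

```python
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

n = len(sample)
mean = sum(sample) / n
m = [x - mean for x in sample]         # M = X - x_bar
variance = sum(d * d for d in m) / n   # (1/n) * M^T M

# statistics.pvariance uses the same 1/n convention as the formula above.
reference = statistics.pvariance(sample)

print(variance)  # prints: 4.0
```

Note that `statistics.variance` divides by \(n-1\) instead, so it would not match the definition used here.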

We previously defined the population covariance as \(\sigma_{XY}=E[(X-\mu_X)^T(Y-\mu_Y)]\).

We define the sample covariance as \(\sigma_{XY}=\dfrac{1}{n}\sum_i(x_i-\bar x)(y_i-\bar y)\).

We can calculate this using matrices:

\(M=X-\bar x\)

\(N=Y-\bar y\)

\(\sigma_{XY}=\dfrac{1}{n}M^TN\).
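The covariance calculation above can be sketched in the same way, with lists standing in for the vectors \(M\) and \(N\). The function name `sample_covariance` and the data are assumptions for illustration.

```python
def sample_covariance(xs, ys):
    """Covariance with the 1/n convention used above; xs and ys have equal length."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    m = [x - mean_x for x in xs]     # M = X - x_bar
    nn = [y - mean_y for y in ys]    # N = Y - y_bar
    return sum(a * b for a, b in zip(m, nn)) / n   # (1/n) * M^T N

x = [1, 2, 3, 4, 5]    # hypothetical data
y = [2, 4, 6, 8, 10]   # hypothetical data

cov_xy = sample_covariance(x, y)
print(cov_xy)  # prints: 4.0
```

Passing the same list twice returns the sample variance of that list.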

\(\rho_{XY}=\dfrac{\sigma_{XY}}{\sigma_X \sigma_Y}\)

If we have \(k\) variables, each observed \(n\) times, we can form a \(k\times k\) matrix \(\Sigma\) where:

\(\Sigma_{ij} = \sigma_{ij}=\dfrac{1}{n}(X_i-\bar x_i)^T(X_j-\bar x_j)\)
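A minimal sketch of building this matrix, one entry per pair of variables, is shown below. The function name `covariance_matrix`, the column-per-variable layout and the data are assumptions for illustration.

```python
def covariance_matrix(columns):
    """Build the covariance matrix from a list of equal-length columns.

    Each column holds the observations of one variable; the divisor is the
    number of observations, matching the 1/n convention above.
    """
    n = len(columns[0])
    centred = []
    for column in columns:
        mean = sum(column) / n
        centred.append([value - mean for value in column])
    # Entry (i, j) is the covariance between variable i and variable j.
    return [
        [sum(a * b for a, b in zip(ci, cj)) / n for cj in centred]
        for ci in centred
    ]

columns = [
    [1, 2, 3, 4, 5],     # hypothetical variable 1
    [2, 4, 6, 8, 10],    # hypothetical variable 2
    [5, 3, 4, 1, 2],     # hypothetical variable 3
]
sigma = covariance_matrix(columns)
```

The diagonal of `sigma` holds the variances, and the matrix is symmetric because the covariance does not depend on the order of its arguments.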

If \(\bar x = \bar y = 0\) then:

\(\sigma_{XY}=\dfrac{1}{n}X^TY\)

In the correlation matrix, each entry is the correlation rather than the covariance.

The Pearson correlation coefficient is defined as the covariance normalised by the product of the individual standard deviations.

It ranges from \(-1\) (total negative linear correlation) through \(0\) (no linear correlation) to \(1\) (total positive linear correlation).

\(\rho_{X,Y}=\dfrac{cov(X,Y)}{\sigma_X\sigma_Y}\)
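The coefficient can be sketched directly from this formula. The function name `pearson` and the data are assumptions; the \(1/n\) factors in the covariance and the standard deviations cancel, so the choice of divisor does not affect the result.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

x = [1, 2, 3, 4, 5]             # hypothetical data
y_up = [2, 4, 6, 8, 10]         # increases linearly with x
y_down = [10, 8, 6, 4, 2]       # decreases linearly with x

r_up = pearson(x, y_up)         # close to 1
r_down = pearson(x, y_down)     # close to -1
```

The function divides by zero if either variable is constant, since its standard deviation is then zero.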

For each of the two variables we rank its values.

From \(X\) and \(Y\) we then have \(R_X\) and \(R_Y\).

We then calculate the Pearson correlation coefficient between the rankings.

\(r_S=\dfrac{cov(R_X, R_Y)}{\sigma_{R_X}\sigma_{R_Y}}\)
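The rank-then-correlate procedure can be sketched as below. The helper names are hypothetical, and `ranks` assumes there are no tied values; handling ties (for example by averaging ranks) is left out of this sketch.

```python
import math

def ranks(values):
    """Rank values from 1 (smallest) to n (largest). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

def spearman(xs, ys):
    """Pearson correlation computed on the rankings R_X and R_Y."""
    return pearson(ranks(xs), ranks(ys))

x = [1, 2, 3, 4, 5]         # hypothetical data
y = [v ** 3 for v in x]     # monotone in x, but not linear

r_s = spearman(x, y)        # close to 1: the rankings agree exactly
r = pearson(x, y)           # below 1: the relationship is not linear
```

The example shows the practical difference between the two coefficients: the rank-based one measures monotone association, while Pearson measures linear association.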

\(\bar x_{n+1} = \dfrac{n\bar x_n+x_{n+1}}{n+1}\)
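This update lets the mean be maintained over a stream without storing earlier observations. A minimal sketch follows; the function name `update_mean` and the data are assumptions.

```python
def update_mean(mean_n, n, new_value):
    """Return the mean of n + 1 values, given the mean of the first n and the new value."""
    return (n * mean_n + new_value) / (n + 1)

stream = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations arriving one at a time

mean = 0.0
n = 0
for value in stream:
    mean = update_mean(mean, n, value)
    n += 1

batch_mean = sum(stream) / len(stream)  # computed all at once, for comparison
```

The running value agrees with the batch mean up to floating-point rounding.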

If the sample is centred:

\(\sigma_n^2=\dfrac{1}{n}X_n^TX_n\)

So:

\(\sigma_{n+1}^2=\dfrac{n\sigma_n^2 +x_{n+1}^Tx_{n+1}}{n+1}\)
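The step behind this update can be written out as follows, assuming the data are centred about a fixed mean of \(0\), so that the new observation does not shift the mean. Appending \(x_{n+1}\) to \(X_n\) adds one term to the sum of squares:

\(X_{n+1}^TX_{n+1}=X_n^TX_n+x_{n+1}^Tx_{n+1}\)

Substituting \(X_n^TX_n=n\sigma_n^2\):

\(\sigma_{n+1}^2=\dfrac{1}{n+1}X_{n+1}^TX_{n+1}=\dfrac{n\sigma_n^2+x_{n+1}^Tx_{n+1}}{n+1}\)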

If both samples are centred:

\(\sigma_{XY}^n=\dfrac{1}{n}X_n^TY_n\)

So:

\(\sigma_{XY}^{n+1}=\dfrac{n\sigma^n_{XY}+x_{n+1}^Ty_{n+1}}{n+1}\)
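The variance and covariance updates have the same shape, so one helper can sketch both. The function name `update_second_moment` is hypothetical, and the data are assumed to be already centred, as the formulas above require.

```python
def update_second_moment(moment_n, n, a_new, b_new):
    """One update step for centred data.

    With a_new == b_new this updates a variance; otherwise a covariance.
    """
    return (n * moment_n + a_new * b_new) / (n + 1)

xs = [-2, -1, 0, 1, 2]   # hypothetical data, already centred
ys = [-4, -2, 0, 2, 4]   # hypothetical data, already centred

var_x = 0.0
cov_xy = 0.0
for n, (x, y) in enumerate(zip(xs, ys)):
    # n is the number of observations seen before this one.
    var_x = update_second_moment(var_x, n, x, x)
    cov_xy = update_second_moment(cov_xy, n, x, y)

batch_var = sum(x * x for x in xs) / len(xs)               # (1/n) X^T X
batch_cov = sum(x * y for x, y in zip(xs, ys)) / len(xs)   # (1/n) X^T Y
```

If the mean is not known in advance and has to be estimated from the stream, these updates no longer apply as written, because each new observation would also move the centre.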

We have \(X\), and we want to split this into \(m\) different matrices.

We can do this by creating an indicator vector \(v\) for each group, whose entry is \(1\) if the corresponding observation is in the group and \(0\) otherwise.

We then select \(X[v]\).

Alternatively we can compute \(\operatorname{diag}(v)X\), which zeroes the rows outside the group; however we must then trim those extra rows.
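The indicator-based selection can be sketched with a list of rows standing in for \(X\). The group labels, the helper names `indicator` and `select_rows`, and the data are all assumptions for illustration.

```python
X = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
    [7.0, 8.0],
]                        # hypothetical data, one observation per row
labels = [0, 1, 0, 2]    # hypothetical group label for each row
m = 3                    # number of groups

def indicator(labels, group):
    """Return v: 1 where the row belongs to the group, 0 otherwise."""
    return [1 if label == group else 0 for label in labels]

def select_rows(rows, v):
    """Keep only the rows whose indicator entry is 1, i.e. X[v]."""
    return [row for row, flag in zip(rows, v) if flag == 1]

groups = [select_rows(X, indicator(labels, g)) for g in range(m)]

print(groups[0])  # prints: [[1.0, 2.0], [5.0, 6.0]]
```

Each row of `X` ends up in exactly one of the `m` matrices, provided every label is between `0` and `m - 1`.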