# Summary statistics and visualisation for one variable

## Basic statistics for a single variable

### N

This is the size of the sample.

### Sample range

#### Minimum

This is the smallest value in the sample.

#### Maximum

This is the largest value in the sample.

#### Range

This is the difference between the maximum and minimum.

#### Median

This is the value below which 50% of the sample lies.

#### Percentiles

The \(x\)th percentile is the value below which \(x\%\) of the sample lies.

#### Interquartile range

This is the difference between the \(75\)th percentile and the \(25\)th percentile.
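The order-based statistics above can be sketched with Python's standard library. The sample values are made up for illustration. Several conventions exist for interpolating percentiles between data points; the `"inclusive"` method used here is one choice, not the only one.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
sample = [2, 4, 4, 5, 7, 9, 10, 12]

n = len(sample)                      # N: the size of the sample
minimum = min(sample)                # smallest value
maximum = max(sample)                # largest value
sample_range = maximum - minimum     # maximum minus minimum
median = statistics.median(sample)   # 50% of the sample lies below this

# With n=4, quantiles() returns the three quartile cut points:
# the 25th, 50th and 75th percentiles.
q1, q2, q3 = statistics.quantiles(sample, n=4, method="inclusive")
iqr = q3 - q1                        # interquartile range

print(n, minimum, maximum, sample_range, median, q1, q3, iqr)
```

Other interpolation methods can give slightly different quartiles on small samples, so the interquartile range depends on the convention chosen.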

### Sample mode

This is the most common value in the sample.
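As a small sketch, the mode can be found with the standard library. The samples are hypothetical. Note that a sample may have more than one most common value, in which case `statistics.multimode` returns all of them.

```python
import statistics

# Hypothetical samples, chosen only for illustration.
unimodal = [2, 4, 4, 5, 7]
bimodal = [1, 1, 2, 2, 3]

single_mode = statistics.mode(unimodal)     # the most common value
all_modes = statistics.multimode(bimodal)   # every value tied for most common

print(single_mode, all_modes)
```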

## Sample moments

### Sample mean

We previously defined the population mean as \(\mu=E[X]\).

The sample mean is defined as \(\bar x = \dfrac{1}{n}\sum_i x_i\).

#### Centred mean

We can subtract the mean from each entry in the sample. This will leave a new mean of \(0\). This is convenient for many calculations.
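A minimal sketch of the sample mean and of centring, using a made-up sample. In floating-point arithmetic the centred mean may differ from \(0\) by a tiny rounding error, so the check uses a tolerance.

```python
# Hypothetical sample, chosen only for illustration.
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]

n = len(sample)
mean = sum(sample) / n                   # sample mean: (1/n) * sum of x_i

# Centre the sample by subtracting the mean from every entry.
centred = [x - mean for x in sample]
centred_mean = sum(centred) / n          # zero, up to rounding

print(mean, centred, centred_mean)
```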

### Sample variance

We previously defined the population variance as \(\sigma^2=E[(X-\mu)^2]\).

We define the sample variance as \(\sigma^2=\dfrac{1}{n}\sum_i(x_i-\bar x)^2\).

We can calculate this using matrices. Writing the sample as a column vector \(X\):

\(M=X-\bar x\)

\(\sigma^2=\dfrac{1}{n}M^TM\).

#### Centred variance

If \(\bar x =0\) then:

\(\sigma^2=\dfrac{1}{n}X^TX\).
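The following sketch computes the variance as an inner product, treating the sample as a column vector stored in a plain list. The helper name `dot` and the sample values are assumptions made for illustration. The result is compared with `statistics.pvariance`, which also divides by \(n\), matching the definition used here.

```python
import statistics

def dot(u, v):
    """Inner product u^T v of two equal-length vectors (hypothetical helper)."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical sample, chosen only for illustration.
X = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
n = len(X)
mean = sum(X) / n

# M = X - mean, applied to every entry.
M = [x - mean for x in X]
variance = dot(M, M) / n                 # (1/n) * M^T M

# If the data are already centred, X^T X / n gives the same answer directly.
centred_variance = dot(M, M) / len(M)

print(variance, statistics.pvariance(X))
```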

## Other

### Standard error

### Standard deviation

### Sample size

## Updating statistics

### Updating the mean

\(\bar x_{n+1} = \dfrac{n\bar x_n+x_{n+1}}{n+1}\)
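The update rule can be sketched as a small function that folds in one observation at a time without storing the earlier values. The function name `update_mean` and the data are hypothetical.

```python
def update_mean(mean_n, n, x_new):
    """Return the mean of n + 1 values, given the mean of the first n and the new value."""
    return (n * mean_n + x_new) / (n + 1)

# Hypothetical stream of observations.
values = [3.0, 5.0, 10.0, 2.0]

running_mean = values[0]                 # mean of the first observation alone
for n, x in enumerate(values[1:], start=1):
    running_mean = update_mean(running_mean, n, x)

batch_mean = sum(values) / len(values)   # recomputed from scratch for comparison
print(running_mean, batch_mean)
```

Only the current mean and the count need to be kept, which is useful when the data arrive as a stream.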

### Updating the variance

If the sample is centred, so that the mean is \(0\) both before and after the new observation is added:

\(\sigma_n^2=\dfrac{1}{n}X_n^TX_n\)

So:

\(\sigma_{n+1}^2=\dfrac{n\sigma_n^2 +x_{n+1}^Tx_{n+1}}{n+1}\)
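A minimal sketch of this update for scalar observations follows. It assumes the data are centred about a mean that is known to be \(0\), rather than re-estimated after each new point; if the mean has to be re-estimated, this simple rule no longer applies as written. The function name `update_centred_variance` and the data are hypothetical.

```python
def update_centred_variance(var_n, n, x_new):
    """Update the variance of centred data (mean assumed to be 0) with one new value."""
    return (n * var_n + x_new * x_new) / (n + 1)

# Hypothetical observations, assumed to come from a variable with mean 0.
values = [-3.0, 1.0, 2.0, -2.0]

running_var = values[0] ** 2             # variance of one centred observation
for n, x in enumerate(values[1:], start=1):
    running_var = update_centred_variance(running_var, n, x)

# Batch version for comparison: (1/n) * X^T X.
batch_var = sum(x * x for x in values) / len(values)
print(running_var, batch_var)
```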

## Visualising a single continuous variable

### Box and whisker plots

### Density plot