# Statistics

## Creating statistics

### Creating statistics

We take a sample from the distribution.

\(x=(x_1, x_2,...,x_n)\)

A statistic is a function on this sample.

\(S=S(x_1, x_2,...,x_n)\).
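As a minimal sketch of this idea, a statistic can be written as an ordinary function that takes the sample and returns a number. The names `sample_mean` and `sample_range`, and the sample values, are illustrative assumptions rather than anything from these notes.

```python
# A statistic is any function of the observed sample.
# The function names and the data below are illustrative assumptions.

def sample_mean(sample):
    """Return the arithmetic mean of the sample."""
    return sum(sample) / len(sample)


def sample_range(sample):
    """Return the difference between the largest and smallest observation."""
    return max(sample) - min(sample)


sample = [2.0, 4.0, 4.0, 5.0, 10.0]  # x = (x_1, ..., x_n) with n = 5

mean_value = sample_mean(sample)    # S(x_1, ..., x_n) for S = mean
range_value = sample_range(sample)  # a different statistic on the same sample

print(mean_value, range_value)
```

Both functions depend only on the sample, not on any unknown parameter of the distribution, which is what makes them statistics.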

## Moments of statistics

### Bias from single and joint estimation

#### Bias from single estimation

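\(\mathbf x_i\) and \(\mathbf z_i\) are not independent, so we cannot simply estimate \(y_i=\mathbf x_i\theta\) on its own: leaving out \(\mathbf z_i\) pushes part of its effect into the estimate of \(\theta\).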
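A small simulation can illustrate this. The sketch below assumes a scalar \(x_i\) and \(z_i\), a linear \(g(z_i)=2z_i\), and a true \(\theta=1\). These are all choices made for the example, not part of the notes. Regressing \(y_i\) on \(x_i\) alone then recovers a slope well away from the true \(\theta\).

```python
import random

# Simulated data under assumed values: theta = 1, g(z) = 2z, x = 0.8z + noise.
rng = random.Random(0)
n = 20_000
theta = 1.0

xs, ys = [], []
for _ in range(n):
    z = rng.gauss(0.0, 1.0)
    x = 0.8 * z + rng.gauss(0.0, 1.0)               # x is correlated with z
    y = theta * x + 2.0 * z + rng.gauss(0.0, 1.0)   # z also drives y
    xs.append(x)
    ys.append(y)


def ols_slope(xs, ys):
    """Slope from a simple regression of y on x (with intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov_xy / var_x


# Regress y on x only, ignoring z.
naive_theta = ols_slope(xs, ys)
print(naive_theta)
```

In this setup the population slope is \(1 + 2\cdot 0.8/1.64 \approx 1.98\), so the estimate stays close to that value rather than to \(\theta=1\), however large the sample.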

#### Bias from joint estimation

We could estimate our equation with a single ML algorithm.

\(y_i=f(\mathbf x_i, \theta) +g(\mathbf z_i) +\epsilon_i\)

For example, using LASSO.

However, this would introduce bias into our estimates for \(\theta\).
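One way such bias can arise is through the shrinkage that a penalty applies to every coefficient. The sketch below is a deliberately reduced illustration, not the full model above. It assumes a single regressor with no intercept, a true \(\theta=1\), and a penalty weight `lam = 0.2`, all chosen for the example. In that case the LASSO solution for the objective \(\frac{1}{2n}\sum_i (y_i-\theta x_i)^2+\lambda|\theta|\) has a closed form given by soft-thresholding.

```python
import random

# Assumed setup: one regressor, no intercept, theta = 1, penalty lam = 0.2.
rng = random.Random(1)
n = 5_000
theta = 1.0
lam = 0.2

xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [theta * x + rng.gauss(0.0, 1.0) for x in xs]

sxy = sum(x * y for x, y in zip(xs, ys)) / n  # (1/n) * sum of x_i * y_i
sxx = sum(x * x for x in xs) / n              # (1/n) * sum of x_i^2


def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


ols_theta = sxy / sxx                          # unpenalised estimate
lasso_theta = soft_threshold(sxy, lam) / sxx   # one-regressor LASSO estimate

print(ols_theta, lasso_theta)
```

The penalised estimate sits below the unpenalised one by `lam / sxx`, and that gap does not shrink as the sample grows while `lam` is held fixed.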

#### Bias from iterative estimation

We could iteratively estimate both \(\theta\) and \(g(\mathbf z_i)\).

For example, iteratively doing OLS for \(\theta\) and random forests for \(g(\mathbf z_i)\).

This would also introduce bias into \(\theta\).

## Asymptotic properties of statistics

### Asymptotic distributions

\(f(\hat \theta )\rightarrow^d G\)

Where \(G\) is some distribution.

### Asymptotic mean and variance

### Asymptotic normality

Many statistics are asymptotically normally distributed.

This is a result of the central limit theorem.

For example:

\(\sqrt n (S - s)\rightarrow^d N(0, \sigma^2)\)
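A quick simulation can make the central limit theorem concrete. The sketch below assumes the statistic is the sample mean of Exponential(1) draws, so the limit value is 1 and the standard deviation is 1. The sample size and number of replications are arbitrary choices. It checks that \(\sqrt n\) times the centred mean has mean near 0 and spread near 1.

```python
import math
import random
import statistics

# Assumed setup: sample mean of Exponential(1) draws, so mu = 1 and sigma = 1.
rng = random.Random(42)
n = 200               # size of each sample
replications = 2_000  # number of independent samples
mu = 1.0

scaled = []
for _ in range(replications):
    draws = [rng.expovariate(1.0) for _ in range(n)]
    # Centre the statistic at its limit and scale by sqrt(n).
    scaled.append(math.sqrt(n) * (statistics.fmean(draws) - mu))

scaled_mean = statistics.fmean(scaled)  # should be near 0
scaled_sd = statistics.stdev(scaled)    # should be near sigma = 1

print(scaled_mean, scaled_sd)
```

The underlying draws are strongly skewed, yet the centred and scaled mean behaves approximately like a normal variable with variance \(\sigma^2\).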

#### Confidence intervals for asymptotically normal statistics

We have the asymptotic mean and variance, and know the distribution. This allows us to calculate confidence intervals.
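As a hedged sketch of that calculation, take the sample mean as the statistic and estimate its standard error from the sample. The function name `normal_confidence_interval` and the data are assumptions for illustration, and the sample is far too small for the asymptotic approximation to be reliable. It is only there to show the arithmetic.

```python
import math
import statistics


def normal_confidence_interval(sample, level=0.95):
    """Normal-approximation confidence interval for the mean of the sample."""
    n = len(sample)
    estimate = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value from the standard normal distribution.
    critical = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    margin = critical * standard_error
    return estimate - margin, estimate + margin


sample = [2.0, 4.0, 4.0, 5.0, 10.0]  # illustrative data
lower, upper = normal_confidence_interval(sample)

print(lower, upper)
```

The interval is the estimate plus or minus the critical value times the standard error, so it narrows at rate \(1/\sqrt n\) as the sample grows.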