# Statistics

## Creating statistics

### Creating statistics

We take a sample from the distribution.

$$x=(x_1, x_2,...,x_n)$$

A statistic is a function on this sample.

$$S=S(x_1, x_2,...,x_n)$$
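As a concrete sketch (using NumPy, with an arbitrary normal distribution as the illustrative example), several different statistics can be computed from the same sample:

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw a sample x = (x_1, ..., x_n) from a distribution.
x = rng.normal(loc=2.0, scale=1.0, size=1_000)

# A statistic is any function of the sample.
sample_mean = x.mean()       # S = (1/n) * sum(x_i)
sample_max = x.max()         # S = max(x_i)
sample_var = x.var(ddof=1)   # unbiased sample variance
```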

## Moments of statistics

### Bias from single and joint estimation

#### Bias from single estimation

$$\mathbf x_i$$ and $$\mathbf z_i$$ are not independent, so we cannot estimate just $$y_i=\mathbf x_i\theta$$: the omitted term involving $$\mathbf z_i$$ is correlated with $$\mathbf x_i$$, so the estimate of $$\theta$$ would suffer from omitted-variable bias.

#### Bias from joint estimation

We could estimate our equation with a single ML algorithm.

$$y_i=f(\mathbf x_i, \theta) +g(\mathbf z_i) +\epsilon_i$$

For example, using LASSO.

However, this would introduce bias into our estimates for $$\theta$$: the regularisation that such methods rely on shrinks the coefficient estimates towards zero.
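A minimal NumPy sketch of where this bias comes from, assuming a single standardised regressor. In that special case the LASSO solution is (approximately) the soft-thresholded OLS estimate, and the penalty `lam` is a hypothetical tuning value chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n, theta = 10_000, 2.0

# Single standardised regressor: y_i = x_i * theta + eps_i
x = rng.normal(size=n)
y = theta * x + rng.normal(size=n)

# OLS estimate (approximately unbiased for theta).
theta_ols = (x @ y) / (x @ x)

# With one standardised regressor (x'x/n ~ 1), the LASSO solution
# is approximately the OLS estimate soft-thresholded by lam.
lam = 0.5
theta_lasso = np.sign(theta_ols) * max(abs(theta_ols) - lam, 0.0)

# The penalty shrinks the estimate towards zero: a systematic bias.
```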

#### Bias from iterative estimation

We could iteratively estimate both $$\theta$$ and $$g(\mathbf z_i)$$.

For example, iteratively doing OLS for $$\theta$$ and random forests for $$g(\mathbf z_i)$$.

This would also introduce bias into our estimate of $$\theta$$.
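A rough simulation of the iterative scheme. Everything here is an illustrative assumption: the chosen model, the parameter values, and the binned-means fit standing in for the random forest step:

```python
import numpy as np

rng = np.random.default_rng(2)
n, theta = 5_000, 1.0

# Partially linear model with x correlated with z:
# y_i = x_i * theta + g(z_i) + eps_i, with g(z) = sin(2z).
z = rng.uniform(-2, 2, size=n)
x = z + rng.normal(scale=0.5, size=n)   # x depends on z
y = theta * x + np.sin(2 * z) + rng.normal(scale=0.5, size=n)

def fit_g(z, resid, bins=50):
    """Crude nonparametric fit of g via binned means of the
    residuals (a stand-in for the random forest step)."""
    edges = np.linspace(z.min(), z.max(), bins + 1)
    idx = np.clip(np.digitize(z, edges) - 1, 0, bins - 1)
    means = np.array([resid[idx == b].mean() if (idx == b).any() else 0.0
                      for b in range(bins)])
    return means[idx]

theta_hat, g_hat = 0.0, np.zeros(n)
for _ in range(20):
    # OLS for theta on y with the current estimate of g removed...
    theta_hat = (x @ (y - g_hat)) / (x @ x)
    # ...then refit g on the remaining residuals.
    g_hat = fit_g(z, y - theta_hat * x)
```

Because $$\mathbf x_i$$ and $$\mathbf z_i$$ are correlated, each step partly absorbs the other's signal, so the procedure does not deliver an unbiased $$\hat\theta$$.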

## Asymptotic properties of statistics

### Asymptotic distributions

$$f(\hat \theta )\rightarrow^d G$$

Where $$G$$ is some limiting distribution.

### Asymptotic normality

Many statistics are asymptotically normally distributed.

This is a result of the central limit theorem.

For example, for a statistic $$S$$ with population value $$s$$:

$$\sqrt n (S-s)\rightarrow^d N(0, \sigma^2)$$
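A quick simulation of this, assuming the statistic is the sample mean of exponential draws (which have mean and variance both equal to 1, so $$s=1$$ and $$\sigma^2=1$$):

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 500, 10_000

# Sample repeatedly from a skewed (exponential) distribution.
samples = rng.exponential(scale=1.0, size=(reps, n))

# The statistic S is the sample mean; centre by s = 1 and scale by sqrt(n).
s_scaled = np.sqrt(n) * (samples.mean(axis=1) - 1.0)

# By the CLT, s_scaled is approximately N(0, 1) across replications,
# even though each underlying observation is far from normal.
```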

#### Confidence intervals for asymptotically normal statistics

We have the mean and variance, and know the limiting distribution. This allows us to calculate confidence intervals.
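For an asymptotically normal statistic such as the sample mean, a 95% interval is $$S\pm 1.96\,\hat\sigma/\sqrt n$$. A simulation sketch (all parameter values illustrative) checking that such intervals cover the true mean about 95% of the time:

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps, mu, sigma = 200, 5_000, 5.0, 2.0

# Many repeated samples; one confidence interval per sample.
x = rng.normal(mu, sigma, size=(reps, n))
means = x.mean(axis=1)
ses = x.std(axis=1, ddof=1) / np.sqrt(n)   # estimated standard errors

# 95% interval from the normal approximation: S +/- 1.96 * se.
lower, upper = means - 1.96 * ses, means + 1.96 * ses
coverage = np.mean((lower <= mu) & (mu <= upper))
# coverage is approximately 0.95
```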