# Non-parametric regression

## Kernel regression

### Kernel regression

#### Introduction

For parametric regression we have:

$$y=f(X)$$

Where the form of $$f(X)$$ is fixed in advance, as in linear regression, where $$f(X)=X\beta$$.

For non-parametric regression we have:

$$y=m(X)$$

Where the form of $$m(X)$$ is not fixed in advance.

We can estimate $$m(X)$$ using kernel regression. The Nadaraya-Watson estimator is:

$$\hat m(x)=\dfrac{\sum_{i=1}^nK_h(x-x_i)y_i}{\sum_{i=1}^nK_h(x-x_i)}$$

This follows from the definition of the conditional expectation:

$$E(y|X=x)=\int yf(y|x)dy=\int y\dfrac{f(x,y)}{f(x)}dy$$

Substituting kernel density estimates for both $$f(x,y)$$ and $$f(x)$$, and integrating out $$y$$, yields the estimator above.
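A minimal sketch of the Nadaraya-Watson estimator with a Gaussian kernel (the function name, bandwidth value, and test data are illustrative):

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_query, h=0.5):
    """Nadaraya-Watson kernel regression with a Gaussian kernel.

    h is the bandwidth; larger h gives a smoother estimate.
    """
    # Kernel weights K_h(x - x_i) for each (query, training) pair
    diffs = (x_query[:, None] - x_train[None, :]) / h
    weights = np.exp(-0.5 * diffs**2)  # Gaussian kernel (constants cancel in the ratio)
    # Weighted average of the y_i at each query point
    return (weights @ y_train) / weights.sum(axis=1)

rng = np.random.default_rng(0)
x = rng.uniform(0, 2 * np.pi, 200)
y = np.sin(x) + rng.normal(scale=0.2, size=200)

x_grid = np.linspace(0.5, 5.5, 50)
y_hat = nadaraya_watson(x, y, x_grid, h=0.3)  # should roughly track sin(x)
```

Note that the normalising constant of the kernel cancels between numerator and denominator, so only the kernel's shape and bandwidth matter.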

## Splines

### Multivariate Adaptive Regression Splines (MARS)

A linear model looks like:

$$\hat y =c+\sum_i x_i\theta_i$$

MARS instead builds a piecewise-linear model from hinge basis functions:

$$\hat y =c+\sum_j \theta_j B_j(x)$$

Where each basis function is a hinge function of one input variable $$x_i$$ with knot $$a_j$$:

• $$B_j(x)=\max(0, x_i-a_j)$$; or

• $$B_j(x)=\max(0, a_j-x_i)$$

(or a product of such hinges, which captures interactions).

This is trained in two stages: a forward pass that greedily adds pairs of hinge functions to reduce training error, and a backward pass that prunes terms using generalised cross-validation.
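A sketch of the key idea: once the knots are chosen, the hinge features are linear in the parameters, so each candidate model can be fitted by ordinary least squares. Here the single knot is fixed by hand rather than searched over, as full MARS would do:

```python
import numpy as np

def hinge_features(x, knots):
    """Build a MARS-style design matrix: a constant column plus the
    hinge pair max(0, x - a) and max(0, a - x) for each knot a."""
    cols = [np.ones_like(x)]
    for a in knots:
        cols.append(np.maximum(0.0, x - a))
        cols.append(np.maximum(0.0, a - x))
    return np.column_stack(cols)

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, 300)
y = np.abs(x) + rng.normal(scale=0.1, size=300)  # piecewise linear with a kink at 0

B = hinge_features(x, knots=[0.0])               # knot placed at the true kink
theta, *_ = np.linalg.lstsq(B, y, rcond=None)    # OLS on the hinge basis
y_hat = B @ theta
```

Because $$|x| = \max(0, x) + \max(0, -x)$$, this two-hinge model can represent the target exactly, so the fit should be limited only by the noise.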

## Other

### Quantile regression

Normally a regression model returns a central estimate of $$y$$ given $$X$$, commonly the conditional mean.

Quantile regression instead returns an estimate of the $$\tau$$th conditional quantile of $$y$$, for example the median ($$\tau=0.5$$) or the 90th percentile ($$\tau=0.9$$).

It is fitted by minimising the pinball (quantile) loss, which penalises under- and over-prediction asymmetrically.
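A rough sketch of linear quantile regression fitted by subgradient descent on the pinball loss (production solvers typically use linear programming instead; the learning rate and step count here are illustrative):

```python
import numpy as np

def fit_quantile(X, y, tau, lr=0.05, steps=2000):
    """Linear quantile regression via subgradient ascent on the
    (negated) pinball loss. Returns [intercept, slope, ...]."""
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    beta = np.zeros(Xb.shape[1])
    for _ in range(steps):
        r = y - Xb @ beta
        # Subgradient of the pinball loss w.r.t. the residual:
        # tau when r > 0, tau - 1 when r < 0
        g = np.where(r > 0, tau, tau - 1.0)
        beta += lr * Xb.T @ g / len(y)
    return beta

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 1000)
y = 2.0 * x + rng.normal(scale=1.0, size=1000)

beta_med = fit_quantile(x, y, tau=0.5)  # median regression: slope near 2, intercept near 0
```

With symmetric noise, the $$\tau=0.5$$ fit should recover the same line as mean regression; higher $$\tau$$ shifts the intercept upward toward the upper quantiles of the noise.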

### Principal component regression

Run PCA on $$X$$ and keep the first $$k$$ principal components.

Run OLS regressing $$y$$ on those component scores.

Map the fitted coefficients back to the original feature space using the PCA loadings.
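These three steps can be sketched with PCA computed via the SVD (function name and test data are illustrative; with $$k$$ equal to the number of features, PCR reduces to ordinary OLS):

```python
import numpy as np

def pcr_fit(X, y, k):
    """Principal component regression: PCA on X, OLS on the first k
    component scores, then map coefficients back to feature space."""
    X_mean = X.mean(axis=0)
    Xc = X - X_mean
    # PCA via SVD: rows of Vt are the principal directions (loadings)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T                 # (p, k) loading matrix
    scores = Xc @ V_k              # component scores
    gamma, *_ = np.linalg.lstsq(scores, y - y.mean(), rcond=None)
    beta = V_k @ gamma             # reverse the PCA rotation
    intercept = y.mean() - X_mean @ beta
    return intercept, beta

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 5))
beta_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ beta_true + rng.normal(scale=0.1, size=500)

intercept, beta = pcr_fit(X, y, k=5)  # k = p, so this recovers the OLS solution
```

Choosing $$k$$ smaller than the number of features trades some bias for lower variance, which is the usual motivation for PCR on correlated predictors.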

### Partial least squares regression

This extends principal component regression. Rather than choosing directions that explain variance in $$X$$ alone, both $$X$$ and $$Y$$ are mapped to new latent spaces, with directions chosen to maximise the covariance between the $$X$$ and $$Y$$ scores.
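For a single response (often called PLS1), the first latent direction has a closed form: the weight vector maximising $$\mathrm{Cov}(Xw, y)$$ is proportional to $$X^Ty$$. A minimal sketch of that first component (multi-component PLS would then deflate $$X$$ and repeat):

```python
import numpy as np

def pls1_direction(X, y):
    """First PLS component for a single response: the weight vector w
    maximising Cov(Xw, y), i.e. X^T y normalised to unit length."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = Xc.T @ yc              # direction of maximum covariance with y
    w /= np.linalg.norm(w)
    t = Xc @ w                 # X scores along the PLS direction
    return w, t

rng = np.random.default_rng(4)
X = rng.normal(size=(400, 3))
y = X @ np.array([2.0, 0.0, 0.0]) + rng.normal(scale=0.1, size=400)

w, t = pls1_direction(X, y)    # w should point almost entirely along the first feature
```

Unlike the first principal component of $$X$$, which ignores $$y$$, this direction is pulled toward whichever features actually predict the response.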