Semi-parametric regression

The Robinson estimator

Partially linear models

The Robinson estimator

Partialling out

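The partially linear model combines a linear part in \(\mathbf x_i\) with an unknown function \(g\) of \(\mathbf z_i\):

\(y_i=\mathbf x_i\theta +g(\mathbf z_i) +\epsilon_i\)

Take expectations conditional on \(\mathbf z_i\):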

\(E(y_i|\mathbf z_i)=E(\mathbf x_i\theta +g(\mathbf z_i) + \epsilon_i|\mathbf z_i)\)

\(E(y_i|\mathbf z_i)=E(\mathbf x_i\theta|\mathbf z_i)+E(g(\mathbf z_i)|\mathbf z_i) + E(\epsilon_i|\mathbf z_i)\)

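Since \(E(g(\mathbf z_i)|\mathbf z_i)=g(\mathbf z_i)\), and assuming \(E(\epsilon_i|\mathbf z_i)=0\):

\(E(y_i|\mathbf z_i)=E(\mathbf x_i|\mathbf z_i)\theta+g(\mathbf z_i)\)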

We can now remove the non-parametric part \(g(\mathbf z_i)\) by subtracting this conditional expectation from the model:

\(y_i-E(y_i|\mathbf z_i)=\mathbf x_i\theta +g(\mathbf z_i) + \epsilon_i - E(\mathbf x_i|\mathbf z_i)\theta -g(\mathbf z_i)\)

\(y_i-E(y_i|\mathbf z_i)=(\mathbf x_i- E(\mathbf x_i|\mathbf z_i))\theta +\epsilon_i\)

We define:

  • \(\bar y_i = y_i-E(y_i|\mathbf z_i)\)

  • \(\bar x_i = \mathbf x_i- E(\mathbf x_i|\mathbf z_i)\)

This gives a model that is linear in \(\theta\):

\(\bar y_i =\bar x_i \theta +\epsilon_i\)

Estimating \(\bar y_i\) and \(\bar x_i\)

So we can estimate \(\theta\) by OLS if we can estimate:

  • \(E(y_i|\mathbf z_i)\)

  • \(E(\mathbf x_i|\mathbf z_i)\)

We can do this with non-parametric methods, such as kernel regression of \(y_i\) and of \(\mathbf x_i\) on \(\mathbf z_i\).
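The two-step procedure above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes scalar \(x_i\) and \(z_i\), uses a Nadaraya-Watson estimator with a Gaussian kernel as one possible non-parametric method, and picks a bandwidth of 0.2 arbitrarily. The function names `nw_regress` and `robinson_estimate` and the simulated data are hypothetical.

```python
import numpy as np

def nw_regress(z, target, bandwidth):
    """Nadaraya-Watson estimate of E(target | z) at each sample point."""
    # Pairwise scaled distances between all sample points (n x n matrix).
    diffs = (z[:, None] - z[None, :]) / bandwidth
    # Gaussian kernel weights; the normalising constant cancels in the ratio.
    weights = np.exp(-0.5 * diffs ** 2)
    return weights @ target / weights.sum(axis=1)

def robinson_estimate(y, x, z, bandwidth=0.2):
    """Partial z out of y and x, then regress the residuals by OLS."""
    y_res = y - nw_regress(z, y, bandwidth)  # estimate of y_i - E(y_i | z_i)
    x_res = x - nw_regress(z, x, bandwidth)  # estimate of x_i - E(x_i | z_i)
    # OLS without intercept: both residual series are centred by construction.
    theta_hat = np.sum(x_res * y_res) / np.sum(x_res ** 2)
    return theta_hat, x_res, y_res

# Simulated data (an assumption for illustration): theta = 1.5, g(z) = z^2.
rng = np.random.default_rng(0)
n = 1000
z = rng.uniform(-2, 2, n)
x = np.sin(z) + rng.normal(0, 1, n)
y = 1.5 * x + z ** 2 + rng.normal(0, 0.5, n)

theta_hat, x_res, y_res = robinson_estimate(y, x, z)
print(f"theta_hat = {theta_hat:.3f}")
```

In practice the bandwidth would be chosen by cross-validation or a similar rule, and the estimate should land close to the true value of 1.5 used in the simulation.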

Bias and variance of the Robinson estimator

Moments of the Robinson estimator

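If the errors are IID then, for scalar \(x_i\) and writing \(\hat X_i\) for the estimate of \(E(x_i|\mathbf z_i)\):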

\(Var (\hat \theta) =\dfrac{\sigma^2_\epsilon }{\sum_i(x_i-\hat X_i)^2}\)

Otherwise, we can use GLS.
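As a rough sketch of the IID variance formula above, the snippet below computes \(\hat\theta\) and its estimated variance from already-residualised data. The function name `theta_and_variance` and the simulated residuals are hypothetical, and \(\sigma^2_\epsilon\) is replaced by its usual sample estimate, which is an assumption rather than something stated in these notes.

```python
import numpy as np

def theta_and_variance(x_res, y_res):
    """OLS slope of y_res on x_res (no intercept) and its IID-error variance."""
    sxx = np.sum(x_res ** 2)                 # sum_i (x_i - X_hat_i)^2
    theta_hat = np.sum(x_res * y_res) / sxx
    resid = y_res - theta_hat * x_res
    # Sample estimate of sigma^2_eps, with one degree of freedom used by theta.
    sigma2_hat = np.sum(resid ** 2) / (len(x_res) - 1)
    return theta_hat, sigma2_hat / sxx

# Hypothetical residualised data with theta = 2 and sigma_eps = 0.5.
rng = np.random.default_rng(1)
x_res = rng.normal(0, 1, 500)
y_res = 2.0 * x_res + rng.normal(0, 0.5, 500)

theta_hat, var_hat = theta_and_variance(x_res, y_res)
std_err = np.sqrt(var_hat)
print(f"theta_hat = {theta_hat:.3f}, std_err = {std_err:.4f}")
```

This ignores the extra uncertainty from estimating the conditional expectations in the first stage, so it should be read as an approximation.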

What are the properties of the estimator?

\(E[\hat \theta ]=E\left[\dfrac{\sum_i (x_i-\hat X_i)(y_i-\hat y_i)}{\sum_i(x_i-\hat X_i)^2}\right]\)
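
A sketch of the next step, under assumptions not stated above: that \(\hat y_i\) and \(\hat X_i\) are close to \(E(y_i|\mathbf z_i)\) and \(E(x_i|\mathbf z_i)\), and that \(E(\epsilon_i|x_i,\mathbf z_i)=0\). Substituting \(y_i-\hat y_i\approx (x_i-\hat X_i)\theta+\epsilon_i\) from the partialled-out model into the estimator:

\(\hat \theta \approx \theta + \dfrac{\sum_i (x_i-\hat X_i)\epsilon_i}{\sum_i(x_i-\hat X_i)^2}\)

Conditional on the \(x_i\) and \(\mathbf z_i\), the second term has mean zero, so:

\(E[\hat \theta ]\approx \theta\)

The approximation is only as good as the first-stage estimates of the conditional expectations.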

Causal trees