# Semi-parametric regression

## The Robinson estimator

### The Robinson estimator

#### Partialling out

$$y_i=\mathbf x_i\theta +g(\mathbf z_i) +\epsilon_i$$

Consider:

$$E(y_i|\mathbf z_i)=E(\mathbf x_i\theta +g(\mathbf z_i) + \epsilon_i|\mathbf z_i)$$

$$E(y_i|\mathbf z_i)=E(\mathbf x_i\theta|\mathbf z_i)+E(g(\mathbf z_i)|\mathbf z_i) + E(\epsilon_i|\mathbf z_i)$$

Assuming $$E(\epsilon_i|\mathbf z_i)=0$$:

$$E(y_i|\mathbf z_i)=E(\mathbf x_i|\mathbf z_i)\theta+g(\mathbf z_i)$$

We can now remove the parametric part:

$$y_i-E(y_i|\mathbf z_i)=\mathbf x_i\theta +g(\mathbf z_i) + \epsilon_i - E(\mathbf x_i|\mathbf z_i)\theta -g(\mathbf z_i)$$

$$y_i-E(y_i|\mathbf z_i)=(\mathbf x_i- E(\mathbf x_i|\mathbf z_i))\theta +\epsilon_i$$

We define:

• $$\bar y_i = y_i-E(y_i|\mathbf z_i)$$

• $$\bar x_i = \mathbf x_i- E(\mathbf x_i|\mathbf z_i)$$

$$\bar y_i =\bar x_i \theta +\epsilon_i$$

#### Estimating $$\bar y_i$$ and $$\bar x_i$$

So we can estimate $$\theta$$ by OLS, provided we can estimate:

• $$E(y_i|\mathbf z_i)$$

• $$E(\mathbf x_i|\mathbf z_i)$$

We can do this with non-parametric methods.
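For instance, here is a minimal sketch of the full partialling-out procedure, using a Nadaraya-Watson kernel regression as the non-parametric estimator (the data-generating process, the bandwidth, and the true $$\theta=2$$ are illustrative assumptions, not from any source):

```python
import numpy as np

# Simulated data: y = theta*x + g(z) + eps, with x confounded by z (assumed DGP)
rng = np.random.default_rng(0)
n = 500
z = rng.uniform(-2, 2, n)
x = np.sin(z) + rng.normal(0, 0.5, n)          # x depends on z
theta = 2.0
y = theta * x + z**2 + rng.normal(0, 0.5, n)   # g(z) = z^2

def nw(z_train, t_train, z_eval, h=0.2):
    """Nadaraya-Watson kernel estimate of E[t|z] at the points z_eval."""
    d = (z_eval[:, None] - z_train[None, :]) / h
    w = np.exp(-0.5 * d**2)
    return (w @ t_train) / w.sum(axis=1)

# Partial z out of both y and x, then run OLS on the residuals
y_bar = y - nw(z, y, z)   # estimate of y_i - E(y_i | z_i)
x_bar = x - nw(z, x, z)   # estimate of x_i - E(x_i | z_i)
theta_hat = (x_bar @ y_bar) / (x_bar @ x_bar)
print(theta_hat)  # close to the true theta = 2
```

The two kernel regressions here stand in for any non-parametric (or ML) estimator of the conditional expectations.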

### Bias and variance of the Robinson estimator

Robinson: can't have confounding in a dummy treatment, but can in a real-valued one. Is this a general result from the propensity-score literature?

Framing: partialling out is an alternative to OLS where $$n<<p$$ doesn't hold, and an alternative to LASSO etc.

$$\hat \theta \approx N(\theta, V/n)$$

$$V=(E[\hat D^2])^{-1}E[\hat D^2\epsilon^2 ](E[\hat D^2])^{-1}$$

These are robust standard errors.
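For a scalar $$\theta$$ the sandwich can be computed directly. A sketch on simulated heteroskedastic data, where $$\hat D$$ plays the role of the partialled-out regressor (the DGP is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000
D = rng.normal(0, 1, n)                    # partialled-out regressor (D-hat)
eps = rng.normal(0, 1 + 0.5 * np.abs(D))   # heteroskedastic errors
y = 2.0 * D + eps

theta_hat = (D @ y) / (D @ D)              # OLS on the residualised data
resid = y - theta_hat * D
J = np.mean(D**2)                          # estimate of E[D^2]
V = np.mean(D**2 * resid**2) / J**2        # sandwich: J^{-1} E[D^2 eps^2] J^{-1}
se = np.sqrt(V / n)                        # robust standard error of theta-hat
```

Because the middle term keeps $$\epsilon^2$$ inside the expectation, the standard error stays valid under heteroskedasticity.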

#### Moments of the Robinson estimator

If the errors are IID, then

$$Var (\hat \theta) =\dfrac{\sigma^2_\epsilon }{\sum_i(x_i-\hat x_i)^2}$$

Otherwise, we can use GLS.

What are the properties of the estimator?

$$E[\hat \theta ]=E\left[\dfrac{\sum_i (x_i-\hat x_i)(y_i-\hat y_i)}{\sum_i(x_i-\hat x_i)^2}\right]$$
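Substituting $$\bar y_i = \bar x_i\theta + \epsilon_i$$ from above (and treating the nuisance estimates as given, so that $$y_i-\hat y_i=\bar y_i$$ and $$x_i-\hat x_i=\bar x_i$$) gives one more step:

$$E[\hat \theta ]=E\left[\dfrac{\sum_i \bar x_i(\bar x_i\theta + \epsilon_i)}{\sum_i \bar x_i^2}\right]=\theta + E\left[\dfrac{\sum_i \bar x_i\epsilon_i}{\sum_i \bar x_i^2}\right]$$

so the estimator is unbiased when the second term vanishes, e.g. when $$E[\epsilon_i|\bar x_i]=0$$.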

### Non-linear treatment effects in the Robinson estimator

Page on reformulating as non-linear: it can be done; show it can be estimated using an arg min (https://arxiv.org/pdf/1712.04912.pdf).

### DML

In DML:

• page on orthogonality scores

• page on constructing them

• page on using them to estimate parameters (GMM)

We have:

$$P(X)=f(\theta , \rho)$$

$$\hat \theta = f(X, n)$$

$$\theta = g(\rho , X)$$

So the error is:

$$\hat \theta - \theta=f(X, n)-g(\rho , X)$$

Bias is defined as:

$$Bias(\hat \theta, \theta ) = E[\hat \theta - \theta]=E[\hat \theta ] - \theta$$

$$Bias = E[\hat \theta - \theta]=E[f(X, n)-g(\rho , X)]$$

$$Bias = E[\hat \theta - \theta]=E[f(X, n)]-g(\rho ,X)$$

Double ML: regress each parametric parameter's variable on ML fits of the other variables. E.g. get $$E(x|z)$$ and $$E(d|x)$$. With $$d=m(x)+v$$: $$d$$ is correlated with $$x$$, so there is bias; $$v$$ is correlated with $$d$$ but not with $$x$$, so use it as an "IV". We still need an estimate for $$g(x)$$.
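A rough sketch of that residual-as-instrument idea, with kernel fits standing in for the ML estimates of $$m(x)$$ and $$E(y|x)$$ (the DGP, bandwidth, and true effect of 0.8 are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 800
x = rng.uniform(-2, 2, n)
d = np.cos(x) + rng.normal(0, 0.5, n)          # d = m(x) + v
y = 0.8 * d + x**2 + rng.normal(0, 0.5, n)     # true effect of d is 0.8

def nw(x_train, t_train, x_eval, h=0.25):
    """Nadaraya-Watson kernel estimate of E[t|x] at the points x_eval."""
    w = np.exp(-0.5 * ((x_eval[:, None] - x_train[None, :]) / h) ** 2)
    return (w @ t_train) / w.sum(axis=1)

v = d - nw(x, d, x)                            # residual "instrument"
# IV-style moment: E[v * (y - E(y|x))] = theta * E[v * d]
theta_hat = (v @ (y - nw(x, y, x))) / (v @ d)
```

Since $$v$$ is (approximately) orthogonal to any function of $$x$$, the $$g(x)$$ term drops out of the moment condition, which is what makes the residual usable as an instrument.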

For the iterative approach, the process is:

• estimate $$g(x)$$

• plug it into the other equation and estimate $$\theta$$

• this section should be in sample splitting; rename to iterative estimation; separate pages for bias and variance

• how does this work?? The paper says random forest regression and OLS. Initialise $$\theta$$ randomly?

• page on bias, variance, efficiency?

• page on sample splitting, and why?

• page on goal: $$x$$ and $$z$$ orthogonal for split sampling

• page on $$X=m_0(Z)+\mu$$, first-stage machine learning, synthetic instrumental variables? h3 on that for multiple variables of interest: a regression for each

### DML1

Divide the sample into $$k$$ folds.

For each fold, estimate the nuisance functions with ML (how???), using all instances outside the fold.

Then do GMM, using the orthogonality condition to calculate $$\theta$$ (how??), using the instances in the fold.

Average the $$\theta$$ estimates across the folds.
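The steps above, sketched end-to-end with a kernel regression as a stand-in for the ML nuisance estimator and the Robinson moment condition as the GMM step (all specifics are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 600, 3
z = rng.uniform(-2, 2, n)
d = np.sin(z) + rng.normal(0, 0.5, n)          # treatment, confounded by z
y = 1.5 * d + z**2 + rng.normal(0, 0.5, n)     # true theta = 1.5

def nw(z_train, t_train, z_eval, h=0.25):
    """Nadaraya-Watson kernel estimate of E[t|z] at the points z_eval."""
    w = np.exp(-0.5 * ((z_eval[:, None] - z_train[None, :]) / h) ** 2)
    return (w @ t_train) / w.sum(axis=1)

# DML1: fit nuisances out-of-fold, estimate theta in-fold, then average
folds = np.array_split(rng.permutation(n), k)
thetas = []
for fold in folds:
    out = np.setdiff1d(np.arange(n), fold)     # all instances outside the fold
    y_res = y[fold] - nw(z[out], y[out], z[fold])
    d_res = d[fold] - nw(z[out], d[out], z[fold])
    thetas.append((d_res @ y_res) / (d_res @ d_res))  # in-fold moment condition
theta_hat = np.mean(thetas)
```

Fitting the nuisances out-of-fold and estimating $$\theta$$ in-fold is the sample-splitting step; averaging the per-fold estimates is what distinguishes DML1 from DML2 (which pools the moment conditions before solving).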

### Last stage Robinson

Separate page for the last stage: note we can do OLS, GLS etc with a choice of $$\Omega$$.