# Instrumental Variables and the Generalized Method of Moments

## Parameter estimation for simultaneous equations

### Parameter identification problem with simultaneous equations

#### Identification terminology

A system is under-identified if there are not enough estimators for all structural parameters.

A system is exactly identified if there are the same number of estimators as structural parameters.

A system is over-identified if there are more estimators than structural parameters.

In general we have in our structural form:

$$\sum_{i=1}^n\beta_{ij}y_i=\sum_{i=1}^m\gamma_{ij}x_i+\epsilon_j$$

This is a system with $$n$$ endogenous variables and $$m$$ exogenous variables.

We can write this in matrix form.

$$B\mathbf y =\Gamma \mathbf{x} + \mathbf{\epsilon}$$

Provided $$B$$ is invertible, we can solve for $$\mathbf{y}$$ to get:

$$\mathbf{y} =B^{-1}\Gamma \mathbf{x} + B^{-1}\mathbf{ \epsilon}$$

We estimate by placing restrictions on $$\Gamma$$.
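As a minimal sketch of the matrix reduced form $$\mathbf{y} =B^{-1}\Gamma \mathbf{x} + B^{-1}\mathbf{\epsilon}$$, the snippet below computes $$B^{-1}\Gamma$$ for a two-equation system and checks that the resulting $$\mathbf{y}$$ satisfies the structural equations. The numerical values of `B` and `Gamma`, and the name `Pi` for the reduced-form coefficient matrix, are assumptions made up for illustration.

```python
import numpy as np

# Hypothetical structural coefficients for B y = Gamma x + eps.
B = np.array([[1.0, -0.5],
              [1.0, 1.5]])
Gamma = np.array([[2.0, 0.0],
                  [10.0, 1.0]])

# Reduced-form coefficients Pi = B^{-1} Gamma (solve avoids forming the inverse).
Pi = np.linalg.solve(B, Gamma)

# Draw arbitrary exogenous values and errors, then build y from the reduced form.
rng = np.random.default_rng(0)
x = rng.normal(size=2)
eps = rng.normal(size=2)
y = Pi @ x + np.linalg.solve(B, eps)

# The structural equations should hold exactly (up to floating point error).
residual = B @ y - (Gamma @ x + eps)
```

Note that a zero in `Gamma` is an example of the kind of restriction mentioned above: it says an exogenous variable is excluded from one equation.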

#### Structural models

If our data generating process is:

$$Q=\alpha + \beta P +\epsilon$$

We can estimate $$\alpha$$ and $$\beta$$ by measuring $$P$$ and $$Q$$.

If, however, the data generating process involves simultaneous equations, we can have:

$$Q=\alpha_1 + \beta_1 P + \epsilon_1$$

$$Q=\alpha_2 + \beta_2 P + \epsilon_2$$

#### Reduced form

We can reduce this by equating the two expressions for $$Q$$ and solving for $$P$$:

$$\alpha_1 + \beta_1 P + \epsilon_1 =\alpha_2 + \beta_2 P + \epsilon_2$$

$$(\alpha_1 -\alpha_2 )+ (\beta_1 -\beta_2 )P + (\epsilon_1 -\epsilon_2 )=0$$

$$P =\dfrac{\alpha_2-\alpha_1 }{\beta_1-\beta_2}+\dfrac{\epsilon_2-\epsilon_1 }{\beta_1-\beta_2}$$

We can rewrite this as a constant $$\pi_1$$ plus a composite error $$\tau_1$$:

$$P=\pi_1 + \tau_1$$

Similarly we can reduce for $$Q$$:

$$Q =\dfrac{\alpha_2\beta_1-\alpha_1\beta_2 }{\beta_1-\beta_2}+\dfrac{\beta_1\epsilon_2 -\beta_2\epsilon_1}{\beta_1-\beta_2}$$

$$Q= \pi_2 + \tau_2$$

#### We can’t directly estimate structural models

If $$P$$ is correlated with $$\epsilon_1$$ or $$\epsilon_2$$ then our estimates for $$\beta_1$$ and $$\beta_2$$ will be biased.

This also affects $$Q$$.

From the reduced form we can see that $$P$$ depends on both $$\epsilon_1$$ and $$\epsilon_2$$, so it will be correlated with both error terms due to simultaneity.
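The bias can be seen in a small simulation. This is a sketch under assumed parameter values (the numbers for $$\alpha_1, \beta_1, \alpha_2, \beta_2$$ and the unit-variance normal errors are made up for illustration): it generates $$P$$ and $$Q$$ from the reduced form and then regresses $$Q$$ on $$P$$ by OLS.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical structural parameters for the two equations.
alpha1, beta1 = 10.0, -1.0
alpha2, beta2 = 2.0, 1.0

# Independent structural errors (an assumption of this sketch).
eps1 = rng.normal(size=n)
eps2 = rng.normal(size=n)

# Reduced form for P, then Q from the first structural equation.
P = (alpha2 - alpha1) / (beta1 - beta2) + (eps2 - eps1) / (beta1 - beta2)
Q = alpha1 + beta1 * P + eps1

# P is correlated with the structural error eps1.
cov_P_eps1 = np.cov(P, eps1)[0, 1]

# OLS slope of Q on P: sample covariance over sample variance.
ols_slope = np.cov(P, Q)[0, 1] / np.var(P, ddof=1)
```

With equal error variances the OLS slope in this setup lands near zero, which is neither $$\beta_1$$ nor $$\beta_2$$: the regression recovers a mixture of the two equations rather than either one.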

#### The identification problem

We can estimate $$\pi_1$$ and $$\pi_2$$, but this does not allow us to identify any of the structural parameters.

We have $$2$$ estimators, but $$4$$ parameters.

This is the identification problem.

## 2 Stage OLS

### 2 Stage OLS (2SOLS) estimator

#### Motivation

If $$x$$ is correlated with the error term the OLS estimate will be biased.

#### 2 Stage OLS - first stage

We have

$$y_i=x_i \theta + \epsilon_i$$

$$x_i=z_i \rho +\mu_i$$

We do OLS on the second equation to get $$\hat \rho$$.

$$\hat \rho =(Z^TZ)^{-1}Z^TX$$

We use this to get predicted values of $$X$$.

$$\hat X=Z\hat \rho =Z(Z^TZ)^{-1}Z^TX = P_ZX$$

#### 2 Stage OLS - second stage

We then regress $$y$$ on the estimated $$X$$:

$$y_i=\hat x_i\theta +\epsilon_i$$

Our estimate is then:

$$\hat {\theta_{2SOLS}} = (\hat X^T\hat X)^{-1}\hat X^Ty$$

$$\hat {\theta_{2SOLS}} = ((P_ZX)^TP_ZX)^{-1}(P_ZX)^Ty$$

Since $$P_Z$$ is symmetric and idempotent, $$P_Z^TP_Z=P_Z$$, so:

$$\hat {\theta_{2SOLS}} = (X^TP_ZX)^{-1}X^TP_Zy$$

If $$Z$$ has the same number of columns as $$X$$, and $$Z^TX$$ is invertible, this collapses to:

$$\hat {\theta_{2SOLS}} = (Z^TX)^{-1}Z^Ty$$
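The two stages can be sketched in NumPy as follows. The function name `two_stage_ols` and the simulated data (one regressor, one instrument, an unobserved confounder `u`, true $$\theta = 2$$) are assumptions for illustration, not part of the derivation above.

```python
import numpy as np

def two_stage_ols(y, X, Z):
    """Return (X' P_Z X)^{-1} X' P_Z y without forming the n-by-n matrix P_Z."""
    # First stage: regress X on Z and form fitted values X_hat = Z rho_hat.
    rho_hat = np.linalg.solve(Z.T @ Z, Z.T @ X)
    X_hat = Z @ rho_hat
    # Second stage: regress y on X_hat.
    return np.linalg.solve(X_hat.T @ X_hat, X_hat.T @ y)

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical data: u makes x endogenous, z is a valid instrument.
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 0.8 * z + u + rng.normal(size=n)
y = 2.0 * x - 1.5 * u + rng.normal(size=n)

X = x[:, None]
Z = z[:, None]

theta_2s = two_stage_ols(y, X, Z)
theta_iv = np.linalg.solve(Z.T @ X, Z.T @ y)   # exactly identified shortcut
theta_ols = np.linalg.solve(X.T @ X, X.T @ y)  # biased by the confounder
```

Because there is one instrument for one regressor here, `theta_2s` and `theta_iv` agree up to floating point error, while `theta_ols` is pulled away from the true value.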

## More

### Identification through exogenous variables

Previously our structural model was:

$$Q=\alpha_1 + \beta_1 P + \epsilon_1$$

$$Q=\alpha_2 + \beta_2 P + \epsilon_2$$

And our reduced form:

$$P =\dfrac{\alpha_2-\alpha_1 }{\beta_1-\beta_2}+\dfrac{\epsilon_2-\epsilon_1 }{\beta_1-\beta_2}$$

$$Q =\dfrac{\alpha_2\beta_1-\alpha_1\beta_2 }{\beta_1-\beta_2}+\dfrac{\beta_1\epsilon_2 -\beta_2\epsilon_1}{\beta_1-\beta_2}$$

Or:

$$P=\pi_1 + \tau_1$$

$$Q= \pi_2 + \tau_2$$

This time we add another measured variable, $$I$$.

$$Q=\alpha_1 + \beta_1 P + \theta_1 I + \epsilon_1$$

$$Q=\alpha_2 + \beta_2 P + \theta_2 I + \epsilon_2$$

The reduced form is now:

$$P =\dfrac{\alpha_2 -\alpha_1 }{\beta_1-\beta_2}+\dfrac{\theta_2-\theta_1 }{\beta_1-\beta_2}I+\dfrac{\epsilon_2-\epsilon_1}{\beta_1-\beta_2}$$

$$Q =\dfrac{\alpha_2\beta_1-\alpha_1\beta_2 }{\beta_1-\beta_2}+\dfrac{\theta_2\beta_1-\theta_1\beta_2}{\beta_1-\beta_2}I+\dfrac{\beta_1\epsilon_2 -\beta_2\epsilon_1}{\beta_1-\beta_2}$$

Or:

$$P =\pi_{11} +\pi_{12}I + \tau_1$$

$$Q= \pi_{21} +\pi_{22}I + \tau_2$$

Previously we could only estimate $$\pi_1$$ and $$\pi_2$$, as $$\hat \pi_1$$ and $$\hat \pi_2$$ respectively.

We can now create estimators $$\hat \pi_{11}$$, $$\hat \pi_{12}$$, $$\hat \pi_{21}$$ and $$\hat \pi_{22}$$.

#### Identification with an exogenous variable

We now have $$4$$ estimators and $$6$$ parameters, meaning that we still cannot identify the model.

#### Partial identification

Can we use $$\hat \pi$$ to identify any of the structural parameters?

We know that:

• $$\pi_{11} =\dfrac{\alpha_2 -\alpha_1 }{\beta_1-\beta_2}$$

• $$\pi_{12} =\dfrac{\theta_2-\theta_1}{\beta_1-\beta_2}$$

• $$\pi_{21} =\dfrac{\alpha_2\beta_1-\alpha_1\beta_2}{\beta_1-\beta_2}$$

• $$\pi_{22} =\dfrac{\theta_2\beta_1-\theta_1\beta_2}{\beta_1-\beta_2}$$

If the exogenous variable only enters the second equation, so $$\theta_1=0$$, we have:

• $$\pi_{11} =\dfrac{\alpha_2 -\alpha_1 }{\beta_1-\beta_2}$$

• $$\pi_{12} =\dfrac{\theta_2}{\beta_1-\beta_2}$$

• $$\pi_{21} =\dfrac{\alpha_2\beta_1-\alpha_1\beta_2}{\beta_1-\beta_2}$$

• $$\pi_{22} =\dfrac{\theta_2\beta_1}{\beta_1-\beta_2}$$

So we can see that:

$$\hat \beta_1 = \dfrac{\hat \pi_{22}}{\hat \pi_{12}}$$

Substituting $$\beta_1 = \pi_{22}/\pi_{12}$$ into the denominators, we now have:

• $$\pi_{11} =\dfrac{\pi_{12}(\alpha_2 -\alpha_1 )}{\pi_{22}-\pi_{12}\beta_2}$$

• $$\pi_{12} =\dfrac{\pi_{12}\theta_2}{\pi_{22}-\pi_{12}\beta_2}$$

• $$\pi_{21} =\dfrac{\pi_{12}(\alpha_2\beta_1-\alpha_1\beta_2)}{\pi_{22}-\pi_{12}\beta_2}$$

• $$\pi_{22} =\dfrac{\pi_{12}\theta_2\beta_1}{\pi_{22}-\pi_{12}\beta_2}$$

We can use this to also identify $$\alpha_1$$, since $$\pi_{21}-\beta_1\pi_{11}=\alpha_1$$.
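A simulation sketch of this partial identification, assuming $$\theta_1=0$$ and made-up parameter values: it estimates the reduced form by OLS of $$P$$ and $$Q$$ on a constant and $$I$$, then recovers $$\beta_1$$ as $$\hat \pi_{22}/\hat \pi_{12}$$ and $$\alpha_1$$ as $$\hat \pi_{21}-\hat \beta_1 \hat \pi_{11}$$. The variable names and the normal distributions are assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical structural parameters; theta1 = 0, so I enters only equation 2.
alpha1, beta1 = 10.0, -1.0
alpha2, beta2, theta2 = 2.0, 1.0, 1.5

I = rng.normal(size=n)
eps1 = rng.normal(size=n)
eps2 = rng.normal(size=n)

# Reduced form for P with theta1 = 0, then Q from the first structural equation.
d = beta1 - beta2
P = (alpha2 - alpha1) / d + (theta2 / d) * I + (eps2 - eps1) / d
Q = alpha1 + beta1 * P + eps1

# OLS of P and Q on a constant and I gives the reduced-form estimates.
W = np.column_stack([np.ones(n), I])
pi11, pi12 = np.linalg.lstsq(W, P, rcond=None)[0]
pi21, pi22 = np.linalg.lstsq(W, Q, rcond=None)[0]

# Recover the structural parameters of equation 1 from the reduced form.
beta1_hat = pi22 / pi12
alpha1_hat = pi21 - beta1_hat * pi11
```

The parameters of the second equation are not recovered here: $$\alpha_2$$, $$\beta_2$$ and $$\theta_2$$ remain unidentified without a variable that shifts only the first equation.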

#### Complete identification

If we have a separate exogenous variable for each of the two equations, we can fully identify the model.

We will have $$6$$ estimators and $$6$$ parameters.

We are estimating:

$$Q=\alpha_1 + \beta_1 P + \theta_1 I + \epsilon_1$$

$$Q=\alpha_2 + \beta_2 P + \theta_2 J + \epsilon_2$$

$$I$$ and $$J$$ are essentially instrumental variables for the model.

$$I$$ is an instrumental variable for demand shocks, and $$J$$ is an instrumental variable for supply shocks.
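As a sketch of complete identification, the simulation below recovers all six structural parameters by using $$J$$ as the instrument for $$P$$ in the first equation and $$I$$ as the instrument for $$P$$ in the second, via the exactly identified formula $$(Z^TX)^{-1}Z^Ty$$. The parameter values, distributions and variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical structural parameters.
alpha1, beta1, theta1 = 10.0, -1.0, 2.0
alpha2, beta2, theta2 = 2.0, 1.0, 1.5

I = rng.normal(size=n)
J = rng.normal(size=n)
eps1 = rng.normal(size=n)
eps2 = rng.normal(size=n)

# Equate the two structural equations and solve for P, then get Q from equation 1.
d = beta1 - beta2
P = ((alpha2 - alpha1) - theta1 * I + theta2 * J + (eps2 - eps1)) / d
Q = alpha1 + beta1 * P + theta1 * I + eps1

ones = np.ones(n)

# Equation 1: regressors (1, P, I); J replaces P in the instrument matrix.
X1 = np.column_stack([ones, P, I])
Z1 = np.column_stack([ones, J, I])
eq1_hat = np.linalg.solve(Z1.T @ X1, Z1.T @ Q)

# Equation 2: regressors (1, P, J); I replaces P in the instrument matrix.
X2 = np.column_stack([ones, P, J])
Z2 = np.column_stack([ones, I, J])
eq2_hat = np.linalg.solve(Z2.T @ X2, Z2.T @ Q)
```

Each equation is identified by the variable excluded from it: $$J$$ moves $$P$$ without entering the first equation, and $$I$$ moves $$P$$ without entering the second.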

## The Instrumental Variable (IV) estimator

### Instrumental Variable (IV) estimator

$$\hat {\theta_{IV}} = (Z^TX)^{-1}Z^Ty$$

2SOLS collapses to IV when there are as many instruments as regressors.

### Bias of the IV estimator

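The IV estimator converges to the actual parameter so long as $$\epsilon$$ is uncorrelated with $$Z$$ and $$Z$$ is correlated with $$X$$. In finite samples it is generally biased.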

### Variance of the IV estimator

$$\hat {\theta_{OLS}} = (X^TX)^{-1}X^Ty$$

$$Var [\hat {\theta_{OLS}}]=(X^TX)^{-1}X^T\Omega X(X^TX)^{-1}$$

With IV we have

$$\hat {\theta_{IV}} = (Z^TX)^{-1}Z^Ty$$

$$Var [\hat {\theta_{IV}}]=(Z^TX)^{-1}Z^T\Omega Z(X^TZ)^{-1}$$

We can use weighted least squares for $$\Omega$$.
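The sandwich form $$(Z^TX)^{-1}Z^T\Omega Z(X^TZ)^{-1}$$ can be computed directly when $$\Omega$$ is diagonal. The sketch below is illustrative only: the helper name `iv_sandwich`, the simulated data, and the choice of a known homoskedastic $$\Omega$$ with unit variance are all assumptions. In practice $$\Omega$$ would have to be estimated.

```python
import numpy as np

def iv_sandwich(X, Z, omega_diag):
    """Return (Z'X)^{-1} Z' Omega Z (X'Z)^{-1} for a diagonal Omega."""
    ZtX_inv = np.linalg.inv(Z.T @ X)
    # Scale each row of Z by its error variance instead of building Omega.
    meat = Z.T @ (omega_diag[:, None] * Z)
    # (X'Z)^{-1} is the transpose of (Z'X)^{-1}.
    return ZtX_inv @ meat @ ZtX_inv.T

rng = np.random.default_rng(0)
n = 2_000

# Hypothetical regressor and instrument, each with a constant column.
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])

# Assumed error variances: homoskedastic with unit variance.
omega_diag = np.full(n, 1.0)

var_iv = iv_sandwich(X, Z, omega_diag)
var_ols = iv_sandwich(X, X, omega_diag)  # setting Z = X gives the OLS sandwich
```

Setting $$Z=X$$ recovers the OLS expression, and under these assumptions the IV variance of the slope is larger than the OLS variance, which is the usual price of instrumenting.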