Instrumental Variables and the Generalized Method of Moments


Bias of OLS estimator from omitted variables

Bias of OLS estimator from measurement error

Parameter estimation for simultaneous equations

Structural and reduced forms

Parameter identification problem with simultaneous equations

Identification terminology

A system is under-identified if there are not enough estimators for all structural parameters.

A system is exactly identified if there are the same number of estimators as structural parameters.

A system is over-identified if there are more estimators than structural parameters.

In general, equation \(i\) of our structural form is:

\(\sum_{j=1}^{n} b_{ij} y_j = \sum_{k=1}^{m} \gamma_{ik} x_k + \epsilon_i\)


This is a system with \(n\) endogenous variables and \(m\) exogenous variables.

We can write this in matrix form.

\(B\mathbf y =\Gamma \mathbf{x} + \mathbf{\epsilon}\)

Provided \(B\) is invertible, we can solve for \(\mathbf{y}\) to get the reduced form:

\(\mathbf{y} =B^{-1}\Gamma \mathbf{x} + B^{-1}\mathbf{ \epsilon}\)

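We identify the structural parameters by placing restrictions on \(\Gamma\), for example by excluding an exogenous variable from one of the equations.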
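The step from the structural form to the reduced form can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrices `B` and `Gamma` and the vectors `x` and `eps` are hypothetical values chosen only for illustration. The zeros in `Gamma` play the role of exclusion restrictions.

```python
import numpy as np

# Hypothetical structural matrices: n = 2 endogenous, m = 3 exogenous variables.
B = np.array([[1.0, -0.5],
              [1.0,  2.0]])
Gamma = np.array([[3.0, 1.0, 0.0],
                  [1.0, 0.0, 4.0]])

# Reduced-form coefficients Pi = B^{-1} Gamma (solve avoids forming the inverse).
Pi = np.linalg.solve(B, Gamma)

# One illustrative draw of exogenous variables and structural errors.
x = np.array([1.0, 2.0, -1.0])
eps = np.array([0.1, -0.2])

# Reduced form: y = B^{-1} Gamma x + B^{-1} eps
y = Pi @ x + np.linalg.solve(B, eps)

# The same y must satisfy the structural form B y = Gamma x + eps.
structural_gap = B @ y - (Gamma @ x + eps)
print(Pi)
print(np.allclose(structural_gap, 0.0))
```

Only `Pi` can be estimated directly from data on \(\mathbf{x}\) and \(\mathbf{y}\); recovering \(B\) and \(\Gamma\) from it is the identification question discussed in this section.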

Structural models

If our data generating process is:

\(Q=\alpha + \beta P +\epsilon \)

We can estimate \(\alpha \) and \(\beta \) by measuring \(P\) and \(Q\).

If, however, the data generating process involves simultaneous equations, we can have:

\(Q=\alpha_1 + \beta_1 P + \epsilon_1 \)

\(Q=\alpha_2 + \beta_2 P + \epsilon_2 \)

Reduced form

We can reduce this:

\(\alpha_1 + \beta_1 P + \epsilon_1 =\alpha_2 + \beta_2 P + \epsilon_2 \)

\((\alpha_1 -\alpha_2 )+ (\beta_1 -\beta_2 )P + (\epsilon_1 -\epsilon_2 )=0\)

\(P =\dfrac{\alpha_2-\alpha_1 }{\beta_1-\beta_2}+\dfrac{\epsilon_2-\epsilon_1 }{\beta_1-\beta_2}\)

We can rewrite this as:

\(P=\pi_1 + \tau_1 \)

Similarly we can reduce for \(Q\):

\(Q =\dfrac{\alpha_2\beta_1-\alpha_1\beta_2 }{\beta_1-\beta_2}+\dfrac{\beta_1\epsilon_2 -\beta_2\epsilon_1}{\beta_1-\beta_2}\)

\(Q= \pi_2 + \tau_2\)

We can’t directly estimate structural models

If \(P\) is correlated with \(\epsilon_1\) or \(\epsilon_2\) then our estimates for \(\beta_1\) and \(\beta_2\) will be biased.

This also affects \(Q\).

From the reduced forms we can see that \(P\) depends on both \(\epsilon_1\) and \(\epsilon_2\), so it is correlated with the error terms due to simultaneity.
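A small simulation can make the simultaneity bias concrete. This is a sketch under assumed values: the structural parameters, the unit-variance normal errors and the sample size are all hypothetical, and NumPy is assumed to be available. \(P\) and \(Q\) are generated from the reduced forms above, and \(Q\) is then regressed on \(P\) by OLS.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical structural parameters (equation 1 slopes down, equation 2 slopes up).
alpha1, beta1 = 10.0, -1.0
alpha2, beta2 = 2.0, 1.0

eps1 = rng.normal(size=n)
eps2 = rng.normal(size=n)

# Reduced forms for P and Q.
denom = beta1 - beta2
P = (alpha2 - alpha1) / denom + (eps2 - eps1) / denom
Q = (alpha2 * beta1 - alpha1 * beta2) / denom + (beta1 * eps2 - beta2 * eps1) / denom

# P is correlated with both structural errors.
corr_P_eps1 = np.corrcoef(P, eps1)[0, 1]
corr_P_eps2 = np.corrcoef(P, eps2)[0, 1]

# OLS slope from regressing Q on P.
ols_slope = np.cov(P, Q)[0, 1] / np.var(P, ddof=1)

print(corr_P_eps1, corr_P_eps2, ols_slope)
```

With equal error variances, as assumed here, the OLS slope should land near zero, which matches neither \(\beta_1\) nor \(\beta_2\).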

The identification problem

We can estimate \(\pi_1 \) and \(\pi_2\), but this does not allow us to identify any of the structural parameters.

We have \(2\) estimators, but \(4\) parameters.

This is the identification problem.
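To illustrate, the sketch below maps structural parameters to \((\pi_1, \pi_2)\) using the reduced-form expressions above. The two parameter sets are hypothetical and were chosen so that both pairs of lines cross at the same point; the function name `reduced_form` is an illustrative choice.

```python
def reduced_form(alpha1, beta1, alpha2, beta2):
    """Return (pi_1, pi_2) implied by the two structural equations."""
    denom = beta1 - beta2
    pi_1 = (alpha2 - alpha1) / denom
    pi_2 = (alpha2 * beta1 - alpha1 * beta2) / denom
    return pi_1, pi_2

# Two hypothetical, different sets of structural parameters.
model_a = reduced_form(alpha1=10.0, beta1=-1.0, alpha2=2.0, beta2=1.0)
model_b = reduced_form(alpha1=14.0, beta1=-2.0, alpha2=4.0, beta2=0.5)

print(model_a)  # (4.0, 6.0)
print(model_b)  # (4.0, 6.0)
```

Both models imply the same reduced form, so knowing \(\pi_1\) and \(\pi_2\) exactly still cannot tell them apart.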

2 Stage OLS

2 Stage OLS (2SOLS) estimator


If \(x\) is correlated with the error term the OLS estimate will be biased.

2 Stage OLS - first stage

We have

\(y_i=x_i \theta + \epsilon_i \)

\(x_i=z_i \rho +\mu_i\)

We run OLS on the second equation to get \(\hat \rho \).

\(\hat \rho =(Z^TZ)^{-1}Z^TX\)

We use this to get predicted values of \(X\).

\(\hat X=Z\hat \rho =Z(Z^TZ)^{-1}Z^TX = P_ZX\)

Here \(P_Z=Z(Z^TZ)^{-1}Z^T\) is the projection onto the column space of \(Z\).

2 Stage OLS - second stage

We then regress \(y\) on the estimated \(X\):

\(y_i=\hat x_i\theta +\epsilon_i\)

Our estimator is then:

\(\hat \theta_{2SOLS} = (\hat X^T\hat X)^{-1}\hat X^Ty\)

\(\hat \theta_{2SOLS} = ((P_ZX)^TP_ZX)^{-1}(P_ZX)^Ty\)

Since \(P_Z\) is symmetric and idempotent, this simplifies to:

\(\hat \theta_{2SOLS} = (X^TP_ZX)^{-1}X^TP_Zy\)

If \(Z\) has the same number of columns as \(X\), and \(Z^TX\) is invertible, this collapses to:

\(\hat {\theta_{2SOLS}} = (Z^TX)^{-1}Z^Ty\)
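The two stages can be sketched in a few lines. This is an illustrative simulation, not a reference implementation: the data generating process, the coefficient values and the variable names are assumptions, and NumPy is assumed to be available. There is one endogenous regressor and one instrument, plus a constant in both \(X\) and \(Z\), so the system is exactly identified.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Simulated data: x is endogenous because its error u also enters eps.
z = rng.normal(size=n)
u = rng.normal(size=n)
eps = 0.8 * u + rng.normal(size=n)
x = 0.5 + 1.0 * z + u
y = 1.0 + 2.0 * x + eps          # true theta = (1, 2)

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])

# First stage: rho_hat = (Z'Z)^{-1} Z'X, then X_hat = Z rho_hat = P_Z X.
rho_hat = np.linalg.solve(Z.T @ Z, Z.T @ X)
X_hat = Z @ rho_hat

# Second stage: regress y on X_hat.
theta_2sols = np.linalg.solve(X_hat.T @ X_hat, X_hat.T @ y)

# Closed form (X' P_Z X)^{-1} X' P_Z y, using X' P_Z = X_hat'.
theta_closed = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)

# Exactly identified case: (Z'X)^{-1} Z'y.
theta_iv = np.linalg.solve(Z.T @ X, Z.T @ y)

# Plain OLS for comparison.
theta_ols = np.linalg.solve(X.T @ X, X.T @ y)

print(theta_2sols, theta_closed, theta_iv, theta_ols)
```

The first three estimates should agree up to floating-point error, while the OLS slope is pulled away from the true value by the correlation between \(x\) and the error term. The \(n \times n\) matrix \(P_Z\) is never formed explicitly, which keeps memory use modest.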

Bias of the 2SOLS estimator

Variance of the 2SOLS estimator


Identification through exogenous variables

Previously our structural model was:

\(Q=\alpha_1 + \beta_1 P + \epsilon_1 \)

\(Q=\alpha_2 + \beta_2 P + \epsilon_2 \)

And our reduced form:

\(P =\dfrac{\alpha_2-\alpha_1 }{\beta_1-\beta_2}+\dfrac{\epsilon_2-\epsilon_1 }{\beta_1-\beta_2}\)

\(Q =\dfrac{\alpha_2\beta_1-\alpha_1\beta_2 }{\beta_1-\beta_2}+\dfrac{\beta_1\epsilon_2 -\beta_2\epsilon_1}{\beta_1-\beta_2}\)


\(P=\pi_1 + \tau_1 \)

\(Q= \pi_2 + \tau_2\)

Adding another variable

This time we add another measured variable, \(I\).

\(Q=\alpha_1 + \beta_1 P + \theta_1 I + \epsilon_1 \)

\(Q=\alpha_2 + \beta_2 P + \theta_2 I + \epsilon_2 \)

The reduced form is now:

\(P =\dfrac{\alpha_2 -\alpha_1 }{\beta_1-\beta_2}+\dfrac{\theta_2-\theta_1 }{\beta_1-\beta_2}I+\dfrac{\epsilon_2-\epsilon_1}{\beta_1-\beta_2}\)

\(Q =\dfrac{\alpha_2\beta_1-\alpha_1\beta_2 }{\beta_1-\beta_2}+\dfrac{\theta_2\beta_1-\theta_1\beta_2}{\beta_1-\beta_2}I+\dfrac{\beta_1\epsilon_2 -\beta_2\epsilon_1}{\beta_1-\beta_2}\)


\(P =\pi_{11} +\pi_{12}I + \tau_1 \)

\(Q= \pi_{21} +\pi_{22}I + \tau_2 \)

Previously we could only estimate \(\pi_1 \) and \(\pi_2 \).

By regressing \(P\) and \(Q\) on \(I\) we can now create four estimators: \(\hat \pi_{11}\), \(\hat \pi_{12}\), \(\hat \pi_{21}\) and \(\hat \pi_{22}\).

Identification with an exogenous variable

We now have \(4\) estimators and \(6\) parameters, meaning that we still cannot identify the model.

Partial identification

Can we use \(\hat \pi \) to identify any of the structural parameters?

We know that:

  • \(\pi_{11} =\dfrac{\alpha_2 -\alpha_1 }{\beta_1-\beta_2}\)

  • \(\pi_{12} =\dfrac{\theta_2-\theta_1}{\beta_1-\beta_2}\)

  • \(\pi_{21} =\dfrac{\alpha_2\beta_1-\alpha_1\beta_2}{\beta_1-\beta_2}\)

  • \(\pi_{22} =\dfrac{\theta_2\beta_1-\theta_1\beta_2}{\beta_1-\beta_2} \)

If the exogenous variable enters only the second equation, so \(\theta_1=0\), we have:

  • \(\pi_{11} =\dfrac{\alpha_2 -\alpha_1 }{\beta_1-\beta_2}\)

  • \(\pi_{12} =\dfrac{\theta_2}{\beta_1-\beta_2}\)

  • \(\pi_{21} =\dfrac{\alpha_2\beta_1-\alpha_1\beta_2}{\beta_1-\beta_2}\)

  • \(\pi_{22} =\dfrac{\theta_2\beta_1}{\beta_1-\beta_2} \)

So we can see that:

\(\hat \beta_1 = \dfrac{\hat \pi_{22}}{\hat \pi_{12}}\)

This means we now have:

  • \(\pi_{11} =\dfrac{\pi_{12}(\alpha_2 -\alpha_1 )}{\pi_{22}-\pi_{12}\beta_2}\)

  • \(\pi_{12} =\dfrac{\pi_{12}\theta_2}{\pi_{22}-\pi_{12}\beta_2}\)

  • \(\pi_{21} =\dfrac{\pi_{12}(\alpha_2\beta_1-\alpha_1\beta_2)}{\pi_{22}-\pi_{12}\beta_2}\)

  • \(\pi_{22} =\dfrac{\pi_{12}\theta_2\beta_1}{\pi_{22}-\pi_{12}\beta_2}\)

We can use this to also identify \(\alpha_1\), since \(\pi_{21}-\beta_1\pi_{11}=\alpha_1\).
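The partial identification argument can be tried out on simulated data. The sketch below assumes NumPy, hypothetical structural parameters with \(\theta_1=0\), and normal errors. It estimates the four reduced-form coefficients by OLS and then backs out \(\beta_1\) as \(\hat \pi_{22}/\hat \pi_{12}\) and \(\alpha_1\) as \(\hat \pi_{21}-\hat \beta_1\hat \pi_{11}\).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Hypothetical structural parameters; theta1 = 0, so I enters only equation 2.
alpha1, beta1 = 10.0, -1.0
alpha2, beta2, theta2 = 2.0, 1.0, -2.0

I = 2.0 * rng.normal(size=n)
eps1 = rng.normal(size=n)
eps2 = rng.normal(size=n)

# Solve the two structural equations for P, then get Q from equation 1.
P = ((alpha2 - alpha1) + theta2 * I + (eps2 - eps1)) / (beta1 - beta2)
Q = alpha1 + beta1 * P + eps1

# Reduced-form regressions of P and Q on a constant and I.
W = np.column_stack([np.ones(n), I])
pi_11, pi_12 = np.linalg.lstsq(W, P, rcond=None)[0]
pi_21, pi_22 = np.linalg.lstsq(W, Q, rcond=None)[0]

# Back out the parameters of equation 1 from the reduced form.
beta1_hat = pi_22 / pi_12
alpha1_hat = pi_21 - beta1_hat * pi_11

print(beta1_hat, alpha1_hat)
```

The estimates should be close to the assumed \(\beta_1=-1\) and \(\alpha_1=10\). The parameters of the second equation are not recovered by this procedure.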

Complete identification

If we have independent variables for each of the two equations, we can fully identify the model.

We will have \(6\) estimators and \(6\) parameters.

We are estimating:

\(Q=\alpha_1 + \beta_1 P + \theta_1 I + \epsilon_1 \)

\(Q=\alpha_2 + \beta_2 P + \theta_2 J + \epsilon_2 \)

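\(I\) and \(J\) are essentially instrumental variables for the model.

\(I\) shifts only the first equation, so it identifies \(\beta_2\); \(J\) shifts only the second equation, so it identifies \(\beta_1\). If the first equation is demand and the second is supply, \(I\) is a demand shifter and \(J\) is a supply shifter.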
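The one-to-one mapping between the six reduced-form coefficients and the six structural parameters can be sketched without any data. The reduced-form expressions in the code are derived by equating the two structural equations, in the same way as earlier in this section; the function names and the numerical values are hypothetical.

```python
import math

def reduced_from_structural(a1, b1, t1, a2, b2, t2):
    """Reduced-form coefficients of P and Q on (1, I, J)."""
    d = b1 - b2
    pi_p = ((a2 - a1) / d, -t1 / d, t2 / d)
    pi_q = ((a2 * b1 - a1 * b2) / d, -t1 * b2 / d, t2 * b1 / d)
    return pi_p, pi_q

def structural_from_reduced(pi_p, pi_q):
    """Invert the mapping; needs nonzero coefficients on I and J in pi_p."""
    b2 = pi_q[1] / pi_p[1]          # I is excluded from equation 2
    b1 = pi_q[2] / pi_p[2]          # J is excluded from equation 1
    a1 = pi_q[0] - b1 * pi_p[0]
    t1 = pi_q[1] - b1 * pi_p[1]
    a2 = pi_q[0] - b2 * pi_p[0]
    t2 = pi_q[2] - b2 * pi_p[2]
    return a1, b1, t1, a2, b2, t2

# Hypothetical structural parameters (alpha1, beta1, theta1, alpha2, beta2, theta2).
true_params = (10.0, -1.0, 0.5, 2.0, 1.0, -2.0)
pi_p, pi_q = reduced_from_structural(*true_params)
recovered = structural_from_reduced(pi_p, pi_q)

print(recovered)
print(all(math.isclose(r, t) for r, t in zip(recovered, true_params)))
```

In practice the reduced-form coefficients would be replaced by their estimates, and the recovery fails if either \(I\) or \(J\) has no effect on \(P\).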

Power of instruments

The Instrumental Variable (IV) estimator

Instrumental Variable (IV) estimator

\(\hat {\theta_{IV}} = (Z^TX)^{-1}Z^Ty\)

2SOLS collapses to IV when \(Z\) has the same number of columns as \(X\).

Bias of the IV estimator

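The IV estimator converges to the true parameter so long as \(\epsilon \) is uncorrelated with \(Z\) and \(Z\) is correlated with \(X\). It is consistent, but generally biased in finite samples.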

Variance of the IV estimator

In OLS we had:

\(\hat {\theta_{OLS}} = (X^TX)^{-1}X^Ty\)

\(Var [\hat {\theta_{OLS}}]=(X^TX)^{-1}X^T\Omega X(X^TX)^{-1}\)

With IV we have

\(\hat {\theta_{IV}} = (Z^TX)^{-1}Z^Ty\)

\(Var [\hat {\theta_{IV}}]=(Z^TX)^{-1}Z^T\Omega Z(X^TZ)^{-1}\)

We can use weighted least squares for \(\Omega \).
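A plug-in version of the IV variance can be sketched as follows. The choice of \(\hat \Omega \) here is an assumption rather than something stated in the text: \(\Omega \) is taken to be diagonal and estimated by the squared IV residuals, a heteroskedasticity-robust choice. The right-hand factor of the sandwich is the transpose of \((Z^TX)^{-1}\). The function name `iv_with_variance` and the simulated data are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def iv_with_variance(y, X, Z):
    """Exactly identified IV estimate with a sandwich variance estimate.

    Assumption (not from the text): Omega is diagonal and is estimated
    by the squared IV residuals.
    """
    ZtX = Z.T @ X
    theta = np.linalg.solve(ZtX, Z.T @ y)
    resid = y - X @ theta
    meat = Z.T @ (Z * resid[:, None] ** 2)   # Z' Omega_hat Z
    bread = np.linalg.inv(ZtX)               # (Z'X)^{-1}
    cov = bread @ meat @ bread.T             # bread.T equals (X'Z)^{-1}
    return theta, cov

# Illustrative data with one endogenous regressor and one instrument.
rng = np.random.default_rng(0)
n = 5_000
z = rng.normal(size=n)
u = rng.normal(size=n)
eps = 0.8 * u + rng.normal(size=n)
x = 0.5 + 1.0 * z + u
y = 1.0 + 2.0 * x + eps

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])

theta_iv, cov_iv = iv_with_variance(y, X, Z)
se_iv = np.sqrt(np.diag(cov_iv))
print(theta_iv, se_iv)
```

Passing `Z = X` reduces the estimate to OLS and the variance to the OLS sandwich form given above.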

Choosing instrumental variables

Double selection