The Generalised Method of Moments (GMM)

Generalised Method of Moments (GMM)

Difference from Method Of Moments (MOM)

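GMM allows more moment conditions than parameters (over-identification), whereas MOM uses exactly as many conditions as parameters.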
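As an illustrative example (an assumption for exposition, not taken from these notes), let \(y\) be exponentially distributed with mean \(\theta \). Both of the following hold at the true parameter:

\(E[y-\theta ]=0\)

\(E[y^2-2\theta^2]=0\)

That is two moment conditions for one parameter. MOM would pick one condition and solve it exactly. In a finite sample the two sample conditions generally cannot both be set to zero, so GMM instead makes them jointly as close to zero as possible.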

Generalised Method of Moments (GMM)

We have a function of the outcome \(y\) and a parameter \(\theta \):

\(g(y, \theta )\)

A moment condition is that the expectation of such a function is \(0\).

\(m(\theta )=E[g(y, \theta )]=0\)

To do GMM, we estimate this expectation with the sample average:

\(\hat m(\theta )=\dfrac{1}{n}\sum_ig(y_i, \theta )\)
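A minimal sketch of the sample moment in plain Python. The exponential example (two conditions, \(E[y]=\theta \) and \(E[y^2]=2\theta^2\), for one parameter) and the names `g` and `sample_moments` are assumptions made for illustration; they do not come from these notes.

```python
import random

def g(y, theta):
    # Illustrative moment function: exponential data with mean theta.
    # Two conditions for one parameter: E[y] = theta and E[y^2] = 2 * theta^2.
    return (y - theta, y * y - 2.0 * theta * theta)

def sample_moments(ys, theta):
    # m_hat(theta) = (1/n) * sum_i g(y_i, theta)
    n = len(ys)
    gs = [g(y, theta) for y in ys]
    return (sum(a for a, _ in gs) / n, sum(b for _, b in gs) / n)

random.seed(0)
# expovariate takes the rate, so rate 0.5 gives mean theta_0 = 2.
ys = [random.expovariate(0.5) for _ in range(5000)]

m_at_truth = sample_moments(ys, 2.0)  # both entries should be near zero
m_off = sample_moments(ys, 3.0)       # clearly away from zero
print(m_at_truth, m_off)
```

Evaluated at the true parameter, the sample moments are close to zero but not exactly zero; away from it they are not. That gap is what the GMM norm measures.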

We define:

\(\Omega = E[g(y, \theta )g(y, \theta)^T]\)

\(G=E[\nabla_\theta g(y, \theta)]\)

And then minimise the norm:

\(||\hat m(\theta )||^2_W=\hat m(\theta )^TW\hat m(\theta )\)

Here \(W\) is a positive definite weighting matrix that defines the norm.

\(W=\Omega ^{-1}\) is the most efficient choice, but it is unknown because it depends on \(\theta \).

We can estimate it if the observations are IID:

\(\hat W(\hat \theta )= \left(\dfrac{1}{n}\sum_i g(y_i, \hat \theta)g(y_i, \hat \theta)^T\right)^{-1}\)

Two-step feasible GMM

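First, estimate \(\theta \) using \(\mathbf W=\mathbf I\).

This first-step estimate is consistent, but not efficient.

Second, use the first-step estimate to compute \(\hat W(\hat \theta )\), then minimise again with \(\hat W\) as the weighting matrix.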
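The two-step procedure can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than a reference implementation: the data are simulated exponential draws with mean \(\theta_0=2\), the moment function `g` is the hypothetical two-condition example \(E[y]=\theta \), \(E[y^2]=2\theta^2\), and the helper names (`m_hat`, `objective`, `weight_matrix`, `minimise`) are invented here. A golden-section search stands in for a general optimiser because the parameter is scalar.

```python
import random

def g(y, theta):
    # Illustrative moment function: exponential data with mean theta.
    return (y - theta, y * y - 2.0 * theta * theta)

def m_hat(ys, theta):
    # Sample moment vector (1/n) * sum_i g(y_i, theta).
    n = len(ys)
    gs = [g(y, theta) for y in ys]
    return (sum(a for a, _ in gs) / n, sum(b for _, b in gs) / n)

def objective(ys, theta, W):
    # Quadratic form m_hat' W m_hat for a 2x2 weighting matrix W.
    m0, m1 = m_hat(ys, theta)
    return (W[0][0] * m0 * m0 + (W[0][1] + W[1][0]) * m0 * m1
            + W[1][1] * m1 * m1)

def weight_matrix(ys, theta):
    # Inverse of Omega_hat = (1/n) * sum_i g_i g_i', via the 2x2 inverse formula.
    n = len(ys)
    gs = [g(y, theta) for y in ys]
    a = sum(u * u for u, _ in gs) / n
    b = sum(u * v for u, v in gs) / n
    d = sum(v * v for _, v in gs) / n
    det = a * d - b * b
    return [[d / det, -b / det], [-b / det, a / det]]

def minimise(f, lo, hi, tol=1e-8):
    # Golden-section search; assumes f is unimodal on [lo, hi].
    r = (5 ** 0.5 - 1) / 2
    while hi - lo > tol:
        c, d = hi - r * (hi - lo), lo + r * (hi - lo)
        if f(c) < f(d):
            hi = d
        else:
            lo = c
    return (lo + hi) / 2

random.seed(0)
ys = [random.expovariate(0.5) for _ in range(5000)]  # true mean theta_0 = 2

identity = [[1.0, 0.0], [0.0, 1.0]]
# Step 1: identity weighting gives a consistent but inefficient estimate.
theta1 = minimise(lambda t: objective(ys, t, identity), 0.1, 10.0)
# Step 2: re-estimate with the weighting matrix evaluated at the step-1 estimate.
W_hat = weight_matrix(ys, theta1)
theta2 = minimise(lambda t: objective(ys, t, W_hat), 0.1, 10.0)
print(theta1, theta2)
```

Both estimates should land near 2. The second step changes only the weighting, not the moment conditions, so the gain is in efficiency rather than consistency.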

Moment conditions








For maximum likelihood, the moment condition is that the expected score is zero:

\(E[\nabla_\theta \ln f(x, \theta)]=0\)


\(m(\theta_0)=E[g(\mathbf x_i, \theta_0)]=0\)

We replace this with the sample moment:

\(\hat m(\theta)=\frac{1}{n}\sum_ig(\mathbf x_i, \theta)\)

We have the Jacobian of the moment function with respect to \(\theta \):

\(\nabla_\theta g(\mathbf x_i, \theta_0)\)


\(G=E[\nabla_\theta g(\mathbf x_i, \theta_0)]\)

Variance-covariance matrix of the moment function

\(\Omega =E[g(\mathbf x_i, \theta_0)g(\mathbf x_i, \theta_0)^T]\)

We want to minimise the weighted norm of the sample moments:

\(||\hat m(\theta)||^2_W=\hat m(\theta )^TW\hat m(\theta)\)

\(\hat \theta = \arg\min_\theta \left(\frac{1}{n}\sum_ig(\mathbf x_i, \theta)\right)^T\hat W\left(\frac{1}{n}\sum_ig(\mathbf x_i, \theta)\right)\)


By the central limit theorem, the GMM estimator is asymptotically normal.

It is consistent if the moment condition holds at the true parameter.

There is an explicit formula for the asymptotic variance:

\(\sqrt n (\hat \theta -\theta_0)\rightarrow^d N[0, (G^TWG)^{-1}G^TW\Omega W^TG(G^TW^TG)^{-1}]\)

If we choose \(W\propto \Omega^{-1}\) then:

\(\sqrt n (\hat \theta -\theta_0)\rightarrow^d N[0, (G^T\Omega^{-1} G)^{-1}]\)
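To see where the simplification comes from, substitute \(W=c\,\Omega^{-1}\) for a scalar \(c>0\) into the general variance. Since \(\Omega \) is symmetric, \(W^T=W\), and:

\((G^TWG)^{-1}G^TW\Omega W^TG(G^TW^TG)^{-1}=\dfrac{1}{c}(G^T\Omega^{-1}G)^{-1}\; c^2\, G^T\Omega^{-1}\Omega \Omega^{-1}G\; \dfrac{1}{c}(G^T\Omega^{-1}G)^{-1}\)

\(=(G^T\Omega^{-1}G)^{-1}(G^T\Omega^{-1}G)(G^T\Omega^{-1}G)^{-1}=(G^T\Omega^{-1}G)^{-1}\)

The constant \(c\) cancels, which is why only proportionality to \(\Omega^{-1}\) matters.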

Problem: we need to estimate \(\Omega \) and \(G\).

\(\Omega \): estimate from the sample. This lets us choose the weighting matrix, but the variance of the estimator still requires an estimate of \(G\).

Applying the above to OLS is where robust (sandwich) standard errors come from.
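To make the OLS connection concrete, here is a minimal sketch in plain Python. It treats OLS as GMM with moment function \(g_i(\beta )=\mathbf z_i(y_i-\mathbf z_i^T\beta )\), where \(\mathbf z_i=(1, x_i)\). The simulated heteroskedastic data, the true coefficients (1 and 2) and all helper names are assumptions made for illustration. With \(G=-E[\mathbf z\mathbf z^T]\) and \(\Omega =E[e^2\mathbf z\mathbf z^T]\), the GMM variance formula reduces to the heteroskedasticity-robust sandwich, often labelled HC0.

```python
import random

random.seed(1)
n = 2000
xs = [random.uniform(0.0, 4.0) for _ in range(n)]
# Heteroskedastic errors: the noise scale grows with x (an assumption of this demo).
ys = [1.0 + 2.0 * x + random.gauss(0.0, 0.5 + x) for x in xs]

def inv2(a):
    # Inverse of a 2x2 matrix given as nested lists.
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def matmul2(a, b):
    # Product of two 2x2 matrices.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Regressor vectors z_i = (1, x_i).
zs = [(1.0, x) for x in xs]
Szz = [[sum(z[i] * z[j] for z in zs) / n for j in range(2)] for i in range(2)]
Szy = [sum(z[i] * y for z, y in zip(zs, ys)) / n for i in range(2)]

# Exactly identified: solving m_hat(beta) = 0 gives the usual OLS formula.
Szz_inv = inv2(Szz)
beta = [sum(Szz_inv[i][j] * Szy[j] for j in range(2)) for i in range(2)]

# Omega_hat = (1/n) * sum_i e_i^2 z_i z_i'; the sign of G cancels in the sandwich.
resid = [y - beta[0] - beta[1] * x for x, y in zip(xs, ys)]
Omega = [[sum(e * e * z[i] * z[j] for z, e in zip(zs, resid)) / n
          for j in range(2)] for i in range(2)]
avar = matmul2(matmul2(Szz_inv, Omega), Szz_inv)  # G^{-1} Omega G^{-T}
robust_se = [(avar[i][i] / n) ** 0.5 for i in range(2)]
print(beta, robust_se)
```

No weighting matrix appears because there are as many moment conditions as coefficients.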

If the model is exactly identified, i.e. the number of moment conditions equals the number of parameters, then \(W\) doesn’t matter. This is the ordinary Method of Moments.
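To check this against the variance formula: with as many moment conditions as parameters, \(G\) is square. Assuming it is also invertible, the weighting matrix drops out:

\((G^TWG)^{-1}G^TW\Omega W^TG(G^TW^TG)^{-1}=G^{-1}W^{-1}(G^T)^{-1}G^TW\Omega W^TG\,G^{-1}(W^T)^{-1}(G^T)^{-1}=G^{-1}\Omega (G^T)^{-1}\)

Equivalently, the sample moment conditions can typically be solved exactly, \(\hat m(\hat \theta )=0\), which minimises the norm for any positive definite \(W\).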

Estimating the weighting matrix

Iterated GMM

Moment-covariance matrix

Bias and variance of the GMM estimator

Open question for the page on bias and variance of the GMM estimator: should the cluster assumption be part of the moment condition, or does it enter later, in the calculation of the weighting matrix?

Robust, HAC and clustered variance estimators can be obtained as part of GMM too.