Homogeneous treatment effects


Treatment data


For multilevel data with fixed coefficients, we have:

\(y_{ij}=\mathbf x_{ij}\theta +m_j + \epsilon_{ij}\)

We can estimate \(m_j\) using fixed effects or similar methods.

Treatment data

If the data is grouped by whether an entity was treated, then each entity \(i\) has two potential outcomes:

  • \(y_{i0}\) - the outcome if the entity was not treated

  • \(y_{i1}\) - the outcome if the entity was treated

However, we only observe \(y_i\) and \(D_i\), where \(D_i\) indicates whether entity \(i\) was treated and \(y_i=D_iy_{i1}+(1-D_i)y_{i0}\).


Average Treatment Effects (ATE, ATET, ATEUT)

Average Treatment Effect (ATE)

\(E[y_{i1}-y_{i0}]\)

Average Treatment Effect on the Treated (ATET)

\(E[y_{i1}-y_{i0}|D_i=1]\)

Average Treatment Effect on the Untreated (ATEUT)

\(E[y_{i1}-y_{i0}|D_i=0]\)

Conditional Average Treatment Effect (CATE)

\(E[y_{i1}-y_{i0}|\mathbf x_i]\)

Exogenous treatment

Randomized Controlled Trials (RCTs)

Suppose the model is:

\(y_i=D_i\theta +g(X) +\epsilon_i\)

If \(D\) is randomly assigned, it is independent of \(X\), so we can estimate

\(y_i=D_i\theta +\epsilon_i\)

to get an estimate for \(\theta \) without collecting data on \(X\). The error term now absorbs \(g(X)\).
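The random-assignment argument can be checked with a small simulation. This is a minimal sketch under stated assumptions: the choice \(g(x)=x^2\), the true effect of 2.0, the sample size and the function name `simulate_rct` are all hypothetical, and an intercept is added to the regression, which the model above does not write out. With an intercept, regressing \(y\) on \(D\) alone is the same as taking the difference in group means.

```python
import random
import statistics

def simulate_rct(n=20000, theta=2.0, seed=0):
    """Simulate y = D*theta + g(X) + eps with D assigned by a fair coin flip."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)        # covariate we pretend not to observe
        d = rng.random() < 0.5         # random assignment, independent of x
        # g(x) = x**2 is an arbitrary illustrative choice
        y = theta * d + x ** 2 + rng.gauss(0.0, 1.0)
        (treated if d else control).append(y)
    return treated, control

treated, control = simulate_rct()

# OLS of y on D with an intercept equals the difference in group means.
theta_hat = statistics.fmean(treated) - statistics.fmean(control)
print(f"estimated theta: {theta_hat:.3f} (true value used in the simulation: 2.0)")
```

The estimate should land close to the simulated effect even though \(X\) is never used in the estimation step.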

Calculating CATEs in RCTs with interaction terms

Calculating CATEs in RCTs with subgroup analysis
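As a rough illustration of subgroup analysis, the sketch below splits a simulated RCT by a binary covariate and takes the difference in means within each subgroup. Everything in the data-generating process is an assumption made for the example: the binary covariate, the effects of 1.0 and 3.0, and the function name `subgroup_cates`. With a single binary covariate, the gap between the two subgroup estimates equals the interaction coefficient in a regression of \(y\) on \(D\), \(x\) and \(D \times x\).

```python
import random
import statistics

def subgroup_cates(n=40000, seed=1):
    """Estimate E[y1 - y0 | x] for x in {0, 1} by within-subgroup mean differences."""
    rng = random.Random(seed)
    # outcomes[(x, d)] collects y for subgroup x and treatment status d
    outcomes = {(x, d): [] for x in (0, 1) for d in (0, 1)}
    for _ in range(n):
        x = int(rng.random() < 0.5)          # binary covariate defining the subgroup
        d = int(rng.random() < 0.5)          # randomized treatment
        effect = 1.0 if x == 0 else 3.0      # hypothetical heterogeneous effect
        y = effect * d + 0.5 * x + rng.gauss(0.0, 1.0)
        outcomes[(x, d)].append(y)
    return {
        x: statistics.fmean(outcomes[(x, 1)]) - statistics.fmean(outcomes[(x, 0)])
        for x in (0, 1)
    }

cates = subgroup_cates()
# Equivalent to the D*x interaction coefficient in the saturated regression.
interaction = cates[1] - cates[0]
print(cates, interaction)
```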

Calculating treatment effects without estimating missing data


We can simply regress the outcome on the covariates \(X\) and the treatment indicator.

This assumes that treatment effects are constant.

It also assumes that the potential outcomes \(y_{i1}\) and \(y_{i0}\) are independent of \(D_i\), conditional on \(X\).

If variables that affect both treatment and outcome are missing from \(X\), the estimates will be biased.

It further assumes that the effects of \(X\) are linear.

We assume: \(E[y_{i0}|\mathbf x_{i}, D_i]=\mathbf x_i \theta\).
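The role of the conditional independence assumption can be seen in a small simulation where treatment depends on a covariate. This is a sketch under assumptions, not a definitive implementation: the data-generating process (one covariate, a constant effect of 2.0, a linear covariate effect of 1.5) and the helper `ols` are hypothetical. It compares the raw difference in means with the coefficient on \(D\) from a regression that includes the covariate.

```python
import random
import statistics

def ols(rows, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

rng = random.Random(2)
rows, y = [], []
for _ in range(5000):
    x = rng.gauss(0.0, 1.0)
    d = 1.0 if x + rng.gauss(0.0, 1.0) > 0 else 0.0   # treatment depends on x
    rows.append([1.0, d, x])                          # intercept, D, covariate
    y.append(2.0 * d + 1.5 * x + rng.gauss(0.0, 1.0))

# Ignoring x: difference in means between treated and untreated
naive = (statistics.fmean(yi for r, yi in zip(rows, y) if r[1] == 1.0)
         - statistics.fmean(yi for r, yi in zip(rows, y) if r[1] == 0.0))
# Controlling for x: coefficient on D in the regression
_, theta_adjusted, _ = ols(rows, y)
print(f"naive: {naive:.2f}, regression-adjusted: {theta_adjusted:.2f}")
```

In this setup the naive comparison overstates the effect because treated entities tend to have higher \(x\), while the adjusted coefficient should sit near the simulated value of 2.0.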

Instrumental Variables and natural experiments

Regression discontinuity

Synthetic controls

Calculating treatment effects by estimating missing data


Matching is similar to regression. We assume that treatment effects are constant and that the potential outcomes \(y_{i0}\) and \(y_{i1}\) are independent of treatment, conditional on \(X\).

Again, the estimates are biased if this is not the case.

However, we do not have to assume that the effects of \(X\) are linear.

We assume: \(E[y_{ij}|\mathbf x_{i}, D_i]=E[y_{ij}|\mathbf x_{i}]\) for \(j \in \{0,1\}\).

For each entity, find a nearby entity (one with similar \(X\)) that received the opposite treatment, and use its outcome as an estimate of the missing potential outcome.
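A minimal sketch of one-nearest-neighbour matching on a single covariate, assuming a hypothetical data-generating process with a nonlinear covariate effect (\(\sin(3x)+x\)), a constant treatment effect of 2.0, and treatment probability \(0.5+0.3x\). The function names are illustrative. Each entity's missing potential outcome is filled in with the outcome of its closest neighbour in the opposite treatment group.

```python
import math
import random
import statistics

def simulate(n=2000, seed=3):
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        d = int(rng.random() < 0.5 + 0.3 * x)     # treatment more likely for high x
        y = 2.0 * d + math.sin(3.0 * x) + x + rng.gauss(0.0, 0.5)
        data.append((x, d, y))
    return data

def matching_ate(data):
    """Average of y1 - y0, with the unobserved outcome taken from the nearest match."""
    groups = {
        0: [(x, y) for x, d, y in data if d == 0],
        1: [(x, y) for x, d, y in data if d == 1],
    }
    effects = []
    for x, d, y in data:
        # Nearest neighbour in x among entities with the opposite treatment
        _, y_match = min(groups[1 - d], key=lambda unit: abs(unit[0] - x))
        y1, y0 = (y, y_match) if d == 1 else (y_match, y)
        effects.append(y1 - y0)
    return statistics.fmean(effects)

data = simulate()
ate_naive = (statistics.fmean(y for _, d, y in data if d == 1)
             - statistics.fmean(y for _, d, y in data if d == 0))
ate_matched = matching_ate(data)
print(f"naive: {ate_naive:.2f}, matched: {ate_matched:.2f}")
```

No functional form for the covariate is specified in the estimation step, which is the point of contrast with the regression approach.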

Propensity score matching

Match on the propensity score, the probability of receiving treatment given the covariates: \(P(D_i=1|\mathbf x_i)\).
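A rough sketch of propensity score matching with two covariates. It assumes a logistic model for treatment, fitted here by plain gradient ascent to keep the example dependency-free; in practice a statistics library would normally be used. The data-generating process, the constant effect of 2.0 and the names `fit_logit` and `psm_ate` are all hypothetical.

```python
import math
import random
import statistics

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(features, d, steps=100, lr=1.0):
    """Fit P(D=1|x) = sigmoid(w . x) by gradient ascent on the mean log-likelihood."""
    w = [0.0] * len(features[0])
    n = len(features)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for row, di in zip(features, d):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, row)))
            for j, xj in enumerate(row):
                grad[j] += (di - p) * xj / n
        w = [wj + lr * gj for wj, gj in zip(w, grad)]
    return w

def psm_ate(scores, d, y):
    """One-nearest-neighbour matching on the estimated propensity score."""
    groups = {g: [(s, yi) for s, di, yi in zip(scores, d, y) if di == g] for g in (0, 1)}
    effects = []
    for s, di, yi in zip(scores, d, y):
        _, y_match = min(groups[1 - di], key=lambda unit: abs(unit[0] - s))
        effects.append(yi - y_match if di == 1 else y_match - yi)
    return statistics.fmean(effects)

rng = random.Random(4)
features, d, y = [], [], []
for _ in range(1500):
    x1, x2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    di = int(rng.random() < sigmoid(0.8 * x1 - 0.5 * x2))   # assumed treatment model
    features.append([1.0, x1, x2])                          # intercept plus covariates
    d.append(di)
    y.append(2.0 * di + x1 - x2 + rng.gauss(0.0, 0.5))

w = fit_logit(features, d)
scores = [sigmoid(sum(wj * xj for wj, xj in zip(w, row))) for row in features]
ate_psm = psm_ate(scores, d, y)
ate_naive = (statistics.fmean(yi for di, yi in zip(d, y) if di == 1)
             - statistics.fmean(yi for di, yi in zip(d, y) if di == 0))
print(f"naive: {ate_naive:.2f}, propensity-matched: {ate_psm:.2f}")
```

Matching on the one-dimensional score replaces matching on both covariates directly, which matters more as the number of covariates grows.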

Matrix completion

\(E[y_{i1}-y_{i0}|\mathbf x_i]\)

Using semi-parametric methods


Estimating ATE using MCMC

Local Average Treatment Effect (LATE)

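Here we have instrumental variables (IVs) for treatment. The LATE is the average treatment effect for compliers, the entities whose treatment status is changed by the instrument.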
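With a binary instrument \(Z\), one common way to estimate the LATE is the Wald ratio: the effect of \(Z\) on \(y\) divided by the effect of \(Z\) on \(D\). The sketch below assumes a randomly assigned instrument and no defiers; the shares of compliers, never-takers and always-takers, their effects and baselines, and the function names are hypothetical choices for illustration.

```python
import random
import statistics

def simulate_iv(n=20000, seed=5):
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = int(rng.random() < 0.5)             # randomly assigned binary instrument
        u = rng.random()
        if u < 0.5:
            d, effect, base = z, 2.0, 0.0       # complier: treated only when z = 1
        elif u < 0.8:
            d, effect, base = 0, 0.0, -1.0      # never-taker: never treated
        else:
            d, effect, base = 1, 4.0, 1.0       # always-taker: always treated
        y = base + effect * d + rng.gauss(0.0, 1.0)
        rows.append((z, d, y))
    return rows

def wald_late(rows):
    """Ratio of the instrument's effect on y to its effect on D."""
    def mean_by_z(index, z):
        return statistics.fmean([r[index] for r in rows if r[0] == z])
    reduced_form = mean_by_z(2, 1) - mean_by_z(2, 0)   # effect of Z on y
    first_stage = mean_by_z(1, 1) - mean_by_z(1, 0)    # effect of Z on D
    return reduced_form / first_stage

rows = simulate_iv()
late_hat = wald_late(rows)
print(f"estimated LATE: {late_hat:.2f} (complier effect used in the simulation: 2.0)")
```

The estimate targets the complier effect of 2.0 rather than the always-takers' effect of 4.0, which is why the LATE generally differs from the ATE.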

Treatment effects

  • propensity score weighting

  • regression adjustment

  • matching

  • IV

  • regression discontinuity

Meta analysis

big page in advanced analytics? Random effects meta analysis?

Meta-analysis: fixed-effect vs. random-effects models

Types of study:

  • RCT

  • cohort studies

  • case-control studies

  • cross-sectional studies

Dose response curve

Sensitivity analysis

Page on Rubin causal model