Regression trees

Classic regression trees

In a classical regression tree, we follow a decision process as before, but the outcome is a real number.

Within each leaf, all inputs are assigned the same number: the mean \(\bar y\) of the training outcomes in that leaf.

Training

With a regression problem we cannot split nodes the same way as we did for classification.

Instead, we split by minimising the residual sum of squares (RSS).
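
Concretely, a candidate split at threshold \(s\) on a feature \(x\) is scored by the combined RSS of the two children,

\[
\mathrm{RSS}(s) = \sum_{i:\, x_i \le s} (y_i - \bar y_L)^2 + \sum_{i:\, x_i > s} (y_i - \bar y_R)^2,
\]

where \(\bar y_L\) and \(\bar y_R\) are the mean outcomes in the left and right child; we choose the \(s\) that minimises this.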

Training decision trees with Mean Squared Error (MSE)
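
A brute-force sketch of this search over one numeric feature (the function and variable names are illustrative, not from the notes):

```python
import numpy as np

def best_split(x, y):
    """Scan all thresholds on one feature and return the one that
    minimises the combined RSS of the two children."""
    order = np.argsort(x)
    x_s, y_s = x[order], y[order]
    best_rss, best_t = np.inf, None
    for i in range(1, len(y_s)):
        if x_s[i] == x_s[i - 1]:
            continue  # no threshold separates identical feature values
        left, right = y_s[:i], y_s[i:]
        # Each child predicts its own mean, so its error is its RSS.
        rss = ((left - left.mean()) ** 2).sum() \
            + ((right - right.mean()) ** 2).sum()
        if rss < best_rss:
            best_rss, best_t = rss, (x_s[i - 1] + x_s[i]) / 2
    return best_t, best_rss
```

Minimising the children's total RSS is the same as minimising their (weighted) MSE, since the two differ only by the fixed number of points in the node.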

Mixed regression trees

In classical trees, all items in a leaf are assigned the same value. In this model, each leaf instead carries the parameters \(\theta\) of a parametric model.

This makes the resulting trees smoother.

Within a leaf we model \(y_i = f(\mathbf x_i, \theta) + \epsilon_i\), and predict \(\hat y_i = f(\mathbf x_i, \hat\theta)\).

The approach generalises classic regression trees: there the leaf estimate was the mean \(\bar y\); here it is a fitted regression.
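
With a linear model at each leaf \(\ell\) (as in the OLS training below), the prediction is \(\hat y = \hat\theta_\ell^{\top} \mathbf x\), and the classic tree is recovered as the special case where \(f\) is the constant \(\bar y_\ell\).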

Training

At each node we fit an OLS regression. If the model's \(R^2\) falls below some constant, we search for the split that maximises the minimum of the two children's \(R^2\) values.
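
A sketch of this node-splitting rule for a single feature; the threshold R2_STOP and the names are illustrative, since the notes only say "some constant":

```python
import numpy as np
from sklearn.linear_model import LinearRegression

R2_STOP = 0.9  # illustrative stopping threshold

def r2(X, y):
    """R^2 of an OLS fit on the given points."""
    return LinearRegression().fit(X, y).score(X, y)

def split_node(X, y, feature):
    """Return None if the node's OLS fit is already good enough,
    otherwise the threshold maximising the smaller child R^2."""
    if r2(X, y) >= R2_STOP:
        return None  # stop: this node becomes a leaf
    best_t, best_score = None, -np.inf
    for t in np.unique(X[:, feature])[:-1]:
        left = X[:, feature] <= t
        # Need enough points on each side to fit OLS sensibly.
        if left.sum() <= X.shape[1] or (~left).sum() <= X.shape[1]:
            continue
        score = min(r2(X[left], y[left]), r2(X[~left], y[~left]))
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```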

Classifying with probabilistic decision trees

Previously our decision tree classifier was binary.

We can instead adapt the mixed-tree model, fitting a probit model at each leaf so the tree outputs class probabilities rather than hard labels.
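
A minimal sketch of such a leaf using statsmodels' Probit (the helper name fit_probit_leaf and the toy data are illustrative):

```python
import numpy as np
import statsmodels.api as sm

def fit_probit_leaf(X, y):
    """Fit a probit model on the training points that reach this leaf.
    The returned function gives P(y = 1 | x) instead of a hard label."""
    model = sm.Probit(y, sm.add_constant(X)).fit(disp=0)
    return lambda X_new: model.predict(
        sm.add_constant(X_new, has_constant="add")
    )

# Toy usage: probabilities for the leaf's own points
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X @ np.array([1.0, 0.5]) + rng.normal(size=200) > 0).astype(int)
leaf = fit_probit_leaf(X, y)
print(leaf(X[:5]))  # values in (0, 1), not hard 0/1 labels
```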