# Bayesian trees

## Bayesian trees

### Priors of trees

#### Priors for simple trees

We can define a tree as a set of nodes: $$T$$.

For each node we define a splitting variable $$k$$ and a splitting threshold $$r$$.

Our prior is $$P(T, k, r)$$.

We split this up to:

$$P(T, k, r)=P(T)P(k, r|T)$$

$$P(T, k, r)=P(T)P(k|T)P(r|T, k)$$

So we want to estimate:

• $$P(T)$$ - The number of nodes.

• $$P(k|T)$$ - Which variables we split by, given the tree size.

• $$P(r|T, k)$$ - The cutoff, given the tree size and the variables we are splitting by.

#### Priors for mixed trees

If at the leaf we have a parametric model, our prior is instead:

$$P(T, k, r, \theta )=P(T)P(k|T)P(r|T, k)P(\theta | T, k, r)$$

We then need to additionally estimate $$P(\theta | T, k, r)$$.

### The pinball prior

We can generate a tree with a fixed number of leaves, according to our prior.

As we start the tree we assocaite the root node with a count of all leaves.

As we split a node, the remaining leaf counts are divided between the directions. If there is only one leaf left, we do no further splitting.

### Bayesian CART

#### Our prior

Call the collective parameters of the tree $$\Theta =(T, k, r)$$ and $$\theta$$.

Collectively our prior is defined by $$P(\Theta )$$ and $$P(\theta )$$

#### Bayesâ€™ theorem

We want to know the posterior given our data $$X$$.

$$P(\Theta | X)=\dfrac{P(X|\Theta )P(\Theta )}{P(X)}$$

$$P(\Theta | X)\propto P(X|\Theta )P(\Theta )$$

#### Expanded posterior

We know explore $$P(X|\Theta )$$

$$P(X|\Theta )=\int P(X| \theta , \Theta)P(\theta)d\theta$$

This means our posterior is:

$$P(\Theta | X)\propto P(\Theta )\int P(X| \theta , \Theta)P(\theta)d\theta$$

#### Estimation

This can be estimated with MCMC.