This is a method for both building and training a network.
We start with a bare-bones network. We then add nodes one by one, training each new node and then fixing its weights.
This is an alternative to backpropagation for training a feedforward neural network.
We start with random parameters for each layer \(W_i\).
\(\hat y=W_2\sigma (W_1 x)\)
So \(W_1\) is random and not updated.
\(W_2\) is chosen to minimise the loss. The output layer has no activation function, so \(\hat y\) is linear in \(W_2\).
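
The fixed random \(W_1\) and fitted \(W_2\) can be sketched in a few lines of NumPy. This is only an illustration under assumptions not stated in these notes: the loss is taken to be squared error (so fitting \(W_2\) becomes an ordinary least-squares problem), \(\sigma\) is taken to be tanh, and the toy data, the hidden width `n_hidden` and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy regression data: 200 samples, 3 inputs, 1 output.
X = rng.normal(size=(200, 3))
Y = np.sin(X.sum(axis=1, keepdims=True))

def sigma(z):
    # Assumed activation; the notes do not say which sigma is used.
    return np.tanh(z)

n_hidden = 50  # assumed hidden width

# W_1 is drawn at random once and never updated.
W1 = rng.normal(size=(n_hidden, X.shape[1]))

# Hidden activations sigma(W_1 x) for every sample, one row per sample.
H = sigma(X @ W1.T)  # shape (200, 50)

# Assuming squared loss, W_2 minimises ||H W_2^T - Y||^2,
# which is a linear least-squares problem in W_2.
W2_T, *_ = np.linalg.lstsq(H, Y, rcond=None)
W2 = W2_T.T  # shape (1, 50)

# Predictions y_hat = W_2 sigma(W_1 x); no activation on the output layer.
Y_hat = H @ W2.T
mse = float(np.mean((Y_hat - Y) ** 2))
print("training MSE:", mse)
```

No gradient steps are involved: the only trained parameters are in \(W_2\), and they come from a single linear solve. With a different loss, the `lstsq` call would need to be swapped for a suitable convex solver.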