# Dimensionality reduction

## Dimensionality reduction

### Classical principal component analysis

#### Introduction

Principal component analysis takes a dataset \(X\) with \(n\) observations of \(m\) variables and returns a principal component matrix \(A\) of size \(m\times k\), where \(k\le m\).

Each new dimension is a linear combination of the original variables: \(Z=XA\).

The new dimensions are uncorrelated with one another and ordered by descending share of the variance they explain.

The problem of principal component analysis is to find these weightings \(A\).

#### Classical PCA

We center \(X\) and take the first \(k\) eigenvectors of its covariance matrix, ordered by descending eigenvalue. These eigenvectors form the columns of \(A\).
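The eigenvector approach can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function name `pca_eig` and the synthetic data are assumptions made for the example, and the data is centered before the covariance matrix is computed.

```python
import numpy as np

def pca_eig(X, k):
    """Hypothetical helper: PCA via eigendecomposition of the covariance matrix."""
    Xc = X - X.mean(axis=0)                # center each variable
    C = np.cov(Xc, rowvar=False)           # m x m covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)   # symmetric input, ascending eigenvalues
    order = np.argsort(eigvals)[::-1]      # reorder to descending eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    A = eigvecs[:, :k]                     # first k eigenvectors as columns
    Z = Xc @ A                             # projected data, Z = XA
    return A, Z, eigvals

# Synthetic data for illustration only: 200 observations of 5 correlated variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
A, Z, eigvals = pca_eig(X, k=2)
print(A.shape, Z.shape)
print(np.cov(Z, rowvar=False))  # off-diagonal entries are numerically zero
```

The covariance matrix of `Z` is diagonal, which is the "uncorrelated" property, and its diagonal entries are the two largest eigenvalues.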

#### Getting the eigenvectors using SVD

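We can decompose the centered data as \(X=U\Sigma A^T\), with the singular values in \(\Sigma\) in descending order.

Since \(X^TX=A\Sigma^2A^T\), the columns of \(A\) (the right singular vectors) are the eigenvectors of \(X^TX\), and hence of the covariance matrix. We can take the first \(k\) columns of \(A\) as the principal components.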
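As a rough sketch of the SVD route, assuming the data is centered first: the name `pca_svd` and the test data are invented for the example. With \(n\) observations, the eigenvalues of the sample covariance matrix are the squared singular values divided by \(n-1\).

```python
import numpy as np

def pca_svd(X, k):
    """Hypothetical helper: PCA via SVD of the centered data matrix."""
    Xc = X - X.mean(axis=0)                            # center each variable
    U, s, At = np.linalg.svd(Xc, full_matrices=False)  # Xc = U @ diag(s) @ At
    A = At[:k].T                                       # first k right singular vectors
    eigvals = s**2 / (X.shape[0] - 1)                  # covariance eigenvalues
    return A, Xc @ A, eigvals

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
A, Z, eigvals = pca_svd(X, k=2)

# Check that the columns of A are eigenvectors of the covariance matrix.
C = np.cov(X, rowvar=False)
print(np.allclose(C @ A, A * eigvals[:2]))
```

Working from the SVD avoids forming the covariance matrix explicitly, and `np.linalg.svd` already returns the singular values in descending order.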

#### Choosing the number of dimensions

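We can choose \(k\) such that a certain percentage of the variance is retained: for example, the smallest \(k\) for which the first \(k\) eigenvalues make up at least 95% of the sum of all eigenvalues.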
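One way to apply this rule, sketched under the assumption that the eigenvalues are already sorted in descending order. The function name `choose_k` and the default threshold of 0.95 are illustrative choices, not part of the method itself.

```python
import numpy as np

def choose_k(eigvals, threshold=0.95):
    """Hypothetical helper: smallest k whose components retain `threshold` of the variance."""
    eigvals = np.asarray(eigvals, dtype=float)
    cumulative = np.cumsum(eigvals) / eigvals.sum()   # cumulative share of variance
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(eigvals))                       # guard against rounding at the top end

# Cumulative shares here are 0.4, 0.7, 0.9, 1.0.
print(choose_k([4.0, 3.0, 2.0, 1.0], threshold=0.8))  # → 3
```

Here two components retain only 70% of the variance, so three are needed to reach 80%.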

### Robust principal component analysis

#### Robust PCA

Robust PCA can be used to deal with corrupted data, such as corrupted image data.

Rather than clean data \(X\), we observe \(M=L_0+S_0\), where \(L_0\) is the low-rank component we want to recover and \(S_0\) is sparse noise.

In video footage, \(L_0\) can correspond to the background, while \(S_0\) corresponds to movement.
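The notes do not say how the decomposition is computed. One common formulation is Principal Component Pursuit, which minimizes the nuclear norm of \(L\) plus \(\lambda\) times the \(\ell_1\) norm of \(S\) subject to \(L+S=M\). The sketch below solves it with an augmented Lagrangian (ADMM-style) iteration; the function names, the defaults for `lam` and `mu`, and the synthetic test matrix are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(X, tau):
    """Shrink every entry towards zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def singular_value_threshold(X, tau):
    """Shrink the singular values of X by tau."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * soft_threshold(s, tau)) @ Vt

def robust_pca(M, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Hypothetical helper: split M into low-rank L and sparse S."""
    M = np.asarray(M, dtype=float)
    n1, n2 = M.shape
    lam = 1.0 / np.sqrt(max(n1, n2)) if lam is None else lam    # assumed default weight
    mu = n1 * n2 / (4.0 * np.abs(M).sum()) if mu is None else mu  # assumed penalty parameter
    L, S, Y = np.zeros_like(M), np.zeros_like(M), np.zeros_like(M)
    norm_M = np.linalg.norm(M)
    for _ in range(max_iter):
        L = singular_value_threshold(M - S + Y / mu, 1.0 / mu)  # low-rank update
        S = soft_threshold(M - L + Y / mu, lam / mu)            # sparse update
        residual = M - L - S
        Y = Y + mu * residual                                   # dual (multiplier) update
        if np.linalg.norm(residual) <= tol * norm_M:
            break
    return L, S

# Synthetic example: rank-2 matrix with about 5% of entries corrupted.
rng = np.random.default_rng(0)
L0 = rng.normal(size=(80, 2)) @ rng.normal(size=(2, 80))
mask = rng.random((80, 80)) < 0.05
S0 = np.zeros((80, 80))
S0[mask] = rng.choice([-5.0, 5.0], size=mask.sum())
M = L0 + S0

L, S = robust_pca(M)
print("relative error in L:", np.linalg.norm(L - L0) / np.linalg.norm(L0))
```

For video, each frame would be flattened into one column of \(M\), so that the recovered `L` holds the background and `S` the moving objects.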