rearrange: + unsupervised after exploratory + generative after unsupervised + ANOVA in exploratory
stats: Variance Inflation Factors (VIF). avoiding multicollinearity
statistical modelling + Rubin causal model (problem: don’t observe the counterfactual, try to compensate) + vs Pearl directed acyclic graphs
data cleaning split out + shaping/joining + checking data (incl. consistency)
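
sketch for the VIF item above: a minimal NumPy version, assuming the usual definition VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors plus an intercept. The function name `variance_inflation_factors` and the synthetic data are made up for illustration, not taken from any library.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (hypothetical helper, not a library API)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = []
    for j in range(p):
        y = X[:, j]
        # Regress column j on all other columns plus an intercept.
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = resid @ resid
        ss_tot = ((y - y.mean()) ** 2).sum()
        r2 = 1.0 - ss_res / ss_tot
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

# Synthetic example (assumed data): x3 is nearly a copy of x1.
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)
x3 = x1 + 0.1 * rng.normal(size=500)
X = np.column_stack([x1, x2, x3])
vifs = variance_inflation_factors(X)
print(vifs)
```

x1 and x3 should come out with large VIFs and x2 close to 1. A common rule of thumb treats a VIF above roughly 5 to 10 as a sign of problematic multicollinearity, though the cutoff is a convention rather than a hard rule.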