# Variables

## Variables

### Random variables

#### Defining variables

We have a sample space, $$\Omega$$. A random variable $$X$$ is a mapping from the sample space to the real numbers:

$$X: \Omega \rightarrow \mathbb{R}$$

As an example, take a combined experiment: a coin toss and a die roll. The sample space is:

$$\{H1,H2,H3,H4,H5,H6,T1,T2,T3,T4,T5,T6\}$$

A random variable could give us just the die value, such that:

$$X(H1)=X(T1)=1$$

We can define this more precisely using set-builder notation, by saying the following is defined for all $$c\in \mathbb{R}$$:

$$\{\omega |X(\omega )\le c\}$$

That is, for any real number $$c$$ and any random variable $$X$$, there is a corresponding subset of $$\Omega$$: the set of $$\omega$$s whose value under $$X$$ is at most $$c$$.
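A minimal sketch of this setup in Python, using the coin-and-die sample space above; the cutoff $$c=2$$ is chosen purely for illustration:

```python
# Sketch: the coin-toss-plus-die-roll sample space from the text.
omega = [(coin, die) for coin in "HT" for die in range(1, 7)]

# Random variable X: Omega -> R, returning just the die value,
# so X(H1) = X(T1) = 1 and so on.
def X(outcome):
    coin, die = outcome
    return die

# The set {w | X(w) <= c} for c = 2: outcomes whose die value is at most 2.
subset = {w for w in omega if X(w) <= 2}
print(sorted(subset))  # [('H', 1), ('H', 2), ('T', 1), ('T', 2)]
```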

#### Multiple variables

Multiple variables can be defined on the same sample space. If we roll a die, we could define variables for:

• Whether it was odd/even

• Number on the die

• Whether it was less than 3

With more dice we could define even more variables.

#### Derived variables

If we define a variable $$X$$, we can also define another variable $$Y=X^2$$. More generally, any function of a random variable is itself a random variable.
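As a sketch, the variables above (and a derived $$Y=X^2$$) can all live on one die-roll sample space; the function names are hypothetical:

```python
omega = range(1, 7)               # sample space for a single die roll

X = lambda w: w                   # the number on the die
is_even = lambda w: w % 2 == 0    # whether it was even
below_3 = lambda w: w < 3         # whether it was less than 3
Y = lambda w: X(w) ** 2           # derived variable Y = X^2

for w in omega:
    print(w, X(w), is_even(w), below_3(w), Y(w))
```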

### Probability mass functions

$$P(X=x)=P(\{\omega |X(\omega)=x\})$$

For a discrete random variable this is a useful quantity: for a fair die, $$P(X=x)=\frac{1}{6}$$ for each face $$x$$.

This is not helpful for continuous probability, where the chance of any specific outcome is $$0$$.
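A sketch of the mass function for the die value on the coin-and-die sample space, assuming all twelve outcomes are equally likely:

```python
from fractions import Fraction
from collections import Counter

omega = [(coin, die) for coin in "HT" for die in range(1, 7)]
X = lambda w: w[1]   # die value

# P(X = x) = |{w : X(w) = x}| / |Omega| under equal likelihood.
counts = Counter(X(w) for w in omega)
pmf = {x: Fraction(n, len(omega)) for x, n in counts.items()}
print(pmf[1])  # 1/6 — two of the twelve outcomes (H1 and T1) map to 1
```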

### Cumulative distribution functions

#### Definition

Random variables are all real-valued, so we can write:

$$P(X\le x)=P(\{\omega |X(\omega)\le x\})$$

Writing $$F_X(x)=P(X\le x)$$, for a continuous random variable:

$$F_X(x)=\int_{-\infty}^x f_X(u)du$$

And for a discrete random variable:

$$F_X(x)=\sum_{x_i\le x}P(X=x_i)$$

#### Partitions

The events $$\{X< x\}$$, $$\{X=x\}$$ and $$\{X> x\}$$ partition the sample space, so:

$$P(X\le x)+P(X\ge x)-P(X=x)=1$$

#### Interval

$$P(a< X\le b)=F_X(b)-F_X(a)$$
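A sketch for a fair die, checking the interval identity $$P(a< X\le b)=F_X(b)-F_X(a)$$; exact fractions avoid floating-point issues:

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def F(x):
    # Discrete CDF: sum the mass at every value x_i <= x.
    return sum(p for xi, p in pmf.items() if xi <= x)

print(F(3))         # 1/2
print(F(5) - F(2))  # P(2 < X <= 5) = 3/6 = 1/2
```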

### Probability density functions

#### Definition

For a continuous random variable, the probability of any single point is $$0$$. We instead look at the probability density.

Derived from cumulative distribution function:

$$F_X(x)=\int_{-\infty}^x f_X(u)du$$

The density function $$f_X(x)$$ is the derivative of the cumulative distribution function: $$f_X(x)=\dfrac{d}{dx}F_X(x)$$.
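A sketch of the CDF–density relationship, using an exponential distribution with rate 1 as an assumed example ($$f(x)=e^{-x}$$, $$F(x)=1-e^{-x}$$) and a simple midpoint-rule integral:

```python
import math

f = lambda x: math.exp(-x)        # density
F = lambda x: 1 - math.exp(-x)    # CDF

def integrate(g, a, b, n=100_000):
    # Midpoint rule: approximate the integral of g over [a, b].
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

# Integrating the density from the start of the support (0 here) up to x
# recovers the CDF.
print(abs(integrate(f, 0, 2) - F(2)) < 1e-6)  # True
```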

#### Conditional probability distributions

For probability mass functions:

$$P(Y=y|X=x)=\dfrac{P(Y=y\land X=x)}{P(X=x)}$$

For probability density functions:

$$f_Y(y|X=x)=\dfrac{f_{X,Y}(x,y)}{f_X(x)}$$
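A sketch of a conditional mass function; the joint table over two binary variables is hypothetical, chosen only to make the arithmetic visible:

```python
from fractions import Fraction

joint = {
    (0, 0): Fraction(1, 4), (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8),
}

def P_X(x):
    # Marginal of X: sum the joint over y.
    return sum(p for (xi, y), p in joint.items() if xi == x)

def P_Y_given_X(y, x):
    # P(Y = y | X = x) = P(Y = y and X = x) / P(X = x)
    return joint[(x, y)] / P_X(x)

print(P_Y_given_X(1, 1))  # (3/8) / (1/2) = 3/4
```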

## Multiple variables

### Joint and marginal probability

#### Joint probability

$$P(X=x\land Y=y)$$

#### Marginal probability

$$P(X=x)=\sum_{y}P(X=x\land Y=y)$$

$$P(X=x)=\sum_{y}P(X=x|Y=y)P(Y=y)$$
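A sketch of marginalisation in both forms above; the joint table is hypothetical:

```python
from fractions import Fraction

joint = {
    (0, 0): Fraction(1, 6), (0, 1): Fraction(1, 3),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4),
}

def P_X(x):
    # First form: sum the joint over all y.
    return sum(p for (xi, y), p in joint.items() if xi == x)

def P_Y(y):
    return sum(p for (x, yi), p in joint.items() if yi == y)

def P_X_given_Y(x, y):
    return joint[(x, y)] / P_Y(y)

# Second form: summing P(X = x | Y = y) P(Y = y) over y gives the same answer.
for x in (0, 1):
    assert P_X(x) == sum(P_X_given_Y(x, y) * P_Y(y) for y in (0, 1))
print(P_X(0), P_X(1))  # 1/2 1/2
```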

### Independence and conditional independence

#### Independence

$$X$$ is independent of $$Y$$ if, for every value $$x_i$$ of $$X$$ and every value $$y_j$$ of $$Y$$:

$$P(x_i|y_j)=P(x_i)$$

If $$P(x_i|y_j)=P(x_i)$$ then:

$$P(x_i\land y_j)=P(x_i).P(y_j)$$

This logic extends beyond just two events. If the events are independent then:

$$P(x_i\land y_j \land z_k)=P(x_i).P(y_j \land z_k)=P(x_i).P(y_j).P(z_k)$$

Note that because:

$$P(x_i|y_j)=\dfrac{P(x_i\land y_j)}{P(y_j)}$$

If two variables are independent, then:

$$P(x_i|y_j)=\dfrac{P(x_i)P(y_j)}{P(y_j)}$$

$$P(x_i|y_j)=P(x_i)$$
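A sketch checking independence of the coin face and the die value on the combined sample space from earlier; under equal likelihood, $$P(x_i\land y_j)=P(x_i).P(y_j)$$ holds for every pair:

```python
from fractions import Fraction

omega = [(coin, die) for coin in "HT" for die in range(1, 7)]

def P(event):
    # Probability of an event (a predicate on outcomes), equal likelihood.
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

heads = lambda w: w[0] == "H"

for d in range(1, 7):
    die_is_d = lambda w, d=d: w[1] == d
    both = P(lambda w: heads(w) and die_is_d(w))
    assert both == P(heads) * P(die_is_d)   # 1/12 == 1/2 * 1/6
print("coin and die are independent")
```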

#### Conditional independence

$$P(A\land B|X)=P(A|X)P(B|X)$$

Provided $$P(B\land X)>0$$, this is equivalent to:

$$P(A|B\land X)=P(A|X)$$
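A sketch verifying conditional independence numerically. The model is hypothetical: given $$X$$, the binary variables $$A$$ and $$B$$ are drawn independently with biases that depend on $$X$$ (the numbers are illustrative only):

```python
from fractions import Fraction

p_X = {0: Fraction(1, 2), 1: Fraction(1, 2)}
p_A = {0: Fraction(1, 4), 1: Fraction(3, 4)}  # P(A = 1 | X = x)
p_B = {0: Fraction(1, 3), 1: Fraction(2, 3)}  # P(B = 1 | X = x)

# Build the joint P(A, B, X) under the conditional-independence assumption.
joint = {}
for x in (0, 1):
    for a in (0, 1):
        for b in (0, 1):
            pa = p_A[x] if a else 1 - p_A[x]
            pb = p_B[x] if b else 1 - p_B[x]
            joint[(a, b, x)] = p_X[x] * pa * pb

# Check P(A=1 and B=1 | X=0) = P(A=1 | X=0) * P(B=1 | X=0).
p_x0 = sum(p for (a, b, x), p in joint.items() if x == 0)
lhs = joint[(1, 1, 0)] / p_x0
print(lhs == p_A[0] * p_B[0])  # True
```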