Independent and identically distributed random variables

In probability theory and statistics, a collection of random variables is independent and identically distributed if each random variable has the same probability distribution as the others and all are mutually independent. This property is usually abbreviated as i. i. d. or iid. Herein, i. i. d. is used, because it is the most prevalent.

Introduction
In statistics, it is commonly assumed that observations in a sample are effectively i. i. d. The assumption (or requirement) that observations be i. i. d. tends to simplify the underlying mathematics of many statistical methods. In practical applications of statistical modeling, however, the assumption may or may not be realistic. To partially test how realistic the assumption is on a given data set, the autocorrelation can be computed, a lag plot drawn, or a turning point test performed.
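
Two of the diagnostics mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `lag_autocorrelation` and `turning_point_z`, the sample size, and the seed are chosen here for the example, and the turning point statistic uses the usual normal approximation, which presumes continuous data without ties.

```python
import math
import random

def lag_autocorrelation(xs, lag=1):
    """Sample autocorrelation of xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    return cov / var

def turning_point_z(xs):
    """Z-score of the turning point count under the i.i.d. null hypothesis."""
    n = len(xs)
    turning = sum(
        1
        for i in range(1, n - 1)
        if (xs[i - 1] < xs[i] > xs[i + 1]) or (xs[i - 1] > xs[i] < xs[i + 1])
    )
    expected = 2 * (n - 2) / 3      # mean number of turning points if i.i.d.
    variance = (16 * n - 29) / 90   # its variance if i.i.d.
    return (turning - expected) / math.sqrt(variance)

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
iid_sample = [rng.gauss(0, 1) for _ in range(2000)]

# A random walk built from the same draws: each value depends on the
# previous one, so the sequence is not i.i.d.
walk = []
total = 0.0
for step in iid_sample:
    total += step
    walk.append(total)

print("lag-1 autocorrelation:", lag_autocorrelation(iid_sample), lag_autocorrelation(walk))
print("turning point z-score:", turning_point_z(iid_sample), turning_point_z(walk))
```

For the i. i. d. draws both statistics should be close to zero, while the random walk should show a lag-1 autocorrelation near 1 and a strongly negative z-score. Passing such checks does not prove that data are i. i. d.; it only fails to reveal one particular kind of dependence.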

Often the i. i. d. assumption arises in the context of sequences of random variables. In that case, "independent and identically distributed" implies that each element of the sequence is independent of the random variables that came before it. Being i. i. d. does not require that all elements of the sample space or event space be equally probable. For example, repeated throws of a loaded die will produce a sequence that is i. i. d., despite the outcomes being biased.
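
The loaded-die example can be illustrated with a small simulation. The weights below are hypothetical (face 6 is assumed to come up half the time), as are the seed and sample size; the sketch only checks two observable consequences of the i. i. d. property, not the property itself.

```python
import random
from collections import Counter

FACES = [1, 2, 3, 4, 5, 6]
# Hypothetical loading: face 6 has probability 0.5, the others 0.1 each.
WEIGHTS = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]

rng = random.Random(42)  # fixed seed, chosen arbitrarily
rolls = rng.choices(FACES, weights=WEIGHTS, k=60000)

# Identically distributed: the first and second halves of the sequence
# show the same (biased) relative frequencies.
half = len(rolls) // 2
first, second = Counter(rolls[:half]), Counter(rolls[half:])
freq_first = {f: first[f] / half for f in FACES}
freq_second = {f: second[f] / half for f in FACES}

# Independent: the frequency of a 6 right after a 6 matches the
# overall frequency of a 6.
p_six = rolls.count(6) / len(rolls)
after_six = [b for a, b in zip(rolls, rolls[1:]) if a == 6]
p_six_after_six = after_six.count(6) / len(after_six)

print(freq_first)
print(freq_second)
print(p_six, p_six_after_six)
```

The frequencies are far from uniform, yet they are stable across the sequence and unaffected by the previous roll, which is what i. i. d. requires.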

Definition for two random variables
Suppose that the random variables $$X$$ and $$Y$$ are defined to assume values in $$I \subseteq \mathbb{R}$$. Let $$F_X(x) = \operatorname{P}(X\leq x)$$ and $$F_Y(y) = \operatorname{P}(Y\leq y)$$ be the cumulative distribution functions of $$X$$ and $$Y$$, respectively, and denote their joint cumulative distribution function by $$F_{X,Y}(x,y) = \operatorname{P}(X\leq x \land Y\leq y)$$.

Two random variables $$X$$ and $$Y$$ are identically distributed if and only if $$F_X(x)=F_Y(x) \, \forall x \in I$$.

Two random variables $$X$$ and $$Y$$ are independent if and only if $$F_{X,Y}(x,y) = F_{X}(x) \cdot F_{Y}(y) \, \forall x,y \in I$$. (See further Independence (probability theory).)

Two random variables $$X$$ and $$Y$$ are i. i. d. if they are independent and identically distributed, i.e. if and only if

$$ \begin{align} & F_X(x)=F_Y(x) \, & \forall x \in I \\ & F_{X,Y}(x,y) = F_{X}(x) \cdot F_{Y}(y) \, & \forall x,y \in I \end{align} $$
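
Both conditions can be checked approximately on simulated data by replacing the cumulative distribution functions with empirical ones. The following is a rough sketch, not a formal test: the helper names `ecdf` and `joint_ecdf`, the evaluation grid, and the Uniform(0, 1) samples are assumptions made for the illustration.

```python
import random

def ecdf(sample, x):
    """Empirical estimate of P(X <= x)."""
    return sum(1 for s in sample if s <= x) / len(sample)

def joint_ecdf(xs, ys, x, y):
    """Empirical estimate of P(X <= x and Y <= y) from paired samples."""
    return sum(1 for a, b in zip(xs, ys) if a <= x and b <= y) / len(xs)

def max_factorization_gap(xs, ys, grid):
    """Largest |F_XY(x, y) - F_X(x) * F_Y(y)| over the grid (empirical)."""
    return max(
        abs(joint_ecdf(xs, ys, s, t) - ecdf(xs, s) * ecdf(ys, t))
        for s in grid
        for t in grid
    )

rng = random.Random(1)  # fixed seed, chosen arbitrarily
n = 20000
xs = [rng.random() for _ in range(n)]
ys = [rng.random() for _ in range(n)]  # drawn separately: i.i.d. with xs
zs = list(xs)                          # a copy: same distribution, fully dependent

grid = [0.25, 0.5, 0.75]
max_marginal_gap = max(abs(ecdf(xs, t) - ecdf(ys, t)) for t in grid)
gap_xy = max_factorization_gap(xs, ys, grid)
gap_xz = max_factorization_gap(xs, zs, grid)

print(max_marginal_gap, gap_xy, gap_xz)
```

The pair `xs`, `ys` satisfies both conditions up to sampling noise. The pair `xs`, `zs` is identically distributed but violates the factorization, since for example the joint value at (0.5, 0.5) is about 0.5 while the product of the marginals is about 0.25.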

Definition for more than two random variables
The definition extends naturally to more than two random variables. We say that $$n$$ random variables $$X_1,\ldots,X_n$$ are i. i. d. if they are independent (see further Independence (probability theory)) and identically distributed, i.e. if and only if

$$ \begin{align} & F_{X_1}(x)=F_{X_k}(x) \, & \forall k \in \{1,\ldots,n \} \text{ and } \forall x \in I \\ & F_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = F_{X_1}(x_1) \cdot \ldots \cdot F_{X_n}(x_n) \, & \forall x_1,\ldots,x_n \in I \end{align} $$

where $$F_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = \operatorname{P}(X_1\leq x_1 \land \ldots \land X_n\leq x_n)$$ denotes the joint cumulative distribution function of $$X_1,\ldots,X_n$$.
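
As a short worked consequence of this definition (the symbols $$F$$ and $$M_n$$ are introduced here only for illustration), write $$F$$ for the common cumulative distribution function and $$M_n = \max(X_1,\ldots,X_n)$$. Setting $$x_1 = \cdots = x_n = x$$ in the factorization gives

$$ \operatorname{P}(M_n \leq x) = F_{X_1,\ldots,X_n}(x,\ldots,x) = \prod_{k=1}^{n} F_{X_k}(x) = F(x)^n $$

For instance, for two rolls of a fair die, the probability that the larger result is at most 3 is $$(3/6)^2 = 1/4$$.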

Examples
The following are examples or applications of i. i. d. random variables:
 * A sequence of outcomes of spins of a fair or unfair roulette wheel is i. i. d. One implication of this is that if the roulette ball lands on "red", for example, 20 times in a row, the next spin is no more or less likely to be "black" than on any other spin.
 * A sequence of rolls of a fair or loaded die is i. i. d.
 * A sequence of fair or unfair coin flips is i. i. d.
 * In signal processing and image processing, the notion of transformation to i. i. d. implies two specifications, the "i.d." (identically distributed) part and the "i." (independent) part:
   * (i.d.) the signal level must be balanced on the time axis;
   * (i.) the signal spectrum must be flattened, i.e. transformed by filtering (such as deconvolution) to a white noise signal (i.e. a signal where all frequencies are equally present).
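
The roulette example above can be explored with a simulation. The sketch assumes a single-zero wheel (18 red, 18 black, and 1 green pocket) and uses a streak of 5 reds instead of 20, purely so that enough streaks occur in a modest number of simulated spins; the seed and the number of spins are likewise arbitrary.

```python
import random

# Assumed single-zero wheel: 18 red, 18 black, 1 green pocket.
POCKETS = ["red"] * 18 + ["black"] * 18 + ["green"]
STREAK = 5  # shorter than the 20 in the text so that streaks are common

rng = random.Random(7)  # fixed seed, chosen arbitrarily
spins = [rng.choice(POCKETS) for _ in range(400000)]

# Overall relative frequency of black.
p_black = spins.count("black") / len(spins)

# Outcomes of the spins that immediately follow at least STREAK reds in a row.
after_streak = []
run = 0
for prev, nxt in zip(spins, spins[1:]):
    run = run + 1 if prev == "red" else 0
    if run >= STREAK:
        after_streak.append(nxt)

p_black_after_streak = after_streak.count("black") / len(after_streak)

print(p_black, p_black_after_streak, len(after_streak))
```

Up to sampling noise, the two frequencies agree: a run of reds tells nothing about the next spin, which is the independence part of the i. i. d. property.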