The Bernoulli process is the simplest non-trivial [[Stochastic Process]]. It is an infinite sequence of [[Independence and Identical Distribution|i.i.d.]] [[Bernoulli Distribution|Bernoulli]] trials $\{X_i\}$, each with a fixed probability of success $p$.

**Contrast to Binomial:**
- The [[Binomial Distribution]] also deals with repeated Bernoulli trials, but for a fixed number of trials $n$. There we are interested in the number of successes, which is a Binomial [[Random Variable]].
- The Bernoulli process is an ongoing sequence, and we are interested in patterns within that sequence (e.g. the time to the $k$-th success).

**Related Distributions:**
- *Binomial:* Counts the number of successes in the first $n$ trials.
- *Negative binomial:* Counts the number of failures before the $r$-th success.
- *Geometric:* Counts the number of failures before the first success. This is the special case of the negative binomial with $r=1$. The sections below use the shifted convention that counts trials up to and including the first success, which has mean $1/p$.

## Fresh-Start Property

If the Bernoulli process $\{X_1, X_2, \dots\}$ is split at a causally determined point $n$, the subsequence starting at $X_{n+1}$ forms a new Bernoulli process $\{Y_1, Y_2,\dots\}$ on its own. This is known as the fresh-start property.

$$
\begin{align}
X&=X_1,X_2, \dots \\
Y&=X_{n+1},X_{n+2}, \dots
\end{align}
$$

For this to be true, $n$ must be chosen based only on information up to $X_n$ (i.e. before $Y$ starts). The subsequence $Y$ inherits the independence of the original process, so $Y$ is independent of the past: $(X_1, \dots, X_n) \perp Y$.

## Time to k-th Success

We know that the time until the first success is described by the [[Geometric Distribution]]. Now we want to find $Y_k$, a r.v. describing the time until the $k$-th success.

Due to the fresh-start property, we can split the time until the $k$-th success into $k$ independent inter-arrival times $(T_1, T_2, \dots, T_k)$. Each $T_i$ is the number of trials from just after the previous success up to and including the next success, and is therefore geometrically distributed, $T_i \sim \mathrm{Geom}(p)$.
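The decomposition into inter-arrival times can be checked on a simulated path. The following is a minimal sketch and not part of the original note: the helper names `bernoulli_process` and `interarrival_times`, the seed, the path length and $p = 0.3$ are all illustrative assumptions. It counts trials up to and including each success, so the average gap should land near $1/p$.

```python
import random
from itertools import accumulate


def bernoulli_process(p, n, rng):
    """Return the first n trials of a Bernoulli(p) process as a list of 0/1."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def interarrival_times(trials):
    """Trials elapsed between consecutive successes.

    The first gap is counted from trial 1; each gap includes the
    trial on which the success occurs.
    """
    gaps, last = [], 0
    for t, x in enumerate(trials, start=1):
        if x == 1:
            gaps.append(t - last)
            last = t
    return gaps


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
p = 0.3                 # illustrative success probability
trials = bernoulli_process(p, 200_000, rng)

gaps = interarrival_times(trials)        # T_1, T_2, ...
arrival_times = list(accumulate(gaps))   # Y_1, Y_2, ... (running sums of the gaps)

mean_gap = sum(gaps) / len(gaps)
print(f"mean inter-arrival time: {mean_gap:.3f}  (1/p = {1 / p:.3f})")
```

Here `arrival_times[k - 1]` is the trial index of the $k$-th success on this one path, obtained purely by summing the gaps.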
Summing the inter-arrival times gives the time of the $k$-th success:

$$
Y_k=T_1+ \dots +T_k
$$

Using the properties of the Geometric distribution:

$$
\mathbb E[T_i]= \frac{1}{p}, \quad \mathrm{Var}(T_i)=\frac{1-p}{p^2}
$$

Since the $T_i$ are independent, both the means and the variances add, so the [[Expectation]] and [[Variance]] of $Y_k$ are:

$$
\mathbb{E}[Y_k]=\frac{k}{p}, \quad \mathrm{Var}(Y_k)=\frac{k(1-p)}{p^2}
$$

The [[Probability Mass Function|PMF]] of $Y_k$ follows from writing the event $\{Y_k=t\}$ as the intersection of two independent events:
- There are exactly $(k-1)$ successes in the first $(t-1)$ trials.
- There is a success at trial $t$.

For $t = k, k+1, \dots$:

$$
\begin{align}
\mathbf P(Y_k=t) &= \binom{t-1}{k-1} p^{k-1} (1-p)^{t-k} \cdot p \\
&= \binom{t-1}{k-1} p^k (1-p)^{t-k}
\end{align}
$$

![[bernoulli-process-k-arrival.png|center|400]]

## Merged Bernoulli Process

Given two independent Bernoulli processes $X_t \sim \mathrm{Ber}(p)$ and $Y_t \sim \mathrm{Ber}(q)$, the merged process $Z_t$ is defined as:

$$
Z_t = \begin{cases} 1 &\text{if } X_t=1 \text{ or } Y_t=1 \\ 0 &\text{otherwise} \end{cases}
$$

![[bernoulli-process-merged.png|center|400]]

The merged process $Z_t$ is also a Bernoulli process, with success probability $(p+q-pq)$:

$$
Z_t = \begin{cases} 1 & \text{with probability } (1-q)p+(1-p)q+pq = p+q-pq \\ 0 & \text{with probability } (1-p)(1-q) \end{cases}
$$

$Z_t$ is a valid Bernoulli process because the $Z_i$ are independent of each other and all share the same success probability. Each $Z_t$ depends only on the pair $(X_t, Y_t)$, and:
- The two processes are independent of each other ($X_t \perp Y_s$ for all $s, t$).
- Each process is independent across time ($X_t \perp X_s$ and $Y_t \perp Y_s$ for $s \neq t$).

## Split Bernoulli Process

**Original process:** A Bernoulli process $X_t$ with success probability $p$ can be split into two processes $Y_t$ and $Z_t$ based on a probability $q$.

$$
X_t=\begin{cases} 1 & \text{with probability } p \\ 0 & \text{otherwise} \end{cases}
$$

**Split processes:** Whenever $X_t$ is a success, we count it as a success in $Y_t$ with probability $q$ and as a success in $Z_t$ with probability $(1-q)$.
$$
Y_t = \begin{cases} 1 & \text{with probability } pq \\ 0 & \text{otherwise} \end{cases}
\quad \quad
Z_t = \begin{cases} 1 & \text{with probability } p(1-q) \\ 0 & \text{otherwise} \end{cases}
$$

![[bernoulli-process-split.png|center|400]]

Both $Y_t$ and $Z_t$ are valid Bernoulli processes, which means that each is independent across time within itself. This is true because:
- The original process $X_t$ is independent across all $X_i$.
- The allocation decision, made with probability $q$, is independent of $X_t$ and of the allocations at other times.

However, $Y_t$ and $Z_t$ are not independent of each other, since a success in $Y_t$ ensures that there is no success in $Z_t$.
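
The merging and splitting constructions can be checked empirically. The following is a minimal sketch and not part of the original note: the helpers `merge`, `split` and `rate`, the seed, and the parameter values are illustrative assumptions. Because the note uses $q$ for two different things, the code separates them as `q_rate` (success probability of the second merged process) and `q_alloc` (allocation probability in the split).

```python
import random


def merge(xs, ys):
    """Merged process: 1 at time t if either input has a success at t."""
    return [x | y for x, y in zip(xs, ys)]


def split(xs, q_alloc, rng):
    """Route each success of xs to the first stream w.p. q_alloc, else to the second.

    The allocation coin is drawn at every step, independently of xs.
    """
    first, second = [], []
    for x in xs:
        to_first = rng.random() < q_alloc
        first.append(x if to_first else 0)
        second.append(0 if to_first else x)
    return first, second


def rate(seq):
    """Empirical success frequency of a 0/1 sequence."""
    return sum(seq) / len(seq)


rng = random.Random(1)        # fixed seed, chosen arbitrarily
n, p = 200_000, 0.3           # illustrative values
q_rate, q_alloc = 0.4, 0.4    # illustrative values

# Merging two independent processes
a = [int(rng.random() < p) for _ in range(n)]
b = [int(rng.random() < q_rate) for _ in range(n)]
merged = merge(a, b)
print(f"merged rate: {rate(merged):.4f}  vs  p+q-pq = {p + q_rate - p * q_rate:.4f}")

# Splitting one process into two
x = [int(rng.random() < p) for _ in range(n)]
y, z = split(x, q_alloc, rng)
print(f"split rates: {rate(y):.4f}, {rate(z):.4f}  "
      f"vs  pq = {p * q_alloc:.4f}, p(1-q) = {p * (1 - q_alloc):.4f}")

# Dependence between the split streams: they never succeed at the same time
both = sum(yt * zt for yt, zt in zip(y, z))
print(f"simultaneous successes in y and z: {both}")
```

The count of simultaneous successes is zero by construction, whereas two independent streams with these rates would coincide on a fraction of roughly $pq \cdot p(1-q)$ of the trials.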