The Bernoulli process is the simplest non-trivial [[Stochastic Process]]. It is an infinite sequence of [[Independence and Identical Distribution|i.i.d.]] [[Bernoulli Distribution|Bernoulli]] trials $\{X_i\}$, each with the same fixed probability of success $p$.
**Contrast to Binomial:**
- The [[Binomial Distribution]] also deals with repeated Bernoulli trials, but for a fixed $n$. We are interested in the number of successes from a Binomial [[Random Variable]].
- The Bernoulli Process is an ongoing sequence, and we are interested in patterns of that sequence (e.g. time to $k$-th success).
**Related Distributions:**
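- *Binomial:* Counts the number of successes in the first $n$ trials.
- *Negative binomial:* Counts the number of failures before the $r$-th success. Some texts count the total number of trials instead, which is the convention used for the time to the $k$-th success later in this note.
- *Geometric:* Counts the number of failures before the first success. This is the special case of the negative binomial where $r=1$. Under the trial-counting convention it counts the trials up to and including the first success, which gives the mean $1/p$.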
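The three counts listed above can all be read off the same sequence of trials. The following is a minimal sketch, assuming a finite prefix of the process is enough for illustration; the function names, the seed, and the parameter values are hypothetical choices, not part of the definitions.

```python
import random

def bernoulli_prefix(p, n, rng):
    """Return the first n trials of a Bernoulli(p) process as a list of 0/1."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def successes_in_first(trials, n):
    """Binomial-style count: number of successes among the first n trials."""
    return sum(trials[:n])

def failures_before_rth_success(trials, r):
    """Negative-binomial-style count: failures seen before the r-th success.
    Returns None if the prefix contains fewer than r successes."""
    successes = 0
    failures = 0
    for x in trials:
        if x == 1:
            successes += 1
            if successes == r:
                return failures
        else:
            failures += 1
    return None

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
trials = bernoulli_prefix(p=0.3, n=50, rng=rng)
print(successes_in_first(trials, 20))          # binomial-style count
print(failures_before_rth_success(trials, 1))  # geometric case, r = 1
print(failures_before_rth_success(trials, 3))  # negative binomial, r = 3
```

The counting functions take the trial list as input, so they can be checked on hand-written sequences as well as on simulated ones.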
## Fresh-Start Property
If the Bernoulli Process $\{X_1, X_2, \dots\}$ is split at a causally determined point $n$, the subsequence starting at $X_{n+1}$ forms a new Bernoulli process $\{Y_1, Y_2,\dots\}$ on its own. This is known as the fresh-start property.
$ \begin{align}
X&=X_1,X_2 \dots \\ Y&=X_{n+1},X_{n+2}, \dots
\end{align} $
For this to be true, $n$ must be chosen based only on information up to $X_n$ (i.e. before $Y$ starts). The subsequence $Y$ inherits the independence of the original process, making $Y$ independent of the past $(X_1, \dots, X_n)$.
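As a rough empirical check of the fresh-start property, the sketch below uses one particular causal rule: split right after the first success. The rule, the function name, the seed, and the number of runs are assumptions made for illustration. If the property holds, the trial immediately after the split should still succeed with frequency close to $p$.

```python
import random

def next_after_first_success(p, rng):
    """Run trials until the first success (a causal stopping rule),
    then return the very next trial, i.e. Y_1 of the restarted process."""
    while rng.random() >= p:  # a draw >= p is a failure; keep going
        pass
    return 1 if rng.random() < p else 0

rng = random.Random(1)  # fixed seed, chosen arbitrarily
p = 0.3
runs = 20_000
freq = sum(next_after_first_success(p, rng) for _ in range(runs)) / runs
print(freq)  # should be close to p
```

By contrast, a rule such as "split one trial before the first success" looks into the future: the next trial would then be a success every time, so the restarted sequence would not be a Bernoulli process with parameter $p$.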
## Time to k-th Success
We know that the time until the first success is described by the [[Geometric Distribution]]. Now we want to find $Y_k$, an r.v. describing the time (number of trials) until the $k$-th success.
Due to the fresh-start property, the process restarts after every success, so the waiting time splits into $k$ independent inter-arrival times $(T_1, T_2, \dots, T_k)$. Each $T_i$ is the number of trials from just after the previous success up to and including the next success, and follows a geometric distribution $T_i \sim \mathrm{Geom}(p)$.
$ Y_k=T_1+ \dots +T_k $
Using the properties of the Geometric distribution:
$ \mathbb E[T_i]= \frac{1}{p}, \quad \mathrm{Var}(T_i)=\frac{1-p}{p^2} $
By linearity of expectation, and because the $T_i$ are independent, the [[Expectation]] and [[Variance]] of $Y_k$ are:
$ \mathbb{E}[Y_k]=\frac{k}{p}, \quad \mathrm{Var}(Y_k)=\frac{k(1-p)}{p^2} $
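The formulas for $\mathbb{E}[Y_k]$ and $\mathrm{Var}(Y_k)$ can be sanity-checked by simulation. This is a minimal sketch: the function name, seed, and the values $k=3$, $p=0.25$ are arbitrary assumptions, and the sample statistics will only approximate the theoretical values.

```python
import random
import statistics

def time_to_kth_success(k, p, rng):
    """Run the process trial by trial and return the index of the k-th success."""
    t = 0
    successes = 0
    while successes < k:
        t += 1
        if rng.random() < p:
            successes += 1
    return t

rng = random.Random(2)  # fixed seed, chosen arbitrarily
k, p = 3, 0.25
samples = [time_to_kth_success(k, p, rng) for _ in range(20_000)]

sample_mean = statistics.mean(samples)
sample_var = statistics.variance(samples)
print(sample_mean, k / p)               # theoretical mean is 12
print(sample_var, k * (1 - p) / p**2)   # theoretical variance is 36
```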
The [[Probability Mass Function|PMF]] of $Y_k$ follows from writing the event $\{Y_k=t\}$, for $t \ge k$, as the intersection of two independent events:
- There are exactly $(k-1)$ successes in the first $(t-1)$ trials.
- There is a success at trial $t$.
$
\begin{align}
\mathbf P(Y_k=t) &= {t-1 \choose k-1}*p^{k-1}*(1-p)^{t-k}*p \\
&={t-1 \choose k-1}*p^k*(1-p)^{t-k}
\end{align}
$
![[bernoulli-process-k-arrival.png|center|400]]
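The PMF can also be evaluated numerically. The sketch below is one possible implementation using `math.comb`; the function name and the values $k=3$, $p=0.4$ are illustrative assumptions. It checks two things on a truncated range of $t$: that the probabilities sum to approximately 1, and that the implied mean agrees with $k/p$.

```python
from math import comb

def kth_success_pmf(t, k, p):
    """P(Y_k = t): exactly k-1 successes in the first t-1 trials,
    followed by a success at trial t. Zero when t < k."""
    if t < k:
        return 0.0
    return comb(t - 1, k - 1) * p**k * (1 - p) ** (t - k)

k, p = 3, 0.4
ts = range(k, 400)  # truncation point; the tail beyond it is negligible here
total = sum(kth_success_pmf(t, k, p) for t in ts)
mean = sum(t * kth_success_pmf(t, k, p) for t in ts)
print(total)  # approximately 1
print(mean)   # approximately k / p = 7.5
```

With $k=1$ the function reduces to the geometric PMF $p(1-p)^{t-1}$.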
## Merged Bernoulli Process
Given two independent Bernoulli processes with $X_t \sim \mathrm{Ber}(p)$ and $Y_t \sim \mathrm{Ber}(q)$, the merged process $Z_t$ is defined as:
$ Z_t = \begin{cases} 1 &\text{if } X_t=1 \text{ or } Y_t=1 \\ 0 &\text{otherwise} \end{cases} $
![[bernoulli-process-merged.png|center|400]]
The merged process $Z_t$ is also a Bernoulli process with success probability $(p+q-pq)$.
$ Z_t =
\begin{cases} 1 & \text{with probability } (1-q)p+(1-p)q+pq \\ 0 & \text{with probability } (1-p)(1-q)
\end{cases} $
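The new process $Z_t$ is a valid Bernoulli process: each $Z_t$ is a function of the pair $(X_t, Y_t)$ only, so the $Z_t$ are independent across time and identically distributed. This relies on:
- $X_t \perp Y_s$ for all $t, s$ (the two processes are independent of each other)
- $X_t \perp X_s$ for $t \neq s$
- $Y_t \perp Y_s$ for $t \neq s$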
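The success probability of the merged process can be verified by enumerating the four joint outcomes of $(X_t, Y_t)$ at a single time step. This is a small sketch using exact fractions; the function name and the example values $p = 1/4$, $q = 1/3$ are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

def merged_success_probability(p, q):
    """Sum the probabilities of all (x, y) outcomes where x = 1 or y = 1,
    using independence of X_t and Y_t to multiply the marginals."""
    total = Fraction(0)
    for x, y in product((0, 1), repeat=2):
        prob = (p if x else 1 - p) * (q if y else 1 - q)
        if x == 1 or y == 1:
            total += prob
    return total

p, q = Fraction(1, 4), Fraction(1, 3)
print(merged_success_probability(p, q))  # 1/2
print(p + q - p * q)                     # 1/2
```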
## Split Bernoulli Process
**Original process:** A Bernoulli process $X_t$ with success probability $p$ can be split into two processes $Y_t$ and $Z_t$ based on a probability $q$.
$ X_t=\begin{cases} 1 & \text{with probability } p \\ 0 & \text{otherwise} \end{cases} $
**Split processes:** Whenever $X_t$ is a success, we count it as a success in $Y_t$ with probability $q$, and otherwise (with probability $1-q$) as a success in $Z_t$.
$
\quad \quad Y_t = \begin{cases} 1 & \text{with probability } pq \\ 0 & \text{otherwise} \end{cases}
\quad \quad Z_t = \begin{cases} 1 & \text{with probability } p(1-q) \\ 0 & \text{otherwise} \end{cases}
$
![[bernoulli-process-split.png|center|400]]
Both $Y_t$ and $Z_t$ are valid Bernoulli processes, which means that each is independent across time within itself. This is true because:
- The original process $X_t$ is independent across all $X_i$.
- The allocation decision at each time step is made with the same probability $q$, independently of $X_t$ and of the decisions at other time steps.
However, $Y_t$ and $Z_t$ are not independent of each other, since a success in $Y_t$ ensures that there is no success in $Z_t$ at the same time step.
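The split can be simulated to see both claims at once: the success rates of $Y_t$ and $Z_t$, and the fact that they never succeed at the same time step. This is a minimal sketch; `split_process`, the seed, and the values of $p$, $q$, and $n$ are hypothetical choices for illustration.

```python
import random

def split_process(xs, q, rng):
    """Route each success of xs to Y with probability q, otherwise to Z.
    Failures of xs stay failures in both output sequences."""
    ys, zs = [], []
    for x in xs:
        to_y = rng.random() < q  # allocation coin, drawn independently of x
        ys.append(1 if x == 1 and to_y else 0)
        zs.append(1 if x == 1 and not to_y else 0)
    return ys, zs

rng = random.Random(3)  # fixed seed, chosen arbitrarily
p, q, n = 0.5, 0.4, 50_000
xs = [1 if rng.random() < p else 0 for _ in range(n)]
ys, zs = split_process(xs, q, rng)

print(sum(ys) / n)  # should be close to p * q = 0.2
print(sum(zs) / n)  # should be close to p * (1 - q) = 0.3
print(max(y * z for y, z in zip(ys, zs)))  # 0: never a success in both
```

Since $Y_t Z_t = 0$ always, $\mathbb{E}[Y_t Z_t] = 0$ while $\mathbb{E}[Y_t]\,\mathbb{E}[Z_t] = p^2 q(1-q) > 0$ for $0 < p, q < 1$, which confirms the dependence between the two processes.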