Negative hypergeometric distribution

Parameters	$N\in \left\{0,1,2,\dots \right\}$ - total number of elements; $K\in \left\{0,1,2,\dots ,N\right\}$ - total number of 'success' elements; $r\in \left\{0,1,2,\dots ,N-K\right\}$ - number of failures when experiment is stopped
Support	$k\in \left\{0,\,\dots ,\,K\right\}$ - number of successes when experiment is stopped.
PMF	${\frac {{{k+r-1} \choose {k}}{{N-r-k} \choose {K-k}}}{N \choose K}}$
Mean	$r{\frac {K}{N-K+1}}$
Variance	$r{\frac {(N+1)K}{(N-K+1)(N-K+2)}}[1-{\frac {r}{N-K+1}}]$

In probability theory and statistics, the negative hypergeometric distribution describes probabilities when sampling from a finite population without replacement, where each sampled element can be classified into one of two mutually exclusive categories, such as Pass/Fail or Employed/Unemployed. As random selections are made from the population, each draw decreases the population, so the probability of success changes from draw to draw. Unlike the standard hypergeometric distribution, which describes the number of successes in a sample of fixed size, in the negative hypergeometric distribution samples are drawn until $r$ failures have been found, and the distribution describes the probability of finding $k$ successes in such a sample. In other words, the negative hypergeometric distribution describes the likelihood of $k$ successes in a sample that ends with the $r$-th failure and therefore contains exactly $r$ failures.

Definition

There are $N$ elements, of which $K$ are defined as "successes" and the rest are "failures".

Elements are drawn one after the other, without replacement, until $r$ failures are encountered. Then the drawing stops and the number $k$ of successes is counted. The negative hypergeometric distribution, $NHG_{N,K,r}(k)$, is the discrete distribution of this $k$.^[1]
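The drawing procedure can be checked by brute force for small populations. The following is a minimal sketch, not a reference implementation: the function name `enumerate_nhg` is hypothetical, and it assumes $1\leq r\leq N-K$ so that the $r$-th failure always occurs. Every arrangement of the $K$ successes among the $N$ positions is equally likely, so counting arrangements gives exact probabilities.

```python
from fractions import Fraction
from itertools import combinations


def enumerate_nhg(N, K, r):
    """Exact distribution of the number of successes drawn before the r-th failure.

    Assumes 1 <= r <= N - K. Returns a list p with p[k] = Pr(X = k), k = 0..K.
    """
    counts = [0] * (K + 1)
    total = 0
    # Each equally likely ordering is identified by the positions of the K successes.
    for success_positions in combinations(range(N), K):
        successes = set(success_positions)
        failures_seen = 0
        k = 0
        for pos in range(N):
            if pos in successes:
                k += 1
            else:
                failures_seen += 1
                if failures_seen == r:
                    break  # drawing stops at the r-th failure
        counts[k] += 1
        total += 1
    return [Fraction(c, total) for c in counts]


# Illustrative values only: 5 elements, 2 successes, stop at the 2nd failure.
print(enumerate_nhg(5, 2, 2))
```

Enumeration grows as $\binom{N}{K}$, so this approach is only practical for small $N$; it is meant as a cross-check of the definition rather than a way to compute the distribution.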

The negative hypergeometric distribution is a special case of the beta-binomial distribution^[2] with parameters $\alpha =r$ and $\beta =N-K-r+1$ both being integers (and $n=K$ ).
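The beta-binomial correspondence can be verified numerically. The sketch below assumes integer $\alpha$ and $\beta$ (so the beta function reduces to factorials) and $r\geq 1$; the helper names `beta_int`, `beta_binomial_pmf` and `nhg_pmf` are illustrative, and the parameter values are arbitrary examples.

```python
from fractions import Fraction
from math import comb, factorial


def beta_int(a, b):
    """Beta function B(a, b) for positive integers a and b, as an exact fraction."""
    return Fraction(factorial(a - 1) * factorial(b - 1), factorial(a + b - 1))


def beta_binomial_pmf(k, n, alpha, beta):
    """Beta-binomial pmf, restricted here to positive integer alpha and beta."""
    return comb(n, k) * beta_int(k + alpha, n - k + beta) / beta_int(alpha, beta)


def nhg_pmf(k, N, K, r):
    """Negative hypergeometric pmf as given in the infobox (assumes r >= 1)."""
    return Fraction(comb(k + r - 1, k) * comb(N - r - k, K - k), comb(N, K))


# Arbitrary example parameters: alpha = r, beta = N - K - r + 1, n = K.
N, K, r = 12, 5, 3
matches = all(
    nhg_pmf(k, N, K, r) == beta_binomial_pmf(k, K, r, N - K - r + 1)
    for k in range(K + 1)
)
print(matches)
```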

The outcome requires that we observe $k$ successes in the first $(k+r-1)$ draws and that the $(k+r){\text{-th}}$ draw is a failure. The probability of the former can be found by direct application of the hypergeometric distribution $(HG_{N,K,k+r-1}(k))$, and the probability of the latter is simply the number of failures remaining $(=N-K-(r-1))$ divided by the size of the remaining population $(=N-(k+r-1))$. The probability of having exactly $k$ successes up to the $r{\text{-th}}$ failure (i.e. the drawing stops as soon as the sample includes the predefined number of $r$ failures) is then the product of these two probabilities:

{\frac {{\binom {K}{k}}{\binom {N-K}{k+r-1-k}}}{\binom {N}{k+r-1}}}\cdot {\frac {N-K-(r-1)}{N-(k+r-1)}}={\frac {{{k+r-1} \choose {k}}{{N-r-k} \choose {K-k}}}{N \choose K}}.
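The equality can be checked by expanding the binomial coefficients into factorials. A sketch of the intermediate steps, using $(N-K-r+1)!/(N-K-r+1)=(N-K-r)!$ and $(N-k-r+1)!/(N-k-r+1)=(N-k-r)!$ and assuming $r\geq 1$:

${\begin{aligned}{\frac {{\binom {K}{k}}{\binom {N-K}{r-1}}}{\binom {N}{k+r-1}}}\cdot {\frac {N-K-r+1}{N-k-r+1}}&={\frac {K!}{k!\,(K-k)!}}\cdot {\frac {(N-K)!}{(r-1)!\,(N-K-r+1)!}}\cdot {\frac {(k+r-1)!\,(N-k-r+1)!}{N!}}\cdot {\frac {N-K-r+1}{N-k-r+1}}\\&={\frac {(k+r-1)!}{k!\,(r-1)!}}\cdot {\frac {(N-k-r)!}{(K-k)!\,(N-K-r)!}}\cdot {\frac {K!\,(N-K)!}{N!}}\\&={\frac {{\binom {k+r-1}{k}}{\binom {N-r-k}{K-k}}}{\binom {N}{K}}}.\end{aligned}}$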

Therefore, a random variable $X$ follows the negative hypergeometric distribution if its probability mass function (pmf) is given by

f(k;N,K,r)\equiv \Pr(X=k)={\frac {{{k+r-1} \choose {k}}{{N-r-k} \choose {K-k}}}{N \choose K}}\quad {\text{for }}k=0,1,2,\dotsc ,K

where

$N$ is the population size,
$K$ is the number of success states in the population,
$r$ is the number of failures,
$k$ is the number of observed successes,
$a \choose b$ is a binomial coefficient
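The pmf translates directly into code. The following is a minimal sketch using exact rational arithmetic; the function name `nhg_pmf` and the argument order are assumptions made for this example, and it requires $r\geq 1$ because `math.comb` does not accept negative arguments.

```python
from fractions import Fraction
from math import comb


def nhg_pmf(k, N, K, r):
    """Pr(X = k): probability of k successes before the r-th failure.

    N: population size, K: number of successes in the population,
    r: number of failures at which drawing stops (assumed 1 <= r <= N - K).
    """
    if not (0 <= K <= N and 1 <= r <= N - K):
        raise ValueError("need 0 <= K <= N and 1 <= r <= N - K")
    if k < 0 or k > K:
        return Fraction(0)  # outside the support
    return Fraction(comb(k + r - 1, k) * comb(N - r - k, K - k), comb(N, K))


# Illustrative values only: N = 5, K = 2, r = 2.
for k in range(3):
    print(k, nhg_pmf(k, 5, 2, 2))
```

For these example values the probabilities are 3/10, 2/5 and 3/10 for $k=0,1,2$. Using `Fraction` avoids floating-point rounding, at the cost of speed for large $N$.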

By design the probabilities sum to 1. To show this explicitly:

\sum _{k=0}^{K}\Pr(X=k)=\sum _{k=0}^{K}{\frac {{{k+r-1} \choose {k}}{{N-r-k} \choose {K-k}}}{N \choose K}}={\frac {1}{N \choose K}}\sum _{k=0}^{K}{{k+r-1} \choose {k}}{{N-r-k} \choose {K-k}}={\frac {1}{N \choose K}}{N \choose K}=1,

where we have used the identity (here with $m=r-1$, $n=N-1$ and $k=K$)

{\begin{aligned}\sum _{j=0}^{k}{\binom {j+m}{j}}{\binom {n-m-j}{k-j}}&=\sum _{j=0}^{k}(-1)^{j}{\binom {-m-1}{j}}(-1)^{k-j}{\binom {m+1+k-n-2}{k-j}}\\&=(-1)^{k}\sum _{j=0}^{k}{\binom {-m-1}{j}}{\binom {k-n-2-(-m-1)}{k-j}}\\&=(-1)^{k}{\binom {k-n-2}{k}}\\&=(-1)^{k}{\binom {k-(n+1)-1}{k}}\\&={\binom {n+1}{k}},\end{aligned}}

which can be derived using the binomial identity,

{{n \choose k}=(-1)^{k}{k-n-1 \choose k}},

and the Chu–Vandermonde identity,

\sum _{j=0}^{k}{\binom {m}{j}}{\binom {n-m}{k-j}}={\binom {n}{k}},

which holds for any complex-valued $m$ and $n$ and any non-negative integer $k$.
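The summation identity $\sum _{j=0}^{k}{\binom {j+m}{j}}{\binom {n-m-j}{k-j}}={\binom {n+1}{k}}$ used above can be spot-checked numerically. The sketch below is not a proof; it only tests a small grid of non-negative integers with $n-m\geq k$, the range in which `math.comb` accepts every argument. The function names are illustrative.

```python
from math import comb


def lhs(m, n, k):
    """Left-hand side: sum over j of C(j+m, j) * C(n-m-j, k-j)."""
    return sum(comb(j + m, j) * comb(n - m - j, k - j) for j in range(k + 1))


def rhs(n, k):
    """Right-hand side: C(n+1, k)."""
    return comb(n + 1, k)


# Check all m, k in 0..4 and n from m+k to m+k+4 (so that n - m >= k).
identity_holds = all(
    lhs(m, n, k) == rhs(n, k)
    for m in range(5)
    for k in range(5)
    for n in range(m + k, m + k + 5)
)
print(identity_holds)
```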

Expectation

When counting the number $k$ of successes before $r$ failures, the expected number of successes is ${\frac {rK}{N-K+1}}$ and can be derived as follows.

${\begin{aligned}E[X]&=\sum _{k=0}^{K}k\Pr(X=k)=\sum _{k=0}^{K}k{\frac {{{k+r-1} \choose {k}}{{N-r-k} \choose {K-k}}}{N \choose K}}={\frac {r}{N \choose K}}\left[\sum _{k=0}^{K}{\frac {(k+r)}{r}}{{k+r-1} \choose {r-1}}{{N-r-k} \choose {K-k}}\right]-r\\&={\frac {r}{N \choose K}}\left[\sum _{k=0}^{K}{{k+r} \choose {r}}{{N-r-k} \choose {K-k}}\right]-r={\frac {r}{N \choose K}}\left[\sum _{k=0}^{K}{{k+r} \choose {k}}{{N-r-k} \choose {K-k}}\right]-r\\&={\frac {r}{N \choose K}}\left[{{N+1} \choose K}\right]-r={\frac {rK}{N-K+1}},\end{aligned}}$

where we have used the relationship $\sum _{j=0}^{k}{\binom {j+m}{j}}{\binom {n-m-j}{k-j}}={\binom {n+1}{k}}$ (here with $m=r$ and $n=N$), which was derived above to show that the negative hypergeometric distribution is properly normalized.
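As a quick sanity check of this formula, take the illustrative values $N=5$, $K=2$, $r=2$ (chosen here only as an example). The pmf gives $\Pr(X=0)=3/10$, $\Pr(X=1)=4/10$ and $\Pr(X=2)=3/10$, so the direct sum and the closed form agree:

$E[X]=0\cdot {\frac {3}{10}}+1\cdot {\frac {4}{10}}+2\cdot {\frac {3}{10}}=1,\qquad {\frac {rK}{N-K+1}}={\frac {2\cdot 2}{5-2+1}}=1.$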

Variance

The variance can be derived by the following calculation.

${\begin{aligned}E[X^{2}]&=\sum _{k=0}^{K}k^{2}\Pr(X=k)=\left[\sum _{k=0}^{K}(k+r)(k+r+1)\Pr(X=k)\right]-(2r+1)E[X]-r^{2}-r\\&={\frac {r(r+1)}{N \choose K}}\left[\sum _{k=0}^{K}{{k+r+1} \choose {r+1}}{{N+1-(r+1)-k} \choose {K-k}}\right]-(2r+1)E[X]-r^{2}-r\\&={\frac {r(r+1)}{N \choose K}}\left[{{N+2} \choose K}\right]-(2r+1)E[X]-r^{2}-r={\frac {rK(N-r+Kr+1)}{(N-K+1)(N-K+2)}}\end{aligned}}$

Then the variance is ${\textrm {Var}}[X]=E[X^{2}]-\left(E[X]\right)^{2}={\frac {rK(N+1)(N-K-r+1)}{(N-K+1)^{2}(N-K+2)}}$
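The closed forms for $E[X]$, $E[X^{2}]$ and ${\textrm {Var}}[X]$ can be compared against moments computed directly from the pmf. This is a minimal sketch under the assumption $1\leq r\leq N-K$; the function names `moments` and `closed_forms` are hypothetical, and the parameter values are arbitrary.

```python
from fractions import Fraction
from math import comb


def nhg_pmf(k, N, K, r):
    """Negative hypergeometric pmf (assumes 1 <= r <= N - K and 0 <= k <= K)."""
    return Fraction(comb(k + r - 1, k) * comb(N - r - k, K - k), comb(N, K))


def moments(N, K, r):
    """Mean, second moment and variance computed by summing over the support."""
    probs = [nhg_pmf(k, N, K, r) for k in range(K + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    second = sum(k * k * p for k, p in enumerate(probs))
    return mean, second, second - mean ** 2


def closed_forms(N, K, r):
    """The same three quantities from the formulas derived in the text."""
    a = N - K + 1
    mean = Fraction(r * K, a)
    second = Fraction(r * K * (N - r + K * r + 1), a * (a + 1))
    var = Fraction(r * K * (N + 1) * (N - K - r + 1), a ** 2 * (a + 1))
    return mean, second, var


# Arbitrary example parameters.
print(moments(20, 7, 4) == closed_forms(20, 7, 4))
```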

Related distributions

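If the drawing stops after a fixed number $n$ of draws (regardless of the number of failures), then the number of successes has the hypergeometric distribution, $HG_{N,K,n}(k)$. The cumulative distribution functions of the two are related in the following way, where $NHG_{N,K,r}(k)$ here denotes $\Pr(X\leq k)$ and $HG_{N,N-K,k+r}(r-1)$ denotes the probability of at most $r-1$ failures among the first $k+r$ draws:^[1]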

NHG_{N,K,r}(k)=1-HG_{N,N-K,k+r}(r-1)
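The relation holds when both sides are read as cumulative probabilities: at most $k$ successes occur before the $r$-th failure exactly when the first $k+r$ draws contain at least $r$ failures. The sketch below checks this reading numerically; the function names are illustrative, and $r\geq 1$ is assumed.

```python
from fractions import Fraction
from math import comb


def nhg_cdf(k, N, K, r):
    """Pr(X <= k) for the negative hypergeometric distribution."""
    return sum(
        Fraction(comb(j + r - 1, j) * comb(N - r - j, K - j), comb(N, K))
        for j in range(k + 1)
    )


def hg_cdf(x, N, K, n):
    """Pr(at most x marked elements in n draws) when K of the N elements are marked."""
    return sum(
        Fraction(comb(K, i) * comb(N - K, n - i), comb(N, n))
        for i in range(x + 1)
    )


# Arbitrary example parameters; the failures (N - K of them) play the
# role of the marked elements in the hypergeometric term.
N, K, r = 12, 5, 3
relation_holds = all(
    nhg_cdf(k, N, K, r) == 1 - hg_cdf(r - 1, N, N - K, k + r)
    for k in range(K + 1)
)
print(relation_holds)
```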

The negative hypergeometric distribution (like the hypergeometric distribution) deals with draws without replacement, so that the probability of success is different in each draw. In contrast, the negative binomial distribution (like the binomial distribution) deals with draws with replacement, so that the probability of success is the same in each draw and the trials are independent. The following table summarizes the four distributions related to drawing items:

	With replacement	Without replacement
Number of successes in a fixed number of draws	binomial distribution	hypergeometric distribution
Number of successes before a fixed number of failures	negative binomial distribution	negative hypergeometric distribution

Some authors^[3]^[4] define the negative hypergeometric distribution to be the number of draws required to get the $r$-th failure. If we let $Y$ denote this number, then $Y=X+r$, where $X$ is as defined above. Hence the PMF of $Y$ is

\Pr(Y=y)={\binom {y-1}{r-1}}{\frac {\binom {N-y}{N-K-r}}{\binom {N}{N-K}}}.

If we denote the number of failures $N-K$ by $M$, this becomes

\Pr(Y=y)={\binom {y-1}{r-1}}{\frac {\binom {N-y}{M-r}}{\binom {N}{M}}}.

The support of $Y$ is the set $\{r,r+1,\dots ,N-M+r\}$. Since $Y$ is $X$ shifted by the constant $r$, it follows that

E[Y]=E[X]+r={\frac {r(N+1)}{M+1}}

and ${\textrm {Var}}[X]={\textrm {Var}}[Y]$ .
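The shift between the two conventions can be checked numerically. The sketch below assumes $1\leq r\leq M$; the function names `x_pmf` and `y_pmf` are illustrative, and the parameter values are arbitrary examples.

```python
from fractions import Fraction
from math import comb


def x_pmf(k, N, K, r):
    """Pr(X = k): successes before the r-th failure."""
    return Fraction(comb(k + r - 1, k) * comb(N - r - k, K - k), comb(N, K))


def y_pmf(y, N, M, r):
    """Pr(Y = y): total draws needed to reach the r-th failure, with M failures in N."""
    return Fraction(comb(y - 1, r - 1) * comb(N - y, M - r), comb(N, M))


# Arbitrary example parameters.
N, K, r = 12, 5, 3
M = N - K
support_y = range(r, N - M + r + 1)

# Y = X + r, so the two pmfs agree after shifting by r.
same_pmf = all(y_pmf(y, N, M, r) == x_pmf(y - r, N, K, r) for y in support_y)

mean_y = sum(y * y_pmf(y, N, M, r) for y in support_y)
var_y = sum(y * y * y_pmf(y, N, M, r) for y in support_y) - mean_y ** 2
mean_x = sum(k * x_pmf(k, N, K, r) for k in range(K + 1))
var_x = sum(k * k * x_pmf(k, N, K, r) for k in range(K + 1)) - mean_x ** 2

print(same_pmf, mean_y == Fraction(r * (N + 1), M + 1), var_y == var_x)
```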

References

^[1] Negative hypergeometric distribution. Encyclopedia of Mathematics.
^[2] Johnson, Norman L.; Kemp, Adrienne W.; Kotz, Samuel (2005). Univariate Discrete Distributions. Wiley. ISBN 0-471-27246-9. §6.2.2 (pp. 253–254).
^[3] Rohatgi, Vijay K.; Saleh, A. K. Md. Ehsanes (2015). An Introduction to Probability and Statistics. John Wiley & Sons.
^[4] Khan, R. A. (1994). A note on the generating function of a negative hypergeometric distribution. Sankhya: The Indian Journal of Statistics B, 56(3), 309–313.
