Theory Behind Pareto/NBD Part 1

Jan 31, 2021

1 Introduction
2 PNBD Model Assumptions
3 Why named Pareto/NBD?
- 3.1 Poisson Gamma Mixture
- 3.2 Exponential Gamma Mixture
4 Reference

1 Introduction

The Pareto/NBD model was developed by Schmittlein et al. (1987) to describe repeat-buying behavior in a noncontractual setting.

The model is built to answer four key questions:

How many “alive” customers does the firm now have?
How has this customer base grown over the past year?
Which individuals on this list most likely represent active customers? Inactive customers?
What level of transactions should be expected next year by those on the list, both individually and collectively?

To answer these questions, we need to build a model that estimates two quantities for each customer:

What is $P(\text{alive} \mid \text{her transaction information})$?
What is $E(\text{number of transactions} \mid \text{her transaction information})$?

2 PNBD Model Assumptions

2.1 Two stages in the lifetime

Customers go through 2 stages in their “lifetime”: they are “alive” for some period of time, then become permanently inactive.

2.2 Poisson Purchase

While a customer is alive, the number of transactions she makes follows a Poisson distribution with parameter $λ$, called the transaction rate. The probability of observing $x$ transactions in the time interval $(0, t]$ is given by:

$P(X(t) = x \mid λ) = e^{-λ t} \frac{(λ t)^{x}}{x!}, \quad x = 0, 1, 2, \dots$

This is equivalent to assuming that the time between transactions is exponentially distributed with rate $λ$, that is, $\text{Exp}(λ)$:

$f(t_{j} - t_{j-1} \mid λ) = λ e^{-λ (t_{j} - t_{j-1})}, \quad t_{j} > t_{j-1} > 0,$

where $t_{j}$ is the time of the jth purchase.
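The equivalence between exponential gaps and Poisson counts can be checked with a small simulation. The sketch below is only an illustration: the values of `lam` and `t`, the seed, and the helper names `count_purchases` and `poisson_pmf` are assumptions made for this example, not part of the original model description.

```python
import math
import random

def count_purchases(lam, t, rng):
    """Count purchases in (0, t] when gaps between purchases are Exp(lam)."""
    clock, count = 0.0, 0
    while True:
        clock += rng.expovariate(lam)  # draw the next inter-purchase time
        if clock > t:
            return count
        count += 1

def poisson_pmf(x, lam, t):
    """P(X(t) = x | lam) from the Poisson formula of Section 2.2."""
    return math.exp(-lam * t) * (lam * t) ** x / math.factorial(x)

rng = random.Random(42)
lam, t, n = 1.5, 2.0, 100_000  # hypothetical values chosen for illustration
counts = [count_purchases(lam, t, rng) for _ in range(n)]

# Compare the simulated share of customers with x purchases to the Poisson pmf.
for x in range(6):
    empirical = counts.count(x) / n
    print(x, round(empirical, 4), round(poisson_pmf(x, lam, t), 4))
```

With enough simulated customers the two printed columns should agree closely, which is what the Poisson/exponential equivalence predicts.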

2.3 Exponential Lifetime

A customer’s unobserved “lifetime” has length $τ$, which is exponentially distributed: $τ \sim \text{Exp}(μ)$, with density $f(τ \mid μ) = μ e^{-μ τ}$, where $μ$ is called the dropout rate.

2.4 Gamma transaction rate

Heterogeneity in transaction rates across customers follows a gamma distribution with shape parameter $r$ and scale parameter $α$ (in the density below, $α$ multiplies $λ$ in the exponent, so it acts as a rate, or inverse scale):

$λ \sim \text{Gamma}(r, α), \quad g(λ \mid r, α) = \frac{α^{r} λ^{r-1} e^{-λ α}}{Γ(r)}.$
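Because $α$ sits in the exponent of the density above as a multiplier of $λ$, it plays the role of a rate, and software that expects a scale needs $1/α$ instead. The following sketch shows this with Python's standard library, where `random.gammavariate(shape, scale)` takes a scale. The values of `r` and `alpha` are hypothetical and chosen only for illustration; the theoretical mean $r/α$ and variance $r/α^{2}$ are the standard moments of a gamma distribution with rate $α$.

```python
import random

r, alpha = 2.0, 4.0  # hypothetical values, not estimates from any data set
rng = random.Random(0)

# random.gammavariate(shape, scale) expects a scale, so pass 1 / alpha.
rates = [rng.gammavariate(r, 1.0 / alpha) for _ in range(100_000)]

mean = sum(rates) / len(rates)
var = sum((x - mean) ** 2 for x in rates) / (len(rates) - 1)

print("sample mean    ", round(mean, 3), "| theory r/alpha   ", r / alpha)
print("sample variance", round(var, 3), "| theory r/alpha**2", r / alpha ** 2)
```

Passing `alpha` directly as the second argument would silently produce transaction rates that are far too large, so this conversion is worth checking whenever the model is simulated or fitted.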

2.5 Gamma dropout rate

Heterogeneity in dropout rates across customers follows a gamma distribution with shape parameter $s$ and scale parameter $β$ (here $β$ multiplies $μ$ in the exponent, so it acts as a rate):

$μ \sim \text{Gamma}(s, β), \quad g(μ \mid s, β) = \frac{β^{s} μ^{s-1} e^{-μ β}}{Γ(s)}.$

2.6 Two processes are Independent

The transaction rate $λ$ and the dropout rate $μ$ vary independently across customers,

$λ ⊥ μ .$
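Taken together, the six assumptions describe a simple generative process for one customer. The sketch below simulates it, assuming an observation window $(0, T]$ and hypothetical parameter values; the function name `simulate_customer`, the dictionary keys, and the numbers are illustrative choices, not something prescribed by the model.

```python
import random

def simulate_customer(r, alpha, s, beta, T, rng):
    """Simulate one customer's purchase times in (0, T] under Sections 2.1-2.6."""
    lam = rng.gammavariate(r, 1.0 / alpha)  # 2.4: transaction rate (scale = 1/alpha)
    mu = rng.gammavariate(s, 1.0 / beta)    # 2.5: dropout rate, drawn independently (2.6)
    tau = rng.expovariate(mu)               # 2.3: unobserved lifetime
    horizon = min(tau, T)                   # 2.1: no purchases after dropout
    purchases = []
    clock = rng.expovariate(lam)            # 2.2: exponential gaps between purchases
    while clock <= horizon:
        purchases.append(clock)
        clock += rng.expovariate(lam)
    return {"purchases": purchases, "alive_at_T": tau > T}

rng = random.Random(7)
r, alpha, s, beta, T = 2.0, 4.0, 2.0, 20.0, 10.0  # hypothetical values
cohort = [simulate_customer(r, alpha, s, beta, T, rng) for _ in range(20_000)]

share_alive = sum(c["alive_at_T"] for c in cohort) / len(cohort)
mean_purchases = sum(len(c["purchases"]) for c in cohort) / len(cohort)
print("share alive at T:", round(share_alive, 3))
print("mean purchases per customer:", round(mean_purchases, 3))
```

Note that `alive_at_T` is available here only because the data are simulated. In real data the lifetime is unobserved, which is why the model has to infer the probability of being alive from the transaction history.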

3 Why named Pareto/NBD?

Short answer:

$\text{Poisson purchase} + \text{Gamma transaction rate} \implies \text{NegBin}$

$\text{Exponential lifetime} + \text{Gamma dropout rate} \implies \text{Pareto}$

3.1 Poisson Gamma Mixture

Theorem 3.1 If we assume the Poisson purchase and the Gamma transaction rate, then the distribution of the number of transactions while the customer is alive is Negative Binomial (NBD).

Proof. $\begin{aligned} P(X(t) = x \mid r, α) & = \int_{0}^{\infty} P(X(t) = x \mid λ) \, g(λ \mid r, α) \, dλ \\ & = \int_{0}^{\infty} e^{-λ t} \frac{(λ t)^{x}}{x!} \frac{λ^{r-1} α^{r} e^{-λ α}}{Γ(r)} \, dλ \\ & = \frac{α^{r}}{Γ(r)} \frac{t^{x}}{x!} \int_{0}^{\infty} λ^{x+r-1} e^{-λ (t + α)} \, dλ, \quad \text{let } u = (t + α) λ, \\ & = \frac{α^{r}}{Γ(r)} \frac{t^{x}}{x!} \frac{1}{(t + α)^{x+r}} \int_{0}^{\infty} u^{x+r-1} e^{-u} \, du, \quad \text{note the form of } Γ(\cdot), \\ & = \frac{α^{r}}{Γ(r)} \frac{t^{x}}{x!} \frac{1}{(t + α)^{x+r}} Γ(x + r) \\ & = \frac{Γ(r + x)}{Γ(r) \, x!} {\left(\frac{α}{α + t}\right)}^{r} {\left(\frac{t}{α + t}\right)}^{x} \end{aligned}$

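Note that the last line is the probability mass function of a negative binomial distribution with parameters $r$ and $p = \frac{α}{α + t}$. It looks a little different from the familiar version of the NegBin because the parameter $r$ is allowed to take any value in $\mathbb{R}^{+}$ rather than only positive integers. In this real-valued case the negative binomial is also called the Pólya distribution.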
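Theorem 3.1 can also be checked numerically by comparing the closed-form NBD probability with a direct numerical integration of the Poisson-gamma mixture. This is a minimal sketch under assumed values of `r`, `alpha`, and `t`; the function names, the midpoint rule, and the truncation point `upper` are choices made for this illustration only.

```python
import math

def nbd_pmf(x, t, r, alpha):
    """Closed-form NBD probability from the last line of the proof."""
    log_p = (math.lgamma(r + x) - math.lgamma(r) - math.lgamma(x + 1)
             + r * math.log(alpha / (alpha + t))
             + x * math.log(t / (alpha + t)))
    return math.exp(log_p)

def mixture_pmf(x, t, r, alpha, upper=10.0, steps=20_000):
    """Midpoint-rule approximation of the Poisson-gamma integral on (0, upper)."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        poisson = math.exp(-lam * t) * (lam * t) ** x / math.factorial(x)
        gamma_density = (alpha ** r * lam ** (r - 1) * math.exp(-lam * alpha)
                         / math.gamma(r))
        total += poisson * gamma_density
    return total * h

r, alpha, t = 2.5, 4.0, 2.0  # hypothetical values; r is deliberately not an integer
for x in range(6):
    print(x, round(nbd_pmf(x, t, r, alpha), 6), round(mixture_pmf(x, t, r, alpha), 6))
```

The two columns should match to several decimal places. The closed form is computed on the log scale with `math.lgamma` so that it stays stable for large $x$ or $r$.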

3.2 Exponential Gamma Mixture

Theorem 3.2 If we assume the Exponential lifetime and the Gamma dropout rate, then the distribution of the “lifetime” is “Pareto distribution of the second kind”.

Proof. $\begin{aligned} f(τ \mid s, β) & = \int_{0}^{\infty} f(τ \mid μ) \, g(μ \mid s, β) \, dμ \\ & = \int_{0}^{\infty} μ e^{-μ τ} \frac{β e^{-β μ} (β μ)^{s-1}}{Γ(s)} \, dμ \\ & = \frac{β^{s}}{Γ(s)} \int_{0}^{\infty} μ^{s} e^{-μ (τ + β)} \, dμ, \quad \text{let } u = μ (τ + β), \\ & = \frac{β^{s}}{Γ(s)} \int_{0}^{\infty} \frac{1}{(τ + β)^{s}} u^{s} e^{-u} \frac{1}{τ + β} \, du \\ & = \frac{β^{s}}{Γ(s)} \frac{1}{(τ + β)^{s+1}} \int_{0}^{\infty} u^{s} e^{-u} \, du \\ & = \frac{β^{s}}{Γ(s)} \frac{1}{(τ + β)^{s+1}} Γ(s + 1) \\ & = \frac{β^{s}}{Γ(s)} \frac{1}{(τ + β)^{s+1}} s \, Γ(s) \\ & = \frac{s}{β} {\left(\frac{β}{τ + β}\right)}^{s+1} \end{aligned}$

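Note that the corresponding cumulative distribution function is

$\begin{aligned} F(τ \mid s, β) & = \int_{0}^{\infty} F(τ \mid μ) \, g(μ \mid s, β) \, dμ \\ & = \int_{0}^{\infty} \left(1 - e^{-μ τ}\right) \frac{β^{s} μ^{s-1} e^{-β μ}}{Γ(s)} \, dμ \\ & = 1 - {\left(\frac{β}{β + τ}\right)}^{s} \end{aligned}$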

Therefore, if we assume an Exponential lifetime and a Gamma dropout rate, we end up with a Pareto Type II distribution, or more specifically, a Lomax distribution.
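One practical consequence of Theorem 3.2 is that lifetimes can be drawn in two equivalent ways: by the two-stage exponential-gamma recipe, or directly by inverting the closed-form CDF $F(τ \mid s, β) = 1 - (β / (β + τ))^{s}$. The sketch below compares both against that CDF; the parameter values and function names are hypothetical and serve only as an illustration.

```python
import random

def lomax_cdf(tau, s, beta):
    """Closed-form lifetime CDF: 1 - (beta / (beta + tau)) ** s."""
    return 1.0 - (beta / (beta + tau)) ** s

def sample_two_stage(s, beta, rng):
    """Draw mu ~ Gamma(s, rate beta), then tau ~ Exp(mu)."""
    mu = rng.gammavariate(s, 1.0 / beta)  # gammavariate takes a scale, hence 1/beta
    return rng.expovariate(mu)

def sample_inverse_cdf(s, beta, rng):
    """Solve F(tau) = u for tau, with u uniform on [0, 1)."""
    u = rng.random()
    return beta * ((1.0 - u) ** (-1.0 / s) - 1.0)

rng = random.Random(123)
s, beta, n = 2.0, 20.0, 50_000  # hypothetical values
two_stage = [sample_two_stage(s, beta, rng) for _ in range(n)]
inverse = [sample_inverse_cdf(s, beta, rng) for _ in range(n)]

# Compare empirical CDFs of both samplers with the closed form at a few points.
for tau in (5.0, 10.0, 40.0):
    share_two_stage = sum(v <= tau for v in two_stage) / n
    share_inverse = sum(v <= tau for v in inverse) / n
    print(tau, round(share_two_stage, 3), round(share_inverse, 3),
          round(lomax_cdf(tau, s, beta), 3))
```

Both samplers should track the closed-form CDF closely, which is a quick sanity check on the derivation before using the Lomax form in later calculations.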

In conclusion, the NBD and Pareto labels of the two sub-models naturally lead to the name of the integrated model.

In the next post we will talk about the likelihood, the mean of the Pareto/NBD model, and other related derivations, e.g., the probability of the customer being alive.

4 Reference

Schmittlein DC, Morrison DG, Colombo R (1987). “Counting Your Customers: Who Are They and What Will They Do Next?” Management Science, 33(1), 1-24.
Fader PS, Hardie BGS (2005). “A Note on Deriving the Pareto/NBD Model and Related Expressions.”
Fader PS, Hardie BGS (2007). “Incorporating Time-Invariant Covariates into the Pareto/NBD and BG/NBD Models.”
Fader PS, Hardie BGS (2020). “Deriving an Expression for P(X(t)=x) Under the Pareto/NBD Model.”


Chen Xing

Founder & Data Scientist
