Gamma-Gamma Spend Model

Jan 30, 2021

1 Model Assumption
2 Compute $E (Z ∣ \bar{z}, x)$
- 2.1 Derive related conditional density
- 2.2 Get the result
3 Understand the result
- 3.1 Key point
4 References

How can we predict a customer’s mean spend per transaction in the future?

Answer: You can use the gamma-gamma model.

1 Model Assumption

There are 3 general assumptions for this model:

The monetary value (e.g. $, ¥) of a customer’s given transaction varies randomly around their average transaction value.
Average transaction values vary across customers but do not vary over time for any given individual.
The distribution of average transaction values across customers is independent of the transaction process.

For a customer with $x$ transactions:

Let $\{z_{1}, z_{2}, \dots, z_{x}\}$ denote the value of each transaction.
The $z_{i}$ are samples from the distribution of the random variable $Z$.

The customer’s observed average transaction value is $\bar{z} = \sum_{i = 1}^{x} \frac{z_{i}}{x}$ .

However, $\bar{z}$ is an imperfect estimate of the unobserved mean transaction value.

Why? If a customer has made only one or two purchases, their average spend is a questionable estimate of their spending power, so we should lean on the population mean for help. On the other hand, if the customer has a long purchase history, we want to put more weight on their own average spend and relatively less on the population mean.

The goal is to make inference about

$E (Z | \bar{z}, x)$

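What is the distribution of $Z$?

Maybe log-normal or gamma, since spend data tend to be right-skewed.

Here we assume the gamma distribution. More formally, we assume that:

$Z \sim \text{Gamma} (p, ν)$
$ν \sim \text{Gamma} (q, γ)$

Here the first parameter is the shape and the second is the rate, which is what gives the mean $\frac{p}{ν}$ below.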

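From this setup, and assuming a customer’s transactions are independent given $ν$, we know

$E (Z \mid p, ν) := ζ = \frac{p}{ν}$

$\text{total spend} = \sum_{i = 1}^{x} z_{i} \sim \text{Gamma} (p x, ν)$

$\text{average spend} = \bar{z} = \sum_{i = 1}^{x} \frac{z_{i}}{x} \sim \text{Gamma} (p x, ν x)$

Both distributional results can be proved using moment generating functions (MGFs).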
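As a sketch of that argument, assume the transactions are independent given $ν$ and that the second gamma parameter is a rate (consistent with $E (Z \mid p, ν) = \frac{p}{ν}$). Then, for $t < ν$:

$\begin{aligned} M_{Z} (t) & = E (e^{t Z} \mid p, ν) = \left(1 - \frac{t}{ν}\right)^{- p} \\ M_{\sum z_{i}} (t) & = \prod_{i = 1}^{x} M_{Z} (t) = \left(1 - \frac{t}{ν}\right)^{- p x} \\ M_{\bar{z}} (t) & = M_{\sum z_{i}} \left(\frac{t}{x}\right) = \left(1 - \frac{t}{ν x}\right)^{- p x} \end{aligned}$

The last two expressions are the MGFs of a gamma distribution with parameters $(p x, ν)$ and $(p x, ν x)$ respectively.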

This results in what we call the gamma-gamma model of spend.

2 Compute $E (Z ∣ \bar{z}, x)$

We wish to make inferences about an individual customer’s mean spending given $\bar{z}$, which we denote as $E (Z \mid \bar{z}, x)$.

Note that

$E (Z \mid \bar{z}, x) = E (Z \mid p, q, γ; \bar{z}, x) = \int_{0}^{\infty} E (Z \mid p, ν) \, g (ν \mid p, q, γ, \bar{z}, x) \, d ν \qquad (2.1)$

where $Z \sim \text{Gamma} (p, ν) \implies E (Z \mid p, ν) = \frac{p}{ν}$, and $g (ν \mid p, q, γ, \bar{z}, x)$ is the conditional (posterior) density of $ν$ given the customer’s data.
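The conditional density of $ν$ in (2.1) follows from Bayes’ rule. A sketch of the derivation, assuming the rate parameterization used above and dropping every factor that does not depend on $ν$:

$\begin{aligned} g (ν \mid p, q, γ, \bar{z}, x) & \propto f (\bar{z} \mid p, ν, x) \, g (ν \mid q, γ) \\ & \propto (ν x)^{p x} e^{- ν x \bar{z}} \cdot ν^{q - 1} e^{- γ ν} \\ & \propto ν^{p x + q - 1} e^{- ν (γ + x \bar{z})} \end{aligned}$

This is the kernel of a gamma density with shape $p x + q$ and rate $γ + x \bar{z}$, so $ν \mid p, q, γ, \bar{z}, x \sim \text{Gamma} (p x + q, γ + x \bar{z})$.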

2.2 Get the result

Now we have everything we need to evaluate (2.1). Before plugging in, let’s review the inverse gamma distribution, which will be used later.

If $X \sim \text{Gamma} (α, β)$ with shape $α$ and rate $β$, then $Y := \frac{1}{X} \sim \text{Inv-Gamma} (α, β)$ with shape $α$ and scale $β$, and

$E (Y) = \frac{β}{α - 1}$, for $α > 1$

$\text{Var} (Y) = \frac{β^{2}}{(α - 1)^{2} (α - 2)}$, for $α > 2$.
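As a quick sanity check on the mean formula, the sketch below draws gamma samples with Python’s standard `random` module and compares the sample mean of their reciprocals with $\frac{β}{α - 1}$. The parameter values, seed, and sample size are arbitrary choices for illustration. Note that `random.gammavariate` takes a scale parameter, so a rate of $β$ is passed as `1 / beta`.

```python
import random


def inv_gamma_mean(alpha, beta):
    """Theoretical mean of Inv-Gamma(shape=alpha, scale=beta); needs alpha > 1."""
    if alpha <= 1:
        raise ValueError("the mean is undefined for alpha <= 1")
    return beta / (alpha - 1)


def simulated_inv_gamma_mean(alpha, beta, n=200_000, seed=0):
    """Monte Carlo estimate of E(1/X) for X ~ Gamma(shape=alpha, rate=beta)."""
    rng = random.Random(seed)
    # gammavariate(shape, scale): a rate of beta corresponds to a scale of 1/beta
    total = sum(1.0 / rng.gammavariate(alpha, 1.0 / beta) for _ in range(n))
    return total / n


if __name__ == "__main__":
    alpha, beta = 5.0, 2.0  # illustrative values only, not from the article
    print("theory:    ", inv_gamma_mean(alpha, beta))
    print("simulation:", simulated_inv_gamma_mean(alpha, beta))
```

The two printed numbers should agree up to Monte Carlo noise; the agreement gets worse as $α$ approaches 2, where the variance of $Y$ stops being finite.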

Plug into (2.1):

$\begin{aligned} E (Z \mid p, q, γ; \bar{z}, x) & = \int_{0}^{\infty} E (Z \mid p, ν) \, g (ν \mid p, q, γ, \bar{z}, x) \, d ν \\ & = \int_{0}^{\infty} \frac{p}{ν} \, g (ν \mid p, q, γ, \bar{z}, x) \, d ν \\ & = p \cdot E \left(\frac{1}{ν} \mid p, q, γ, \bar{z}, x\right) \\ & = \frac{p (γ + x \bar{z})}{p x + q - 1} \qquad (2.6) \end{aligned}$

The last step uses $\frac{1}{ν} \mid p, q, γ, \bar{z}, x \sim \text{Inv-Gamma} (p x + q, γ + x \bar{z})$ together with $E (\text{Inv-Gamma}) = \frac{\text{scale}}{\text{shape} - 1}$, which requires $p x + q > 1$.
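To make (2.6) concrete, here is a minimal Python sketch of the formula. The function name `expected_spend` and the parameter values are hypothetical and chosen only for illustration; in practice $p$, $q$, and $γ$ would be estimated from data.

```python
def expected_spend(p, q, gamma, zbar, x):
    """Conditional expectation E(Z | p, q, gamma; zbar, x) from (2.6).

    p, q, gamma: model parameters (assumed already estimated elsewhere)
    zbar:        the customer's observed average transaction value
    x:           the customer's number of observed transactions
    """
    denom = p * x + q - 1
    if denom <= 0:
        raise ValueError("the expectation requires p * x + q > 1")
    return p * (gamma + x * zbar) / denom


if __name__ == "__main__":
    # Hypothetical parameter values, not taken from any fitted model
    p, q, gamma = 6.0, 4.0, 15.0
    print(expected_spend(p, q, gamma, zbar=50.0, x=2))  # 46.0
    print(expected_spend(p, q, gamma, zbar=0.0, x=0))   # 30.0
```

With these values, a customer with two purchases averaging 50 gets an estimate of 46, while a customer with no purchase history gets 30, which is $\frac{p γ}{q - 1}$.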

3 Understand the result

How should we interpret $E (Z \mid p, q, γ; \bar{z}, x)$? What can we say about the estimate of an individual’s future spend per transaction given their historical average spend $\bar{z}$?

First, let’s derive $E (Z \mid p, q, γ)$, which is the population mean.

$\begin{aligned} E (Z \mid p, q, γ) & = \int_{0}^{\infty} E (Z \mid p, ν) \, g (ν \mid q, γ) \, d ν \\ & = \int_{0}^{\infty} \frac{p}{ν} \, g (ν \mid q, γ) \, d ν \\ & = p E \left(\frac{1}{ν}\right), \quad \text{where } \frac{1}{ν} \sim \text{Inv-Gamma} (q, γ) \\ & = \frac{p γ}{q - 1}, \quad \text{for } q > 1 \end{aligned}$

Now let’s rearrange the result of (2.6),

$\begin{aligned} E (Z \mid p, q, γ; \bar{z}, x) & = \frac{p (γ + x \bar{z})}{p x + q - 1} \\ & = \left(\frac{q - 1}{p x + q - 1}\right) \underbrace{\frac{p γ}{q - 1}}_{\text{population mean}} + \left(\frac{p x}{p x + q - 1}\right) \underbrace{\bar{z}}_{\text{observed average}} \end{aligned}$

3.1 Key point

We note that this is a weighted average of the population mean, $E (Z \mid p, q, γ)$, and the given individual’s observed average transaction value, $\bar{z}$.

$E (Z \mid \bar{z}, x) = w \cdot \text{population mean} + (1 - w) \cdot \text{observed average}, \quad \text{where } w = \frac{q - 1}{p x + q - 1}$

As the number of observations $(x)$ used to compute $\bar{z}$ increases, less weight is placed on the population mean and more weight is placed on the customer’s observed average.
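The shrinkage behaviour can be seen numerically. The sketch below computes the weight on the population mean, $\frac{q - 1}{p x + q - 1}$, and the resulting estimate for a few values of $x$. The helper names and parameter values are hypothetical, and the code assumes $q > 1$ so that the population mean exists.

```python
def population_weight(p, q, x):
    """Weight placed on the population mean: (q - 1) / (p*x + q - 1)."""
    return (q - 1) / (p * x + q - 1)


def shrunk_estimate(p, q, gamma, zbar, x):
    """Weighted average of the population mean and the observed average."""
    w = population_weight(p, q, x)
    population_mean = p * gamma / (q - 1)  # assumes q > 1
    return w * population_mean + (1 - w) * zbar


if __name__ == "__main__":
    # Hypothetical parameter values, not taken from any fitted model
    p, q, gamma = 6.0, 4.0, 15.0
    zbar = 50.0  # observed average, held fixed while x varies
    for x in (1, 2, 10, 100):
        w = population_weight(p, q, x)
        estimate = shrunk_estimate(p, q, gamma, zbar, x)
        print(f"x={x:>3}  weight on population mean={w:.3f}  estimate={estimate:.2f}")
```

With these values the population mean is 30, so the estimate starts closer to 30 for small $x$ and moves toward the observed average of 50 as $x$ grows.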

4 References

[1] Fader PS, Hardie BG (2013). “The Gamma-Gamma Model of Monetary Value.” URL http://www.brucehardie.com/notes/025/gamma_gamma.pdf.

[2] Colombo R, Jiang W (1999). “A stochastic RFM model.” Journal of Interactive Marketing, 13(3), 2-12.

Chen Xing
