Gamma-Gamma Spend Model

How can we predict a customer’s mean future spend per transaction?

Answer: use the gamma-gamma model.

1 Model Assumption

The model rests on three general assumptions:

  1. The monetary value (e.g. $, ¥) of a customer’s given transaction varies randomly around their average transaction value.

  2. Average transaction values vary across customers but do not vary over time for any given individual.

  3. The distribution of average transaction values across customers is independent of the transaction process.

For a customer with \(x\) transactions:

  • let \(\{z_1, z_2, \cdots, z_x\}\) denote the value of each transaction.

  • each \(z_i\) is a draw from the distribution of the random variable \(Z\).

The customer’s observed average transaction value is \(\bar{z} = \sum_{i = 1}^{x}\frac{z_i}{x}\).

However, \(\bar{z}\) is an imperfect estimate of the unobserved mean transaction value.

Why? Consider a customer with very few transactions, say one or two purchases. Their average spend alone is a questionable estimate of their spending power, so we should lean on the population mean as a reference. Conversely, when a customer has a long purchase history, we want to put more weight on their own average spend and relatively less on the population mean.

The goal is to make inference about

\[ E(Z| \bar{z}, x) \]

What distribution should we assume for \(Z\)?

Log-normal and gamma are natural candidates, since spend data tend to be right-skewed.

Here we assume the gamma distribution. More formally, we assume that:

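  1. \(Z \sim gamma(p, \nu)\), with shape \(p\) and rate \(\nu\)

  2. \(\nu \sim gamma(q, \gamma)\), with shape \(q\) and rate \(\gamma\); \(\nu\) varies across customers while \(p\) is common to all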

From this setup, for a given customer (i.e. conditional on \(\nu\)), we know

\[ E(Z|p, \nu) := \zeta = \frac{p}{\nu} \]

\[ \text{total spend} = \sum_{i = 1}^{x}z_i \sim gamma(px, \nu) \]

\[ \text{average spend} = \bar{z} = \sum_{i = 1}^{x}\frac{z_i}{x} \sim gamma(px, \nu x) \]

Both results follow easily from the moment generating function (MGF).
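As a brief sketch of that MGF argument, assuming (as the mean \(p/\nu\) implies) that \(\nu\) is a rate parameter and that the \(z_i\) are independent given \(\nu\):

\[ \begin{aligned} M_Z(t) &= \left(1 - \frac{t}{\nu}\right)^{-p}, \quad t < \nu \\ M_{\sum z_i}(t) &= \prod_{i=1}^{x} M_Z(t) = \left(1 - \frac{t}{\nu}\right)^{-px} \\ M_{\bar{z}}(t) &= M_{\sum z_i}\left(\frac{t}{x}\right) = \left(1 - \frac{t}{\nu x}\right)^{-px} \end{aligned} \]

The second line is the MGF of \(gamma(px, \nu)\) and the third is the MGF of \(gamma(px, \nu x)\).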

This results in what we call the gamma-gamma model of spend.

2 Compute \(E(Z \mid \bar{z}, x)\)

We wish to make inferences about an individual customer’s mean spending given \(\bar{z}\), which we denote as \(E(Z|\bar{z}, x)\).

Note that,

\[ \begin{equation} E(Z \mid \bar{z}, x) = E(Z \mid p, q, \gamma ; \bar{z}, x) =\int_{0}^{\infty }E(Z\mid p, \nu)g(\nu \mid p, q, \gamma, \bar{z},x )d\nu \tag{2.1} \end{equation} \]

\[ Z \sim gamma(p, \nu) \implies E(Z \mid p, \nu) = \frac{p}{\nu} \]
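The remaining ingredient in (2.1) is the posterior density of \(\nu\). A sketch via Bayes’ theorem, assuming the rate parameterisation used above, combines the \(gamma(px, \nu x)\) density of \(\bar{z}\) with the \(gamma(q, \gamma)\) prior on \(\nu\) and drops every factor that does not depend on \(\nu\):

\[ \begin{aligned} g(\nu \mid p, q, \gamma, \bar{z}, x) &\propto f(\bar{z} \mid p, \nu, x)\, g(\nu \mid q, \gamma) \\ &\propto \nu^{px} e^{-\nu x \bar{z}} \cdot \nu^{q-1} e^{-\gamma \nu} \\ &= \nu^{px+q-1} e^{-\nu(\gamma + x\bar{z})} \end{aligned} \]

This is the kernel of a gamma distribution with shape \(px+q\) and rate \(\gamma + x\bar{z}\), so \(\nu \mid p, q, \gamma, \bar{z}, x \sim gamma(px+q, \gamma + x\bar{z})\).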

2.2 Get the result

Now we have everything we need to evaluate (2.1). Before plugging in, let’s review the inverse gamma distribution, which will be used below.

If \(X \sim Gamma(\alpha, \beta)\) with shape \(\alpha\) and rate \(\beta\), then \(Y :=\frac{1}{X} \sim \text{Inv-Gamma}(\alpha, \beta)\) with shape \(\alpha\) and scale \(\beta\), and

\(E(Y) = \frac {\beta }{\alpha -1}\), for \(\alpha > 1\)

\(Var(Y) = \frac {\beta ^{2}}{(\alpha -1)^{2}(\alpha -2)}\), for \(\alpha > 2\).
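The mean formula can be sanity-checked numerically. The following is a minimal Monte Carlo sketch using only the Python standard library; the function name `inv_gamma_sample_mean` and the values \(\alpha = 5\), \(\beta = 2\) are illustrative assumptions, not taken from the source.

```python
import random

def inv_gamma_sample_mean(alpha, beta, n=200_000, seed=0):
    """Monte Carlo estimate of E(1/X) for X ~ Gamma(shape=alpha, rate=beta)."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale), so a rate of beta
    # corresponds to a scale of 1/beta.
    draws = (1.0 / rng.gammavariate(alpha, 1.0 / beta) for _ in range(n))
    return sum(draws) / n

alpha, beta = 5.0, 2.0             # illustrative values (assumption)
estimate = inv_gamma_sample_mean(alpha, beta)
closed_form = beta / (alpha - 1)   # E(Y) = beta / (alpha - 1)
print(estimate, closed_form)
```

The simulated mean should land close to the closed-form value, up to Monte Carlo noise.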

Plugging into (2.1), and noting that \(\frac{1}{\nu} \mid p, q, \gamma, \bar{z}, x \sim \text{Inv-Gamma}(px+q, \gamma+x \bar{z})\) with \(E(\text{Inv-Gamma}) = \frac{\text{scale}}{\text{shape} - 1}\) (which requires \(px+q > 1\)),

\[\begin{align} E(Z \mid p, q, \gamma ; \bar{z}, x) &=\int_{0}^{\infty }E(Z \mid p, \nu)\,g(\nu \mid p, q, \gamma, \bar{z},x )\,d\nu \\ &= \int_{0}^{\infty } \frac{p}{\nu}\, g(\nu \mid p, q, \gamma, \bar{z},x )\,d\nu \\ &= p \cdot E\left(\frac{1}{\nu} \mid p, q, \gamma, \bar{z},x \right) \\ &=\frac{p(\gamma+x \bar{z})}{p x+q-1} \tag{2.6} \end{align}\]
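As a small illustration, (2.6) translates directly into code. This is a sketch only: the function name `expected_spend` and the parameter values are hypothetical, and in practice \(p\), \(q\) and \(\gamma\) would have to be estimated from data first.

```python
def expected_spend(p, q, gamma, z_bar, x):
    """Conditional expected spend per transaction, equation (2.6)."""
    shape = p * x + q
    if shape <= 1:
        # The inverse gamma mean only exists when its shape exceeds 1.
        raise ValueError("need p * x + q > 1")
    return p * (gamma + x * z_bar) / (shape - 1)

# Hypothetical parameters: p=6, q=4, gamma=15 (population mean 6*15/3 = 30).
# A customer with x=2 purchases averaging z_bar=50:
result = expected_spend(p=6.0, q=4.0, gamma=15.0, z_bar=50.0, x=2)
print(result)  # 46.0
```

With these values the estimate sits between the customer’s observed average (50) and the population mean (30).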

3 Understand the result

How should we interpret \(E(Z \mid p, q, \gamma ; \bar{z}, x)\)? What does it say about the estimate of an individual’s future spend per transaction given their historical average spend \(\bar{z}\)?

First, let’s derive \(E(Z \mid p, q, \gamma)\), which is the population mean.

\[ \begin{aligned} E(Z \mid p, q, \gamma) &= \int_{0}^{\infty }E(Z\mid p, \nu)g(\nu \mid q, \gamma)d\nu \\ &= \int_{0}^{\infty } \frac{p}{\nu}g(\nu \mid q, \gamma)d\nu\\ &= pE(\frac{1}{\nu}), \text{ where } \frac{1}{\nu} \sim \text{Inv-Gamma}(q, \gamma)\\ &= \frac{p\gamma}{q-1} \end{aligned} \]
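The population mean can also be checked by simulating the two-stage model: draw a rate \(\nu\) for each customer, then draw one transaction value. The sketch below uses only the standard library; the function name and the values \(p = 4\), \(q = 6\), \(\gamma = 10\) are illustrative assumptions (with \(q > 2\) so that the variance is finite).

```python
import random

def simulate_population_mean(p, q, gamma, n=200_000, seed=1):
    """Average one simulated transaction value over n simulated customers."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        # random.gammavariate takes (shape, scale); rate r means scale 1/r.
        nu = rng.gammavariate(q, 1.0 / gamma)    # customer-level rate
        total += rng.gammavariate(p, 1.0 / nu)   # one transaction value
    return total / n

p, q, gamma = 4.0, 6.0, 10.0            # illustrative values (assumption)
simulated = simulate_population_mean(p, q, gamma)
closed_form = p * gamma / (q - 1)       # p * gamma / (q - 1)
print(simulated, closed_form)
```

The simulated average should be close to \(\frac{p\gamma}{q-1}\), up to sampling noise.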

Now let’s rearrange the result of (2.6),

\[\begin{align} E(Z \mid p, q, \gamma ; \bar{z}, x) &= \frac{p(\gamma+x \bar{z})}{px+q-1} \\ &=\left(\frac{q-1}{p x+q-1}\right) \underbrace{\frac{p \gamma}{q-1}}_{\text{population mean}} + \left(\frac{p x}{p x+q-1}\right) \underbrace{\bar{z} \vphantom{\frac{p \gamma}{q-1}}}_{\text{observed average}} \end{align}\]

3.1 Key point

This is a weighted average of the population mean, \(E(Z \mid p, q, \gamma)\), and the individual’s observed average transaction value, \(\bar{z}\):

\[ E(Z \mid \bar{z}, x) = w \cdot \{\text{population mean}\} + (1-w) \cdot \{\text{observed average}\}, \quad w = \frac{q-1}{p x+q-1} \]

As the number of observations \((x)\) used to compute \(\bar{z}\) increases, less weight is placed on the population mean and more weight is placed on the customer’s observed average.
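To make the shrinkage concrete, the sketch below tabulates the weight on the population mean, \(\frac{q-1}{px+q-1}\) from the rearranged form of (2.6), for a growing number of transactions. The function name `population_weight` and the values \(p = 6\), \(q = 4\) are hypothetical.

```python
def population_weight(p, q, x):
    """Weight placed on the population mean in the rearranged (2.6)."""
    return (q - 1) / (p * x + q - 1)

p, q = 6.0, 4.0  # hypothetical parameter values
weights = {x: population_weight(p, q, x) for x in (1, 2, 5, 20, 100)}
for x, w in weights.items():
    # 1 - w is the weight on the customer's own observed average.
    print(f"x={x:>3}  w={w:.3f}  1-w={1 - w:.3f}")
```

The weight on the population mean falls steadily as \(x\) grows, so a customer with many purchases is judged almost entirely on their own history.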

4 References

[1] Fader PS, Hardie BG (2013). “The Gamma-Gamma Model of Monetary Value.” URL http://www.brucehardie.com/notes/025/gamma_gamma.pdf.

[2] Colombo R, Jiang W (1999). “A stochastic RFM model.” Journal of Interactive Marketing, 13(3), 2-12.

Chen Xing
Founder & Data Scientist
