Cumulants and the Cornish-Fisher Expansion

“Consider the following.” (Bill Nye the Science Guy)

There are N random variables drawn from the same kind of probability distribution, but with different parameters for each. In this example, I’ll consider N random variables x_{j} \sim \mathcal{B}(p_{j}), that is, each drawn from a Bernoulli distribution, but having different probabilities p_{j}.

Of interest is a particular sum:

S = \sum_{j=0}^{N-1} w_{j} x_{j}.

where the w_{j} are fixed weights, not all of which are zero. Given the N p_{j} values and the N weights w_{j}, the problem is to calculate the value of S, say S_{q}, at a given quantile q of the cumulative distribution of S. Indeed, suppose many such values S_{q} are sought, for various q. How do you do it?

One technique most data scientists would proffer is to simulate many such Bernoulli draws and calculate quantiles empirically from the resulting simulations. This is possible, but how much error is in such simulations for arbitrary choices of N, the p_{j}, the w_{j}, and q? Also, how does the error behave as the number of simulated draws increases? Does it decrease monotonically?

There is another approach: Using the Cornish-Fisher expansion, a technique dating from 1937. But before meeting the expansion, cumulants need to be introduced.

Cumulants

Technically, cumulants are the coefficients obtained from a series expansion of the log of the characteristic function of a probability density function. That’s dense. Let’s break it down.

The characteristic function of a random variable X, where X \sim P(x) and P(x) is the probability density function for X, is the expected value of a particular non-linear function of X, namely \exp{(\mathbf{i} t X)}:

\varphi_{X}(t) = E\llbracket \exp{(\mathbf{i} t X)} \rrbracket.

where \mathbf{i} = \sqrt{-1}, the imaginary unit. Using the definition of expected value, and assuming the domain for X is continuous,

\varphi_{X}(t) = E\llbracket \exp{(\mathbf{i} t X)} \rrbracket = \int_{-\infty}^{\infty} \exp{(\mathbf{i} t x)} P(x)\,dx.

If X were discrete, then

\varphi_{X}(t) = E\llbracket \exp{(\mathbf{i} t X)} \rrbracket = \sum_{x \in \mathcal{D}(X)} \exp{(\mathbf{i} t x)} P(x).

where \mathcal{D}(X) denotes the (countable) domain of the random variable X, and the P(x) sum to one over that domain.
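
As a concrete check, a Bernoulli variable x \sim \mathcal{B}(p) has the closed form \varphi_{X}(t) = 1 - p + p \exp{(\mathbf{i} t)}, and the discrete definition above can be evaluated directly. A minimal sketch in R, where the values of p and t are arbitrary choices for illustration:

# Characteristic function of a Bernoulli(p) variable, computed two ways
p<- 0.3                          # arbitrary illustrative probability
t<- seq(-2, 2, by=0.5)           # a few evaluation points

# Direct evaluation of the discrete expectation over the domain {0, 1}
phi.direct<- sapply(t, function(tt) sum(exp(1i*tt*c(0,1)) * c(1-p, p)))

# Closed form: phi(t) = 1 - p + p*exp(i t)
phi.closed<- 1 - p + p*exp(1i*t)

max(Mod(phi.direct - phi.closed))   # agreement to machine precision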

Inspection shows the expressions above are a general Fourier transform of the probability density function P(x), one having constants a=1, b=1 in the usual two-parameter convention for such transforms.

Taking the log, then

\log{\varphi_{X}(t) } = \sum_{n=1}^{\infty} \kappa_{n} \frac{(\mathbf{i} t)^{n}}{n!}.

where (finally!) the \kappa_{1}, \kappa_{2}, \dots are the cumulants. A cumulant generating function can also be defined,

\mathbf{K}(h) = \log{\mathbf{M}(h)} = \sum_{n=1}^{\infty} \kappa_{n} \frac{h^{n}}{n!}.

roughly analogous to the moment generating function, or

\mathbf{M}_{X}(t) = E\llbracket \exp{(t X)} \rrbracket = \sum_{n=0}^{\infty} \frac{t^{n} E\llbracket X^{n} \rrbracket}{n!} = 1 + \sum_{n=1}^{\infty} \frac{t^{n} \mu_{n}}{n!}.

where \mu_{n} denotes the nth moment of X. In fact, moments and cumulants are interrelated and can be obtained from one another. This can be useful. Another fact, particularly useful for the problem at hand, is that if \kappa_{r}(Y) denotes the rth cumulant of some random variable Y, and that random variable is itself the sum of m mutually independent random variables, per

Y = \sum_{k=1}^{m} X_{k}.

then

\kappa_{r}(Y) = \kappa_{r}(\sum_{k=1}^{m} X_{k}) = \sum_{k=1}^{m} \kappa_{r}(X_{k}).

assuming cumulants exist for all X_{k}. What this means is that, to the degree to which the probability distribution of a sum of independent random variables can be characterized by or even recovered from knowledge of its cumulants, the cumulants of such a sum can be found readily from the cumulants of each component. The same can be calculated using moment generating functions, or the probability densities themselves, but that entails products of characteristic functions or convolutions of densities. These interchanges are really just consequences of the product-convolution identity for Fourier transforms.
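
To make the additivity concrete: each term w_{j} x_{j} of the weighted Bernoulli sum S has cumulant generating function \log{(1 - p_{j} + p_{j} \exp{(w_{j} h)})}, so \mathbf{K}_{S}(h) is the sum of these, and its derivatives at h = 0 are the cumulants of S. A minimal numerical sketch in R using central finite differences; the weights, probabilities, and step size are arbitrary illustrative choices:

# Cumulant generating function of S, using independence of the x_j
K.S<- function(h, W, P) sum(log(1 - P + P*exp(W*h)))

W<- c(1.0, 0.5, 2.0)     # arbitrary illustrative weights
P<- c(0.2, 0.7, 0.4)     # arbitrary illustrative probabilities
eps<- 1e-4               # finite-difference step size

# First two derivatives of K_S at h = 0, by central differences
kappa1<- (K.S(eps,W,P) - K.S(-eps,W,P))/(2*eps)
kappa2<- (K.S(eps,W,P) - 2*K.S(0,W,P) + K.S(-eps,W,P))/eps^2

c(kappa1, sum(W*P))           # both approximate the mean of S
c(kappa2, sum(W^2*P*(1-P)))   # both approximate the variance of S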

This just sketches the beginning of things that can be done with cumulants. See k-statistics and polykays for others, even empirical uses.

The Cornish-Fisher expansion

Given knowledge of cumulants, the Cornish-Fisher expansion gives an estimate for S_{q} directly from the cumulants of S, assuming these exist. In the case of the weighted Bernoulli sum here, they certainly do.

Cumulants for S

The first five cumulants for each part, w_{j} x_{j}, of the sum S are:

\kappa_{1,j} = p_{j} w_{j}.
\kappa_{2,j} = p_{j} (1 - p_{j}) w_{j}^{2}.
\kappa_{3,j} = p_{j} (p_{j} - 1) (2 p_{j} - 1) w_{j}^{3}.
\kappa_{4,j} = p_{j} (1 - p_{j}) (1 + 6 p_{j} (p_{j} - 1)) w_{j}^4.
\kappa_{5,j} = p_{j} (p_{j} - 1) (2 p_{j} - 1) (1 + 12 p_{j} (p_{j} - 1)) w_{j}^{5}.

These are elemental cumulants and, since the x_{j} are independent, the corresponding cumulants for the N pairs of w_{j} and p_{j}, where, recall, x_{j} \sim \mathcal{B}(p_{j}), follow by summing:

\kappa_{k} = \sum_{j=0}^{N-1} \kappa_{k,j}.

Note that

  • \kappa_{1} is the mean.
  • \kappa_{2} is the variance.
  • \kappa_{3} corresponds to skewness, hereinafter denoted \mathcal{S}; the standardized skewness is \kappa_{3}/\kappa_{2}^{3/2}.
  • \kappa_{4} corresponds to excess kurtosis, hereinafter denoted \mathcal{K}; the standardized excess kurtosis is \kappa_{4}/\kappa_{2}^{2}.
  • \kappa_{5} has no commonly named counterpart, although, like all cumulants, it can be written as a polynomial in the central moments.

Details are available here.
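
These correspondences can be spot-checked by simulation: the standardized third cumulant \kappa_{3}/\kappa_{2}^{3/2} should match the sample skewness of simulated sums, and \kappa_{4}/\kappa_{2}^{2} the sample excess kurtosis. A rough sketch, where the weights, probabilities, and sample size are arbitrary, and the moments package is one choice of sample estimator:

library(moments)   # skewness() and kurtosis() estimators

set.seed(1)
W<- c(1.0, 0.5, 2.0)    # arbitrary illustrative weights
P<- c(0.2, 0.7, 0.4)    # arbitrary illustrative probabilities
M<- 100000              # number of simulated sums

# Each column of X is one draw of the N Bernoulli variables
X<- matrix(rbinom(length(P)*M, size=1, prob=P), nrow=length(P))
S<- colSums(W*X)

k2<- sum(W^2*P*(1-P))
k3<- sum(P*(P-1)*(2*P-1)*W^3)
k4<- sum(P*(1-P)*(1 + 6*P*(P-1))*W^4)

c(k3/k2^1.5, skewness(S))      # standardized kappa_3 vs sample skewness
c(k4/k2^2, kurtosis(S) - 3)    # standardized kappa_4 vs sample excess kurtosis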

A Cornish-Fisher expansion for S

library(mpoly)
theoreticalMean<- function(W,P) sum(W*P)
theoreticalVariance<- function(W,P) sum(W^2*P*(1-P))
theoreticalSkew<- function(W,P) sum(P*(P-1)*(2*P-1)*W^3)
# Note: kappa_4 is the fourth cumulant, so kappa_4/sigma^4 is already the
# *excess* kurtosis, i.e. kurtosis relative to the Gaussian (kurtosis minus 3).
theoreticalKurtosis<- function(W,P) sum(P*(1-P)*(1 + 6*P*(P - 1))*W^4)
theoreticalKappa5<- function(W,P) sum(P*(P-1)*(2*P-1)*(1 + 12*P*(P-1))*W^5)

gamma1<- function(W,P) theoreticalSkew(W,P)/sqrt(theoreticalVariance(W,P)^3)
gamma2<- function(W,P) theoreticalKurtosis(W,P)/sqrt(theoreticalVariance(W,P)^4)
gamma3<- function(W,P) theoreticalKappa5(W,P)/sqrt(theoreticalVariance(W,P)^5)

# Probabilists' Hermite polynomials He_1, ..., He_4. The Cornish-Fisher
# coefficients below assume the conventional (unnormalized) He_n, so
# normalized=FALSE.
He.polynomials<- hermite(degree=1:4, kind="he", normalized=FALSE)
ThePoint<- 0.10
TheQuantile<- qnorm(ThePoint)

yAtThePoint<- function(xPoint=0.1, W, P)
{
  # Standard Gaussian quantile corresponding to the requested probability point
  xQuantile<- qnorm(xPoint)
  # Evaluate He_1, ..., He_4 at that quantile
  He<- unlist(sapply(X=He.polynomials, FUN=function(He.k) as.function(He.k, silent=TRUE)(xQuantile)))
  # Cornish-Fisher adjustment terms, by order of the expansion
  h1<- He[2]/6
  h2<- He[3]/24
  h11<- - (2*He[3]+He[1])/36
  h3<- He[4]/120
  h12<- -(He[4] + He[2])/24
  h111<- (12*He[4] + 19*He[2])/324
  # Standardized cumulants of the weighted sum
  g1<- gamma1(W,P)
  g2<- gamma2(W,P)
  g3<- gamma3(W,P)
  # Mean and standard deviation of the weighted sum
  mu<- theoreticalMean(W,P)
  sigma<- sqrt(theoreticalVariance(W,P))
  # Cornish-Fisher adjusted standard Gaussian quantile, terms grouped by order
  w<- xQuantile + (g1*h1) + (g2*h2 + g1^2*h11) + (g3*h3 + g1*g2*h12 + g1^3*h111)
  # Map back to the location and scale of S
  yp<- mu + sigma*w
  return(yp)
}
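
As a usage sketch (the weights and probabilities are again arbitrary illustrative choices), the 10% quantile of S is then estimated by:

W<- c(1.0, 0.5, 2.0)    # arbitrary illustrative weights
P<- c(0.2, 0.7, 0.4)    # arbitrary illustrative probabilities

# Cornish-Fisher estimate of the 10% quantile of S
yAtThePoint(xPoint=ThePoint, W=W, P=P)

# For comparison, the Gaussian (two-cumulant) approximation alone
theoreticalMean(W,P) + qnorm(ThePoint)*sqrt(theoreticalVariance(W,P))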

Simulating S using many Bernoulli draws

A simulation for S was developed wherein a vector \mathbf{p} of probabilities was drawn from a Beta distribution, a vector \mathbf{w} of non-negative weights was drawn from a Gamma distribution, and then, governed by \mathbf{p}, M = 10000 vectors \mathbf{x}_{k}, k \in \{1, \dots, M\}, were drawn from the Bernoulli distribution. Then S_{k} = \mathbf{w} \cdot \mathbf{x}_{k} was calculated for each and saved. N, the length of these vectors, was chosen to be 10, typical for the application motivating this problem. It also suffices for illustration. An estimated probability density was developed from the M-sized collection of saved sums. Thirty of these are shown below.
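
A minimal sketch of this simulation in R follows; the Beta and Gamma shape parameters are arbitrary stand-ins, since the particular values used are not given here:

set.seed(42)
N<- 10       # length of the probability and weight vectors
M<- 10000    # number of simulated draws of S

# The fixed problem instance: probabilities from a Beta, weights from a Gamma
p<- rbeta(N, shape1=2, shape2=2)    # arbitrary shape parameters
w<- rgamma(N, shape=2, rate=1)      # arbitrary shape parameters

# Column k of X is the k-th Bernoulli draw x_k, governed by p
X<- matrix(rbinom(N*M, size=1, prob=p), nrow=N)
S<- colSums(w*X)

# Estimated probability density and an empirical 10% quantile
plot(density(S), main="Estimated density of S")
quantile(S, probs=0.10)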

Comparing the two approaches

The quantile S_{q=0.1} was chosen as the value of interest and used for comparison. In addition to the numerical simulation, the PDQutils package of R was used to calculate the value of these quantiles independently, as a check on the derivations. Empirical quantiles were obtained from the simulations using the hdquantile function from the Hmisc package of R. The results are given below.
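
A sketch of that comparison, reusing \mathbf{w}, \mathbf{p}, and the simulated sums S from the sketch above; qapx_cf takes the quantile probability and a vector of cumulants beginning with the mean, and hdquantile is the Harrell-Davis quantile estimator:

library(PDQutils)
library(Hmisc)

q<- 0.10
# First through fifth cumulants of S, from the functions defined earlier
kappas<- c(theoreticalMean(w,p), theoreticalVariance(w,p), theoreticalSkew(w,p),
           theoreticalKurtosis(w,p), theoreticalKappa5(w,p))

# Cornish-Fisher quantile via PDQutils, as an independent check
qapx_cf(q, raw.cumulants=kappas)

# Harrell-Davis empirical quantile from the simulated sums
hdquantile(S, probs=q)

# The hand-rolled expansion above, for comparison
yAtThePoint(xPoint=q, W=w, P=p)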

Comment on Cumulants and the Cornish-Fisher expansion in statistical education

Cumulants and the Cornish-Fisher expansion aren’t regularly taught any longer in courses on theoretical statistics, at least judging by material in statistics textbooks. For example,

  • J. S. Bendat, A. G. Piersol, Random Data: Analysis and Measurement Procedures, 4^{th} edition, Wiley, 2010
  • A. W. Drake, Fundamentals of Applied Probability, McGraw-Hill Book Company, 1967
  • J. A. Rice, Mathematical Statistics and Data Analysis, 3^{rd} edition, Duxbury, Thomson/Brooks/Cole, 2007
  • M. H. DeGroot, M. J. Schervish, Probability and Statistics, 3^{rd} edition, Addison-Wesley, 2002
  • K. V. Mardia, J. T. Kent, J. M. Bibby, Multivariate Analysis, Academic Press, 1979
  • R. Durrett, Probability: Theory and Examples, 4^{th} edition, Cambridge University Press, 2010

lack any mention of either cumulants or the expansion, although moment-generating functions are inevitably mentioned. In contrast,

  • R. V. Hogg, J. W. McKean, A. T. Craig, Introduction to Mathematical Statistics, 6^{th} edition, Pearson/Prentice-Hall, 2005
  • A. N. Shiryaev, R. P. Boas (translator), Probability, 2^{nd} edition, Springer, 1996
  • E. Parzen, Modern Probability Theory and Its Applications, John Wiley, 1960
  • D. E. Knuth, The Art of Computer Programming, Volume 1: Fundamental Algorithms, 3^{rd} edition, Addison-Wesley, 1997

do mention cumulants. Parzen (1960) has quite a bit about them, and about their relationships to sums of random variables. The erudite Professor Knuth has three pages mentioning them, which suggests most specialists in computer science ought to know of them, but I'd bet most don't.

Cumulants are often mentioned during courses on random walks and diffusion, because a random walk can be considered a kind of sum.

Postscript

Should readers be interested in the code, I can prepare a documented version for use, in the interest of reproducibility. As mentioned above, the PDQutils package of R offers a means of calculating quantiles directly from cumulants, via its function qapx_cf. Of course, theoretical expressions for quantiles demand symbolic calculation, such as that offered by Wolfram Mathematica.
