Information Theoretic Properties of the Poisson Binomial Distribution

EE376A (Winter 2019)

by Evan Rosenman


For my final project, I investigated information theoretic properties of the Poisson binomial distribution, the distribution of the sum of independent but not identically distributed Bernoulli random variables.

My project involved three major tasks:

  • Literature review: I reviewed some of the major results regarding the entropy of the Poisson binomial distribution, which cannot be expressed in closed form. Shepp and Olkin conjectured in 1981 that the entropy of a Poisson binomial is a concave function of the success probabilities, but the result was not proven until 2015. Harremoës extended these results to show that the binomial and Poisson distributions are both maximum entropy distributions among suitable sets of Poisson binomial distributions. I review these papers in detail.
  • Functional form exploration: I next shifted to an ecological inference setting. The goal is binary classification, but we only observe binary outcomes summed over sets of individuals, rather than for each individual. The outcomes, then, are distributed as Poisson binomial random variables.
    In the simpler case in which binary outcomes are observed for each individual, logistic regression is a popular model-fitting technique. Logistic regression solutions have the appealing property that the fitted values are both the maximum entropy solutions (subject to covariate balance constraints) and the maximum likelihood solutions (assuming a logit link and independent Bernoulli outcomes). I explore whether this property generalizes to the ecological inference case, finding that it does not. However, I do find some interesting connections between the maximum entropy solutions and the Lambert W function.
  • Maximum entropy modeling: Lastly, I compute the maximum entropy and maximum likelihood solutions for a small simulated data set (n = 32). I find that the maximum entropy solutions are much more stable when there is very limited outcome data, which appears to be a significant advantage over maximum likelihood approaches.
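Harremoës' maximum entropy result from the literature review above can be checked numerically. The sketch below (my own illustration, not code from the project) computes the PMF of a Poisson binomial by convolving the individual Bernoulli distributions, then compares its entropy with that of a binomial having the same number of trials and the same mean:

```python
import numpy as np

def poisson_binomial_pmf(ps):
    """PMF of a sum of independent Bernoulli(p_i) variables, via convolution."""
    pmf = np.array([1.0])
    for p in ps:
        pmf = np.convolve(pmf, [1.0 - p, p])
    return pmf

def entropy(pmf):
    """Shannon entropy in nats, skipping zero-probability outcomes."""
    pmf = pmf[pmf > 0]
    return -np.sum(pmf * np.log(pmf))

# Arbitrary success probabilities; any choice should satisfy the inequality.
rng = np.random.default_rng(0)
ps = rng.uniform(size=5)
p_bar = ps.mean()

h_pb = entropy(poisson_binomial_pmf(ps))             # Poisson binomial
h_binom = entropy(poisson_binomial_pmf([p_bar] * 5))  # Binomial(5, p_bar)

# The binomial with matched mean has entropy at least as high.
assert h_binom >= h_pb
```

The convolution approach is exact for small n; for large n one would typically use a discrete Fourier transform instead.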

A full write-up of my project can be found in my report: Poisson Binomial GLM Entropy Properties


For my outreach activity, I focused on explaining one of Harremoës’ results: a binomial distribution has entropy at least as high as any Poisson binomial distribution with the same number of trials and the same mean.

I used two different counting games. In the first game, students were asked to toss two coins and to add up the number of heads. They repeated this task three times. In the second game, students tossed two dice and added up the results. One die had a 5/6 probability of coming up with a “1” and a 1/6 probability of “0”, while the other die had a 1/6 probability of “1” and a 5/6 probability of “0”. Again, the task was repeated three times.

The sum in the coin-tossing task is a Binomial(2, 1/2) random variable, while the sum in the dice-tossing task is a PoissonBinomial((1/6, 5/6)) random variable. They have the same mean and the same number of trials, so Harremoës’ result tells us the coin-tossing task has higher entropy. This can also be verified explicitly: both distributions are symmetric about 1, but the coin-tossing sum equals “1” with probability 1/2, while the dice-tossing sum equals “1” with probability 26/36, or roughly 72%, making the dice-tossing outcome more predictable.
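The entropy gap between the two games can be computed directly from their three-point PMFs. A short sketch (probabilities taken from the setup above; the entropy is reported in bits):

```python
import numpy as np

def entropy_bits(pmf):
    """Shannon entropy in bits of a discrete PMF."""
    pmf = np.asarray([p for p in pmf if p > 0])
    return -np.sum(pmf * np.log2(pmf))

# Coin game: Binomial(2, 1/2) over outcomes {0, 1, 2}.
coin = [1/4, 1/2, 1/4]

# Dice game: PoissonBinomial((1/6, 5/6)).
# P(0) = (5/6)(1/6), P(1) = (1/6)^2 + (5/6)^2, P(2) = (1/6)(5/6).
dice = [5/36, 26/36, 5/36]

h_coin = entropy_bits(coin)  # 1.5 bits exactly
h_dice = entropy_bits(dice)  # about 1.13 bits
assert h_coin > h_dice
```

The coin game attains the full 1.5 bits of a Binomial(2, 1/2), while the concentration of the dice game's mass on “1” costs it roughly a third of a bit, consistent with the maximum entropy result.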

I had the students fill out their results on a form (pictured below), and then asked some leading questions to help them understand the goal of the exercise.
