Overview

This post analyzes the landmark result by Sergey Bobkov and Mokshay Madiman (Bobkov & Madiman, 2011) regarding the concentration of information in log-concave distributions. The central finding is that for any log-concave random vector $X$ in $\mathbb{R}^n$, the information content $\tilde h(X) = -\log f(X)$ concentrates around its mean (the entropy $h(X)$) with sub-exponential tails. This property serves as a geometric substitute for independence, enabling the extension of the Shannon-McMillan-Breiman theorem to non-i.i.d. stochastic processes.

🏷️ Foundational Concepts

To understand the concentration results, we define the “surprise” of a distribution and the geometric constraints that stabilize it.

The Information Functional

For a random vector $X$ in $\mathbb{R}^n$ with density $f$, the Information Content (also called "random entropy") is:

$$\tilde h(X) = -\log f(X).$$

The density at the realized outcome determines the "surprise" of a realization: rare outcomes (small $f(X)$) carry high information. Its average value is the Shannon Entropy, $h(X) = \mathbb{E}\,\tilde h(X)$.
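As a concrete sanity check (my example, not the paper's), the information content of a standard Gaussian vector has a closed form, $\tilde h(x) = \tfrac{n}{2}\log(2\pi) + |x|^2/2$, whose mean is the Shannon entropy $h = \tfrac{n}{2}\log(2\pi e)$. A short Monte Carlo run confirms the two agree:

```python
import math
import random

# For the standard Gaussian on R^n, h~(x) = -log f(x) = (n/2) log(2*pi) + |x|^2/2,
# and E[h~(X)] is the Shannon entropy h = (n/2) log(2*pi*e).
def gaussian_information_content(x):
    n = len(x)
    return 0.5 * n * math.log(2 * math.pi) + 0.5 * sum(t * t for t in x)

random.seed(0)
n, trials = 50, 20000
samples = [gaussian_information_content([random.gauss(0, 1) for _ in range(n)])
           for _ in range(trials)]
empirical_mean = sum(samples) / trials
shannon_entropy = 0.5 * n * math.log(2 * math.pi * math.e)  # closed form
```

The empirical mean lands within a fraction of a nat of the entropy, even though individual realizations of $\tilde h(X)$ fluctuate.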

Log-Concave Measures

A distribution is log-concave if its density can be written as $f(x) = e^{-V(x)}$, where $V$ is a convex function. This geometric regularity prevents the mass from spreading too thin or splitting into multiple peaks, forcing the distribution into a "well-behaved" shape.
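A minimal numerical sketch of this definition (my construction): check midpoint convexity of $V = -\log f$ on a grid. The Gaussian and Laplace laws pass; the heavy-tailed Cauchy law fails:

```python
import math

def is_log_concave_on_grid(pdf, lo=-10.0, hi=10.0, steps=400):
    """Midpoint test for convexity of V(x) = -log pdf(x) on a grid.

    A numerical sketch, not a proof: it can only refute log-concavity
    up to the grid resolution."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    V = [-math.log(pdf(x)) for x in xs]
    # V convex  <=>  each interior value sits below the chord of its neighbors.
    return all(V[i] <= 0.5 * (V[i - 1] + V[i + 1]) + 1e-9
               for i in range(1, steps))

def gauss_pdf(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def laplace_pdf(x):
    return 0.5 * math.exp(-abs(x))

def cauchy_pdf(x):                      # heavy-tailed: NOT log-concave
    return 1.0 / (math.pi * (1 + x * x))
```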

Prefix Notation

For a stochastic process $\{X_k\}_{k \ge 1}$, we define the prefix (or $n$-dimensional projection) as:

$$X^n = (X_1, X_2, \ldots, X_n).$$

This represents the first $n$ observations of the process, with $f_n$ denoting their joint density.

🏷️ Main Results: Concentration of Information

The primary contribution of the paper is proving that the random variable $-\log f(X)$ stays extremely close to its mean $h(X)$ in high dimensions.

Theorem 1.1: Exponential Tail Concentration

If $X$ has a log-concave density $f$ on $\mathbb{R}^n$, then for all $t \ge 0$:

$$\mathbb{P}\Big( \big| -\log f(X) - h(X) \big| \ge t\sqrt{n} \Big) \le 2\,e^{-ct},$$

where $c > 0$ is a universal constant. The fluctuations of information grow only as $\sqrt{n}$, so the per-coordinate information obeys a Law of Large Numbers.
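The $\sqrt{n}$ scale is already visible in the simplest case (my example, not the paper's): for $X \sim N(0, I_n)$, the deviation $-\log f(X) - h(X)$ equals $(|X|^2 - n)/2$ exactly, so its standard deviation is $\sqrt{n/2}$ and the ratio $\mathrm{std}/\sqrt{n}$ is dimension-free:

```python
import math
import random

# For X ~ N(0, I_n), -log f(X) - h(X) = (|X|^2 - n)/2 exactly, so its
# standard deviation is sqrt(n/2): fluctuations grow like sqrt(n), and
# std/sqrt(n) stays near 1/sqrt(2) ~ 0.707 for every n.
random.seed(1)

def info_fluctuation_std(n, trials=4000):
    devs = []
    for _ in range(trials):
        sq = sum(random.gauss(0, 1) ** 2 for _ in range(n))
        devs.append(0.5 * (sq - n))
    mean = sum(devs) / trials
    return math.sqrt(sum((d - mean) ** 2 for d in devs) / trials)

ratios = {n: info_fluctuation_std(n) / math.sqrt(n) for n in (25, 100, 400)}
```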

Corollary: Universal Variance Bound

For any log-concave random vector $X$ in $\mathbb{R}^n$, the variance of the information content satisfies:

$$\operatorname{Var}\big({-\log f(X)}\big) \le c\,n$$

for some universal constant $c > 0$. This confirms that the "random entropy" per coordinate, $-\log f(X)/n$, stabilizes around $h(X)/n$ at the rate $O(1/\sqrt{n})$.
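Two product examples (my choices, not the paper's) show that linear-in-$n$ variance is attained, so the bound is sharp up to the constant; both have closed forms, so no simulation is needed:

```python
# Closed-form checks of Var(-log f(X)) <= c*n for two product measures:
#  * X ~ N(0, I_n):  -log f(X) - h(X) = (|X|^2 - n)/2, so Var = n/2.
#  * X with n i.i.d. Exp(1) coordinates: f(x) = exp(-sum x_i) on the positive
#    orthant, so -log f(X) = X_1 + ... + X_n and Var = n.
def info_variance_gaussian(n):
    return n / 2.0

def info_variance_exponential(n):
    return float(n)

per_coordinate = [(info_variance_gaussian(n) / n, info_variance_exponential(n) / n)
                  for n in (1, 10, 100)]
```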

Theorem 1.2: Gaussian Regime

For smaller deviations ($0 \le t \lesssim \sqrt{n}$), the decay is even sharper (Gaussian):

$$\mathbb{P}\Big( \big| -\log f(X) - h(X) \big| \ge t\sqrt{n} \Big) \le 2\,e^{-ct^2}.$$

🏷️ Main Proof Strategy

The proof is an elegant multi-stage reduction from high dimensions down to 1D geometry, leveraging the multiscale preservation of log-concavity.

The 1D Baseline: Geometric Profile Stability

In 1D, the information is stable because of the concavity of the profile function $I(p) = f(F^{-1}(p))$, where $F$ is the CDF and $F^{-1}$ its inverse. For log-concave $f$, $I$ is concave on $(0,1)$. Using the identity, valid for any function $u$:

$$\mathbb{E}\,u(f(X)) = \int_0^1 u(I(p))\,dp.$$

Choosing $u(t) = t^{-\alpha}$ and normalizing $\max f = 1$, concavity forces $I(p) \ge \min(p, 1-p)$. This allows bounding the 1D MGF:

$$\mathbb{E}\,e^{\alpha \tilde h(X)} = \int_0^1 I(p)^{-\alpha}\,dp \le \int_0^1 \min(p, 1-p)^{-\alpha}\,dp = \frac{2^{\alpha}}{1-\alpha}.$$

For $\alpha = 1/2$, the bound is $2\sqrt{2} < 4$, ensuring $\operatorname{Var}(\tilde h(X)) \le C$ for all 1D log-concave distributions.
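The two ingredients above can be checked numerically (my sketch, under the normalization $\max f = 1$): the logistic law, whose profile is exactly $I(p) = p(1-p)$, illustrates concavity of the profile, and midpoint quadrature recovers the extremal value $2^{\alpha}/(1-\alpha)$:

```python
import math

# The (log-concave) logistic law has CDF F(x) = 1/(1+e^{-x}) and density
# f = F(1-F), so its profile is I(p) = f(F^{-1}(p)) = p(1-p): concave on (0,1).
def logistic_profile(p):
    return p * (1 - p)

# Extremal bound: with max f normalized to 1, concavity of I forces
# I(p) >= min(p, 1-p), so
#   E exp(alpha*h~) = int_0^1 I(p)^(-alpha) dp <= int_0^1 min(p,1-p)^(-alpha) dp,
# and the right-hand integral equals 2^alpha / (1 - alpha).
def extremal_mgf(alpha, steps=200000):
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) / steps            # midpoint rule sidesteps the endpoints
        total += min(p, 1 - p) ** (-alpha)
    return total / steps

alpha = 0.5
numeric = extremal_mgf(alpha)
closed_form = 2 ** alpha / (1 - alpha)   # = 2*sqrt(2) ~ 2.83 for alpha = 1/2
```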

Reverse Lyapunov Inequalities: The Moment Engine

Standard Lyapunov inequalities state that $\alpha \mapsto \log \mathbb{E}|Z|^{\alpha}$ is convex for any random variable $Z$. Bobkov and Madiman use a deep result by Borell (Borell, 1973) to show that for log-concave $Z \ge 0$, the normalized moment function $\alpha \mapsto \mathbb{E} Z^{\alpha}/\Gamma(\alpha+1)$ is log-concave. This is derived by considering volumes of convex bodies in $\mathbb{R}^n$ and taking the limit $n \to \infty$. This "reversed" stability ensures that moments of log-concave variables are tightly constrained, which is the key to bounding the fluctuations of $-\log f(X)$.
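Both directions can be verified on a toy case (my choice of $Z$, not the paper's): for $Z \sim \mathrm{Uniform}[0,1]$, $\mathbb{E}Z^{\alpha} = 1/(\alpha+1)$, so the Borell-normalized moment is $1/\Gamma(\alpha+2)$:

```python
import math

# Borell: for log-concave Z >= 0, alpha -> E[Z^alpha]/Gamma(alpha+1) is
# log-concave.  For Z ~ Uniform[0,1], E[Z^alpha] = 1/(alpha+1), so the
# normalized moment is 1/Gamma(alpha+2) and its log is -lgamma(alpha+2).
def log_normalized_moment_uniform(alpha):
    return -math.lgamma(alpha + 2)

def midpoint_concave(g, a, b):
    """Midpoint concavity check for a scalar function g."""
    return g(0.5 * (a + b)) >= 0.5 * (g(a) + g(b)) - 1e-12

borell_ok = all(midpoint_concave(log_normalized_moment_uniform, a, a + 1.0)
                for a in (0.0, 0.5, 1.0, 3.0, 10.0))

# Contrast with the classical direction: alpha -> log E[Z^alpha] itself is
# CONVEX (Lyapunov), so the same concavity test fails without the Gamma factor.
lyapunov_convex = not midpoint_concave(lambda a: -math.log(a + 1.0), 0.0, 2.0)
```

The $\Gamma(\alpha+1)$ normalization is exactly what flips the curvature: without it the moment curve is log-convex, with it the curve becomes log-concave.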

Log-Concavity of Order and the Trigamma Bound

A density of the form $f(t) = t^{s-1}\varphi(t)$ on $(0,\infty)$, with $\varphi$ log-concave, has log-concavity of order $s$. The authors prove (Prop 4.1) that the concavity of $\log \varphi$ implies, for $Z$ with such a density:

$$\operatorname{Var}(\log Z) \le \psi'(s),$$

where $\psi'$ is the trigamma function. Since $\psi'(s) \approx 1/s$ for large $s$, higher-order log-concavity forces the random information to become extremely stable.
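The decay of the trigamma bound can be checked directly from its series $\psi'(s) = \sum_{k \ge 0} (s+k)^{-2}$ (a standard identity, computed here with a truncated sum plus an integral tail estimate):

```python
import math

def trigamma(s, terms=200000):
    """psi'(s) = sum_{k>=0} 1/(s+k)^2, truncated with an integral tail bound."""
    partial = sum(1.0 / (s + k) ** 2 for k in range(terms))
    return partial + 1.0 / (s + terms)      # tail ~ int_terms^inf (s+t)^-2 dt

# psi'(s) ~ 1/s, so s * psi'(s) -> 1: the variance bound shrinks like 1/s
# as the order s of log-concavity grows.
scaled = {s: s * trigamma(s) for s in (10, 100, 1000)}
```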

The Localization Engine: Reduction to Weighted Needles

Using the KLS Localization Lemma (Kannan, Lovász & Simonovits, 1995), any integral inequality in $\mathbb{R}^n$ is reduced to 1D "weighted needles": segments carrying a density proportional to $f(t)\,\ell(t)^{n-1}$, where $\ell$ is affine and nonnegative. The information along a needle, and hence the global fluctuation, is decomposed as:

$$-\log\big(f(t)\,\ell(t)^{n-1}\big) = -\log f(t) \;-\; (n-1)\log \ell(t).$$

  1. The first term is the 1D log-concave baseline (Step 1), contributing $O(1)$ to the variance.
  2. The second term involves the affine function $\ell$ raised to the power $n-1$, which is log-concave of order $n$ (Step 3). Its fluctuations satisfy $\operatorname{Var}\big((n-1)\log \ell\big) \le (n-1)^2\,\psi'(n) = O(n)$. Combining these via the convexity of the MGF lifts the needle variance to the global tail concentration.
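A toy needle (my construction, not the paper's computation) makes the weight term concrete: take $\ell(t) = t$ on $[0,1]$, i.e. density $n\,t^{n-1}$. Then $T = U^{1/n}$ for $U \sim \mathrm{Uniform}[0,1]$, and the weight term's variance is exact:

```python
# Toy needle: weight l(t) = t on [0,1], i.e. density n * t^{n-1}.  Then
# T = U^{1/n} for U ~ Uniform[0,1], so log T = log(U)/n, and Var(log U) = 1
# gives an exact closed form for the weight-term variance:
def needle_weight_variance(n):
    """Var((n-1) * log T) for the density n * t^{n-1} on [0,1]."""
    return (n - 1) ** 2 / n ** 2

# Leading asymptotics of the trigamma bound, psi'(n) ~ 1/n + 1/(2n^2); the
# exact variance above stays below (n-1)^2 * psi'(n), as the bound predicts.
def trigamma_leading(n):
    return 1.0 / n + 1.0 / (2.0 * n * n)

bound_ok = all(needle_weight_variance(n) <= (n - 1) ** 2 * trigamma_leading(n)
               for n in (2, 10, 100, 1000))
```

For this particular needle the weight fluctuation is even $O(1)$, comfortably inside the $O(n)$ budget the trigamma bound allows.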

🏷️ Application: The Strong Ergodic Theorem (SMB)

The most profound impact of this concentration is the extension of the Shannon-McMillan-Breiman (SMB) theorem.

Corollary 1.3: Extension of SMB

Let $\{X_k\}_{k \ge 1}$ be a discrete-time stochastic process whose prefixes $X^n = (X_1, \ldots, X_n)$ have log-concave joint densities $f_n$. If the entropy rate $h = \lim_{n \to \infty} h(X^n)/n$ exists, then:

$$-\frac{1}{n}\log f_n(X_1, \ldots, X_n) \;\longrightarrow\; h,$$

with exponential concentration at every $n$; in particular the convergence holds in probability, and Borel–Cantelli upgrades it to almost-sure convergence.
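A non-i.i.d. illustration (my construction, assuming a stationary Gaussian AR(1) process, whose joint densities are log-concave): the per-coordinate information of one long path should converge to the entropy rate $h = \tfrac{1}{2}\log(2\pi e)$ of its $N(0,1)$ innovations:

```python
import math
import random

# Stationary Gaussian AR(1): X_k = rho*X_{k-1} + eps_k with eps_k ~ N(0,1).
# Its joint densities are Gaussian (hence log-concave) and the entropy rate
# is h = (1/2) log(2*pi*e); the path is dependent, so SMB for i.i.d. data
# does not apply, but the log-concave extension does.
random.seed(2)
rho, n = 0.8, 20000
s0 = 1.0 / math.sqrt(1 - rho * rho)          # stationary standard deviation

x = [random.gauss(0, s0)]
for _ in range(n - 1):
    x.append(rho * x[-1] + random.gauss(0, 1))

# -log f_n(x) via the chain rule: initial marginal plus conditional terms.
info = 0.5 * math.log(2 * math.pi * s0 * s0) + x[0] ** 2 / (2 * s0 * s0)
for k in range(1, n):
    e = x[k] - rho * x[k - 1]                # the innovation at step k
    info += 0.5 * math.log(2 * math.pi) + e * e / 2

entropy_rate = 0.5 * math.log(2 * math.pi * math.e)
per_coordinate_info = info / n
```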

Why this extension is transformative

  • Geometry as a Substitute for Statistics: It replaces the traditional requirements of stationarity and mixing with a single geometric hypothesis, log-concavity of the joint distributions.
  • Non-Asymptotic Utility: The concentration provides a universal rate of convergence that mixing-based theorems often lack.
  • High-Dimensional Stability: It proves that any system governed by a convex potential forms a “thin shell” of information in high dimensions, making its long-term average surprise perfectly predictable.
  • Thin Shell Hypothesis: Information concentration is the functional dual to mass concentration. In isotropic convex bodies, mass concentrates near a sphere of radius $\sqrt{n}$; here, surprise concentrates near the entropy $h(X)$.
  • Isotropic Position: When $X$ is isotropic, the theorem implies the density is roughly constant (close to $e^{-h(X)}$) on the $\sqrt{n}$-sphere where the mass concentrates.

📚 References

🐻  Bobkov, S. & Madiman, M. 2011. Concentration of the information in data with log-concave distributions. The Annals of Probability 39(4).
🐻  Borell, C. 1973. Complements of Lyapunov's inequality. Mathematische Annalen 205(4), 323–331.
🐻  Kannan, R., Lovász, L. & Simonovits, M. 1995. Isoperimetric problems for convex bodies and a localization lemma. Discrete & Computational Geometry 13(3), 541–559.