Recall that a random variable X with mean μ is sub-Gaussian with constant σ if its moment generating function is bounded above as follows:
E[eλ(X−μ)]≤eλ2σ2/2∀λ∈R.(1)
That proved useful because, if X is sub-Gaussian, then we showed that the tail of its distribution is dominated by a Gaussian random variable’s with parameters μ and σ2. In particular, we are able to apply the Chernoff bound to sub-Gaussian random variables and derive Hoeffding inequality-type bounds on the tail distribution.
Sub-Gaussianness, however, is somewhat restrictive. It requires that the bound on the moment generating function hold for all values of λ! We can relax that requirement just a little bit and demand that the inequality hold for small values of λ around 0 instead:
E[eλ(X−μ)]≤eλ2ν2/2∀∣λ∣≤b1.(2)
A random variable X with mean μ for which inequality (2) holds is said to be sub-exponential with parameters (ν,b) for some non-negative b (and defining 1/0=∞).
Before we continue, it’s worth mentioning that sub-exponentialness is preserved under summation of independent random variables, just as sub-Gaussiannes is. In other words, if we have n sub-exponential random variables Xi’s with parameters (νi,bi), then their sum X=∑iXi is sub-exponential:
which is valid when ∣λ∣≤1/maxibi. So its parameters are ((∑iνi2)1/2,maxibi).
Equivalent Definitions
There are other ways of characterizing a sub-exponential random variable that are equivalent to the definition in (2). For example, if the moment generating function of X is finite for small values of λ, then X is indeed a sub-exponential variable! This, and other definitions, are presented in the result below.
The following definitions are equivalent for a centered random variable X:
i. There exist constants ν and b such that inequality (2) holds.
ii. There exists a constant c0 such that E[eλX] is finite for ∣λ∣≤c0.
iii. There exist constants c1 and c2 such that P[∣X∣≥t]≤c1e−c2t for all t>0.
iv. The quantity supλ≥2(k!E[Xk])1/k is finite.
Proof. (i) ⇔ (ii) is straightforward.
(ii) ⇒ (iii): Applying the Chernoff bound to X gives:
when c2/λ≥2. Because E[e−λX]≤E[e∣λ∣∣X∣] and E[eλX]≤E[e∣λ∣∣X∣], we can conclude that E[eλX] is finite for ∣λ∣≤c2/2.
(ii) ⇔ (iv): We can argue using the power series expansion of the moment generating function:
E[eλX]=1+k=2∑∞k!E[Xk]λk.
Notice that the inverse of limsupλ≥2∣∣∣∣k!E[Xk]∣∣∣∣1/kis the convergence radius of the series. As a result, if (ii) holds, then the series must converge, implying that the quantity in (iv) must be finite. If (iv) holds, then the series must converge, implying that the quantity in (ii) must be finite for within the convergence radius.
Bernstein’s Condition
An alternative way to characterize sub-exponential random variables is by bounding the polynomial moments of X, instead of its moment generating function. Concretely, if X is a random variable with mean μ and variance σ2, then we can show that, if the following condition, known as Bernstein’s condition with parameterb, holds, then X is a sub-exponential random variable:
∣∣∣∣E[(X−μ)k]∣∣∣∣≤21k!σ2bk−2for k=3,4,…
Proof. Let’s start from the definition of a sub-exponential random variable and expand the left-hand-side of (2):
when ∣λ∣<1/2b. That implies that X is sub-exponential with parameters (2σ,2b).
This is a neat result! For example, it is straightforward to see why Bernstein’s condition holds for bounded random variables with the property that ∣X−μ∣≤b. We proved previously that such random variables are sub-Gaussian. It turns out, they are sub-exponential too!
Concentration Inequalities
Why is this notion of sub-exponentialness useful anyway? Well, for one thing, the family of sub-exponential random variables includes sub-Gaussian random variables. But more importantly, limiting our focus to a neighborhood around 0 enables us to derive more expressive bounds on the tail distribution of a sub-exponential X. To see why, let’s first derive the basic tail bound and then discuss a more elaborate bound that contrasts the concentration of sub-Gaussian and sub-exponential variables.
Sub-Exponential Tail Bound
For a sub-exponential random variable X with parameters (ν,b), the following holds:
When t≤ν2/b, we can solve the unconstrained minimization problem and obtain:
∂λ∂f(λ,t)=0⇒ν2λ−t=0⇒λ=ν2t.
Substituting this value of λ into f(λ,t) gives the desired result, as in the case of a sub-Gaussian random variable.
On the other hand, when t>ν2/b, then f(λ,t) reaches its minimum at the boundary point when λ=1/b. That’s because f(⋅,t) is monotonically decreasing in its first argument. Evaluating f(1/b,t) then leads to:
We can do something even more interesting if we consider Bernstein’s condition! If we stare at the proof of the statement above that Bernstein’s condition is sufficient to show that a random variable is sub-exponential, we notice an interesting result hiding in there. In particular, we derived, in that proof, the following bound when ∣λ∣b<1:
E[eλ(X−μ)]≤eλ2σ2/2(1−∣λ∣b).
Setting λ=t/(bt+σ2)∈[0,b1) results in the following concentration inequality:
P[X−μ≥t]≤e−t2/2(σ2+bt).(4)
Why is that significant? Consider a bounded random variable X such that ∣X−μ∣≤b. We argued above that X is sub-exponential as it meets Bernstein’s condition. The tail bound in (4) then introduces a dependence on, not just b, as the Hoeffding inequality would (as a consequence of X being sub-Gaussian), but also on the variance of X!
Because σ2≤b2 for a bounded random variable, and for suitably small values of t, (4) is never worse than the Hoeffding bound. In particular, when σ2≪b2, then (4) offers a much sharper bound on the concentration of X.
Example Sub-Exponential Variables
We have already discussed the case of bounded random variables. Indeed, such variables are sub-Gaussian and sub-exponential. Let’s now look at a random variable that’s sub-exponential but not sub-Gaussian: the χ2 random variable.
Recall that if Y∼N(0,1) is a Gaussian random variable, then X=Y2 is a χ2 random varaible whose mean is 1. Let’s inspect its moment generating function:
E[eλ(X−1)]=2π1∫−∞∞eλ(y2−1)e−y2/2dy=1−2λe−λ.
Obviously, when λ>1/2 the moment generating function does not exist! When ∣λ∣≤1/4, then we can show that the moment generating function is bounded above by e2λ2. As such, X is sub-exponential with parameters (2,4).
As another example, consider the sum of n iid χ2 random variables. Because each variable is itself a sub-exponential with parameters (2,4), we know from before that their sum, too, is sub-exponential with parameters (2n,4). As a result, we can derive the following bound: