Concentration of independent sums

Exercises from HDP

$$ \newcommand{\E}{\mathbb{E}} \newcommand{\R}{\mathbb{R}} \newcommand{\Rn}{\mathbb{R}^n} \DeclareMathOperator{\Var}{Var} \newcommand{\Norm}[1]{\left\|#1\right\|_2} \newcommand{\LONorm}[1]{\left\|#1\right\|_{L^1}} \newcommand{\LINorm}[1]{\left\|#1\right\|_{L^\infty}} \newcommand{\LpNorm}[1]{\left\|#1\right\|_{L^p}} \newcommand{\SGNorm}[1]{\left\|#1\right\|_{\psi_2}} \newcommand{\Prob}[1]{\mathbb{P}\left\{#1\right\}} $$

Let $X,X_1,X_2, \dots, X_N$ be i.i.d. mean-zero, subgaussian random variables. Prove that

\[\begin{equation} \SGNorm{\sum^N_{i=1} X_i} \asymp \sqrt{N}\SGNorm{X}. \label{eq:target} \end{equation}\]

Deduce that the binomial random variable $S_N \sim \text{Binomial}(N, p)$ satisfies

\[\SGNorm{S_N - Np} \asymp \sqrt{\frac{N}{\log(2/p)}}.\]

Since $X, X_1, X_2, \dots, X_N$ are i.i.d., they have identical MGF and hence

\[\begin{aligned} \E [e^{\lambda \sum X_i}] = \prod_{i=1}^N \E [e^{\lambda X_i}] = \left(\E [e^{\lambda X}\right] )^N \leq e^{C \lambda^2 \SGNorm{\sum X_i}^2}. \end{aligned}\]

So

\[\E[e^{\lambda X}] \leq \E\left[e^{C \lambda^2 \SGNorm{\sum X_i}^2/N}\right]= \E\left[e^{C \lambda^2 \SGNorm{\frac{1}{\sqrt{N}}\sum X_i}^2}\right].\]

By definition of $\SGNorm{X}$, we thus have

\[\SGNorm{X} \leq \frac{C}{\sqrt{N}} \SGNorm{\sum_{i=1}^N X_i}.\]

THe other direction follows from the fact that

\[\SGNorm{\sum_{i=1}^N X_i}^2 \leq C \sum_{i=1}^N \SGNorm{X_i}^2 = C N \SGNorm{X}^2\]

and taking square root on both sides.

Note that $S_N= \sum_{i=1}^N Z_i$, where $Z\sim \text{Bernoulli}(p)$. So

\[\SGNorm{Z_i} = \frac{1}{\sqrt{\log(1+1/p)}} \asymp \frac{1}{\sqrt{\log (2/p)}},\]

since for $p\in(0,1)$, $\log(1/p) \leq \log(1+1/p) \leq \log(2/p)$.

Apply $\eqref{eq:target}$ gives the rate for Binomial distributions.

We have

\[\LpNorm{X} \leq \LONorm{X}^\frac{1}{p} \LINorm{X}^{1-\frac{1}{p}}.\]

Prove that if $\LONorm{X} = a$ and $\LINorm{X} = b$, then

\[\SGNorm{X} \leq \frac{Cb}{\sqrt{\log(2b/a)}}.\]

Give an example where the bound is tight up to an aboslute constant, for any $a,b$.

\[\SGNorm{X} \asymp \sup_{p\geq 1} \LpNorm{X}/\sqrt{p} \leq \sup_{p\geq 1} a^{1/p} b^{1-1/p} / \sqrt{p}.\]

Take the log of the right hand side, we want to maximize $\frac{1}{p}\log(b/a)-\frac{1}{2}\log p$ for $p$, giving us $p^*= 2\log(b/a)$.

So

\[\SGNorm{X} \leq \frac{C_0 b (b/a)^{-\frac{1}{2\log(b/a)}}}{\sqrt{2\log(b/a)}} = \frac{C_0 b e^{\log(b/a) -2\log(b/a)}}{\sqrt{2\log(b/a)}} \leq \frac{C_1 b}{\sqrt{2\log(b/a)}} \leq \frac{C b}{\sqrt{\log(2b/a)}},\]

where the factor of 2 inside $\log$ is to ensure the denominator is nonzero when $b=a$.

Let $X_1,X_2,\dots$ be a sequence of subgaussian random variables, not necessarily independent. Then

\[\SGNorm{\sup_k \frac{X_k}{\sqrt{\log 2k}}} \leq C \sup_k \SGNorm{X_k},\]

where $C$ is an absolute constant.

\[\begin{aligned} \Prob{\sup_{k} \frac{X_k}{\sqrt{\log 2k}} \geq t} &\leq \sum_{k=1}^\infty\Prob{\frac{X_k}{\sqrt{\log 2k}} \ge t} \\ & \leq \sum_{k=1}^\infty 2 \exp(-ct^2 \log 2k / \SGNorm{X_k}) \\ & \leq \sum_{k=1}^\infty 2 \exp(-ct^2 \log 2k / \sup_k \SGNorm{X_k}) \\ & = \sum_{k=1}^\infty 2 (2k)^{-ct^2/\sup_k \SGNorm{X_k}} \\ & \leq C \exp(-c\log(2) t^2 / \sup_k \SGNorm{X_k}) \qquad \text{pick $t$ such that the series converge.} \end{aligned}\]

Let $g_,\dots,g_N$ be $N(0,1)$ random variables for some $N \ge 2$, that are not necessarily independent. Then

\[\E\max_{i\le N} X_i \leq \sqrt{2\ln N}, \quad\quad \E\max_{i\le N} |X_i| \leq \sqrt{2\ln 2N}.\]

If they are independent,

\[\E\max_{i\le N} X_i = \sqrt{2\ln N}(1+o(1)), \quad\quad \E\max_{i\le N} |X_i| =\sqrt{2\ln 2N} (1+o(1)).\]

For all $\lambda>0$,

\[\begin{aligned} \E \max_i X_i &= \frac{1}{\lambda}\ln e^{\E \lambda \max_i X_i} \\ &\leq \frac{1}{\lambda} \ln\left(\sum_{i=1}^N \E e^{\lambda X_i} \right)\\ &= \frac{1}{\lambda} \ln \left(N e^{\lambda^2/2}\right) \\ &= \frac{\ln N}{\lambda}+ \frac{\lambda}{2}. \end{aligned}\]

We optimize the bound via $\frac{\ln N}{\lambda} \asymp \frac{\lambda}{2}$, which gives $\lambda = \sqrt{2 \ln N}$ and the desired bound.

The second bound follows the same procedure by considering \(X_1, \dots, X_N, -X_1, \dots, -X_N\). Indeed, \(\vert X_i\vert=\max \{X_i, -X_i\}\).

Proposition 2.1.2 gives

\[\Prob{g\ge t} \ge \frac{t}{t^2+1}\frac{1}{\sqrt{2\pi}}e^{-t^2/2}.\]

Take $t=\sqrt{(1-\varepsilon) 2 \ln N}$.

\[\begin{aligned} \Prob{\max X_i < t} &= \Prob{g < t}^N = (1-\Prob{g \ge t})^N \\ & \leq \left(1-\frac{\sqrt{(1-\varepsilon) 2 \ln N}}{(\sqrt{(1-\varepsilon) 2 \ln N})^2+1}\frac{1}{\sqrt{2\pi}}e^{-(\sqrt{(1-\varepsilon) 2 \ln N})^2/2}\right)^N \\ &= \left(1-\frac{\sqrt{(1-\varepsilon) 2 \ln N}}{(1-\varepsilon) 2 \ln N+1}\frac{1}{\sqrt{2\pi}}N^{-(1-\varepsilon)}\right)^N \\ &\leq \exp\left\{-\frac{\sqrt{(1-\varepsilon) 2 \ln N}}{(1-\varepsilon) 2 \ln N+1}\frac{1}{\sqrt{2\pi}}N^{1-(1-\varepsilon)}\right\} = o(1) \end{aligned}\]

for $\varepsilon\in (0,1)$, since $1-x\leq e^{-x}$. Hence

\[\Prob{\max X_i \ge \sqrt{(1-\varepsilon)2 \ln N}} =1-o(1),\]

and

\[\begin{aligned} \E \max_i X_i &= \E \max_i X_i \boldsymbol{1}\{\max X_i \ge \sqrt{(1-\varepsilon)2 \ln N}\} \\ &\quad\quad \text{ }+ \E \max_i X_i \boldsymbol{1}\{\max X_i < \sqrt{(1-\varepsilon)2 \ln N}\} \\ &\ge \sqrt{(1-\varepsilon)2 \ln N} \Prob{\max X_i \ge \sqrt{(1-\varepsilon)2 \ln N}} \\ &\quad\quad \text{ }+ \E X_i \boldsymbol{1}\{\max X_i < \sqrt{(1-\varepsilon)2 \ln N}\} \\ &\ge \sqrt{(1-\varepsilon)2 \ln N} \Prob{\max X_j \ge \sqrt{(1-\varepsilon)2 \ln N}} \\ &\quad\quad \text{ } - \E |X_j| \boldsymbol{1}\{\max X_i < \sqrt{(1-\varepsilon)2 \ln N}\} \\ &\ge \sqrt{(1-\varepsilon)2 \ln N} \Prob{\max X_j \ge \sqrt{(1-\varepsilon)2 \ln N}} \\ &\quad\quad \text{ } - \Var(X_i) \Prob{\max X_i < \sqrt{(1-\varepsilon)2 \ln N}} \\ & = \sqrt{(1-\varepsilon)2 \ln N} (1-o(1)) - o(1). \end{aligned}\]

Since

\[\E \max_i X_i \ge \sqrt{(1-\varepsilon) 2\ln N}(1+o(1)) \quad \forall \varepsilon>0\; \text{small enough},\]

we have

\[\E \max_i X_i \ge \sqrt{2\ln N}(1+o(1))\]

as $N\to \infty$.

The upper bound is given as before.

For the other bound, we have

\[\E \max_i |X_i| \ge \E \max_i X_i \ge \sqrt{2\ln N}(1+o(1)),\]

and

\[\begin{aligned} \E \max_i |X_i| \leq \sqrt{2 \ln 2N} &= \sqrt{2 \ln 2 + 2\ln N} = \sqrt{2 \ln N} \sqrt{1+\frac{\ln 2}{\ln N}} \\ & \leq \sqrt{2 \ln N}(1 + \sqrt{\frac{\ln 2}{\ln N}})=\sqrt{2 \ln N}(1 + o(1)), \end{aligned}\]

since $\frac{\ln 2}{\ln N} = o(1)$.

If

\[\E \max_{i \leq N}|X_i| \leq K \sqrt{\log N}\quad \text{for any } N=1,2,\dots,\]

where $X_i$ are indpendent copies of $X$, then $X$ is subgaussian and $\SGNorm{X} \leq CK$.

We want to show that

\[\Prob{∣X∣>t}\leq 2\exp(−ct^2/K^2),\]

First,

\[\Prob{\max |X_i| > t} \leq \frac{\E \max_i |X_i|}{t} \leq \frac{K}{t}\sqrt{\log N}.\]

Let $t=2K\sqrt{\log N}$,

\[\Prob{\max |X_i| \le t} =(1-\Prob{|X_i| \ge t})^N\ge \frac{1}{2}.\]

Then

\[\Prob{|X_i| \ge t} \le 1-2^{-1/N} \leq \frac{\ln 2}{N},\]

since $1-e^{-x} \leq x$. Because we pick

\[N \ge e^{\frac{t^2}{4K^2}},\]

we have

\[\Prob{|X_i| \ge t} \leq \ln 2 e^{-\frac{t^2}{4K^2}} \leq 2 e^{-\frac{t^2}{4K^2}}.\]