Thin shell phenomenon

Exercises from HDP
math
statistics
high-dimensional probability
Published

December 28, 2025

Introduction

Suppose that \(X=(X_1, X_2, \dots, X_n) \in \Rn\) has independent entries with zero mean and unit variance. Since

\[ \E\|X\|^2 = \E \sum X_i^2 = n, \]

as claimed in Prof. Vershynin’s HDP (Vershynin 2018), we should expect \(\Norm{X}\approx \sqrt{n}\). This is made rigorous by the following theorem:

Theorem

Theorem 1 (Theorem 3.1.1 in HDP) Let \(X = (X_1,\dots,X_n) \in \Rn\) be a random vector with independent, subgaussian coordinates \(X_i\) that satisfy \(\E X_i^2 =1\). Then

\[ \SGNorm{\Norm{X} - \sqrt{n}} \leq C K^2, \]

where \(K=\max_i \SGNorm{X_i}\) and \(C\) is an absolute constant.

The subgaussian norm of a subgaussian RV \(X\) is defined as

\[\SGNorm{X} := \inf\{K>0:\ \E \exp(X^2/K^2) \leq 2\}.\]

Remark 3.1.2 in Vershynin (2018) gives an intuitive explanation of the thin shell phenomenon. If \(\Var(X_i^2)\) is bounded uniformly in \(i\), then by independence \(\Var(\Norm{X}^2) = \sum_{i=1}^n \Var(X_i^2) = O(n)\), so \(\Norm{X}^2\) has mean \(n\) and standard deviation \(O(\sqrt{n})\). Consequently, \(\Norm{X}\) should deviate from \(\sqrt{n}\) by only \(O(1)\): a Taylor expansion gives \(\sqrt{1+x}=1+O(x)\) for small \(x\), here \(x = O(1/\sqrt{n})\), so

\[ \sqrt{n \pm O(\sqrt{n})} = \sqrt{n} \sqrt{1 \pm O(1/\sqrt{n})} = \sqrt{n} (1\pm O(1/\sqrt{n})) = \sqrt{n}\pm O(1). \]
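The heuristic above can be checked numerically. The following is a minimal Monte Carlo sketch, assuming standard Gaussian coordinates (a choice made here for illustration, not one fixed by the text); the helper name `norm_samples`, the seed, and the sample sizes are all hypothetical. It uses only the Python standard library.

```python
import math
import random
import statistics

def norm_samples(n, trials, rng):
    """Draw `trials` samples of ||X||_2 for X with i.i.d. N(0, 1) coordinates."""
    return [
        math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)))
        for _ in range(trials)
    ]

rng = random.Random(0)  # fixed seed, purely for reproducibility
for n in (10, 100, 1000):
    samples = norm_samples(n, trials=500, rng=rng)
    shift = statistics.fmean(samples) - math.sqrt(n)  # how far the mean sits from sqrt(n)
    spread = statistics.stdev(samples)                # empirical std of ||X||
    print(f"n={n:5d}  mean(||X||) - sqrt(n) = {shift:+.3f}  std(||X||) = {spread:.3f}")
```

If the heuristic is right, the printed standard deviation should stay of order one as \(n\) grows by two orders of magnitude, while \(\sqrt{n}\) itself grows from about 3 to about 32.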

The following exercises in the book provide more rigorous justification.


Thin shell for unit variance

Theorem

Theorem 2 (Ex 3.1 in HDP) Let \(X = (X_1, \dots, X_n) \in \R^n\) be a random vector with independent, subgaussian coordinates \(X_i\) that satisfy \(\E X_i^2 = 1\). Then

\[ \Var (\|X\|_2) \leq C K^4, \]

where \(K=\max_i \SGNorm{X_i}\) and \(C\) is an absolute constant.

By Theorem 1 and the moment property of subgaussian variables (\(\E Y^2 \lesssim \SGNorm{Y}^2\)) applied to \(Y = \Norm{X} - \sqrt{n}\), we have

\[ \E(\Norm{X} - \sqrt{n})^2 \lesssim \SGNorm{\Norm{X} - \sqrt{n}}^2 \lesssim K^4. \]

Since \(c \mapsto \E(\Norm{X} - c)^2\) is minimized at \(c = \E\Norm{X}\), it follows that

\[ \Var (\Norm{X}) = \E(\Norm{X} - \E \Norm{X})^2 \leq \E(\Norm{X} - \sqrt{n})^2 \lesssim K^4. \]


Thin shell, generalized

Theorem

Theorem 3 (Ex 3.2 in HDP) Let \(X = (X_1, \dots, X_n) \in \R^n\) be a random vector with independent coordinates \(X_i\) that satisfy \(\E X_i^2 = 1\) and \(\E X_i^4 \le K^4\). Show that

\[ \Var (\|X\|_2) \le K^4 \quad \text{and} \quad \sqrt{n} - \frac{K^4}{\sqrt{n}} \le \E\|X\|_2 \le \sqrt{n}. \]

Note that \(\E \Norm{X}^2 = n\) and

\[ \Norm{X} - \sqrt{n} = \frac{(\Norm{X} - \sqrt{n})(\Norm{X} + \sqrt{n})}{(\Norm{X} + \sqrt{n})} = \frac{\Norm{X}^2-n}{\Norm{X}+\sqrt{n}}. \]

So

\[ \begin{aligned} \Var (\Norm{X})&\leq \E( \Norm{X} - \sqrt{n})^2 = \E\left[ \frac{(\Norm{X}^2-n)^2}{(\Norm{X}+\sqrt{n})^2} \right] \leq \frac{1}{n}\E((\Norm{X}^2-n)^2) \\ & =\frac{1}{n} \Var (\Norm{X}^2) = \frac{1}{n} \sum_{i=1}^n \Var(X_i^2) = \frac{1}{n} \sum_{i=1}^n \left(\E X_i^4 - 1\right) \leq K^4 - 1 \leq K^4. \end{aligned} \tag{1}\]

For the upper bound, by Jensen’s inequality, we have

\[ \E \Norm{X} \leq (\E \Norm{X}^2)^{1/2} = \sqrt{n}. \]

For the lower bound, by Equation 1, we have

\[ \E( \Norm{X} - \sqrt{n})^2 = 2n-2\sqrt{n} \E \Norm{X}\leq K^4. \]

Rearranging terms gives \(\E\Norm{X} \ge \sqrt{n} - K^4/(2\sqrt{n})\), which implies the claimed lower bound.
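As a sanity check on Theorem 3, here is a small Monte Carlo sketch. It assumes coordinates uniform on \([-\sqrt{3}, \sqrt{3}]\), an illustrative choice not taken from the text, for which \(\E X_i^2 = 1\) and \(\E X_i^4 = 9/5\), so \(K^4 = 9/5\) is admissible. The helper name `uniform_norms`, the seed, and the sample sizes are hypothetical.

```python
import math
import random
import statistics

def uniform_norms(n, trials, rng):
    """Samples of ||X||_2 where the X_i are i.i.d. uniform on [-sqrt(3), sqrt(3)]."""
    a = math.sqrt(3.0)  # this half-width gives E X_i^2 = 1 and E X_i^4 = 9/5
    return [
        math.sqrt(sum(rng.uniform(-a, a) ** 2 for _ in range(n)))
        for _ in range(trials)
    ]

n, trials = 10, 20000
k4 = 9.0 / 5.0  # E X_i^4 for this distribution, used as K^4
norms = uniform_norms(n, trials, random.Random(0))
mean_norm = statistics.fmean(norms)
var_norm = statistics.variance(norms)
lower = math.sqrt(n) - k4 / math.sqrt(n)

print(f"Var(||X||) is about {var_norm:.3f}; the bound K^4 is {k4:.3f}")
print(f"lower bound {lower:.3f}, E||X|| is about {mean_norm:.3f}, upper bound {math.sqrt(n):.3f}")
```

The estimates are subject to sampling noise, so this only illustrates the inequalities; it does not verify them.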


Thin shell, reversed

Theorem

Theorem 4 (Reverse bound (Ex 3.3 in HDP)) Let \(X = (X_1, \dots, X_n) \in \R^n\) be a random vector with independent coordinates \(X_i\) that satisfy \(\E X_i^2 = 1\), \(\Var(X_i^2) > \alpha\) and \(\E X_i^6 \le \beta\) for some \(\alpha, \beta > 0\). Prove that if \(n\) is large enough (depending on \(\alpha\) and \(\beta\)) then

\[ \Var (\|X\|_2) \ge c\alpha \quad \text{and} \quad \E\|X\|_2 \le \sqrt{n} - \frac{c\alpha}{\sqrt{n}}. \]

Consider a Taylor expansion of \(\sqrt{\cdot}\) around \(1\):

\[ \sqrt{z} = 1 + \frac{z-1}{2} - \frac{(z-1)^2}{8} + \frac{(z-1)^3}{16 \xi^{5/2}}, \]

where \(\xi \in [ 1 \wedge z, 1 \vee z]\).
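The Lagrange form of the remainder can be checked numerically. The sketch below (with hypothetical helper names `remainder` and `lagrange_bounds`) computes the difference between \(\sqrt{z}\) and its second-order Taylor polynomial at \(1\), and compares it with the range of \((z-1)^3/(16\xi^{5/2})\) as \(\xi\) varies between \(1\) and \(z\).

```python
import math

def remainder(z):
    """sqrt(z) minus its second-order Taylor polynomial around 1."""
    return math.sqrt(z) - (1.0 + (z - 1.0) / 2.0 - (z - 1.0) ** 2 / 8.0)

def lagrange_bounds(z):
    """Range of (z-1)^3 / (16 xi^(5/2)) as xi varies over [min(1, z), max(1, z)]."""
    lo_xi, hi_xi = min(1.0, z), max(1.0, z)
    a = (z - 1.0) ** 3 / (16.0 * lo_xi ** 2.5)
    b = (z - 1.0) ** 3 / (16.0 * hi_xi ** 2.5)
    return min(a, b), max(a, b)

for z in (0.25, 0.5, 0.9, 1.1, 2.0, 4.0):
    lo, hi = lagrange_bounds(z)
    r = remainder(z)
    print(f"z={z:4.2f}  remainder={r:+.6f}  range=[{lo:+.6f}, {hi:+.6f}]  inside={lo <= r <= hi}")
```

Note how wide the range becomes for small \(z\): the factor \(\xi^{-5/2}\) blows up as \(\xi \to 0\), which is why the region where \(\Norm{X}^2/n\) is small has to be treated separately in the argument.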

Take \(z:=\Norm{X}^2 / n\) and multiply both sides by \(\sqrt{n}\). For each \(\omega\) in the sample space, we have

\[ \Norm{X}:= \Norm{X(\omega)} = \sqrt{n} + \sqrt{n}\frac{\Norm{X}^2/n-1}{2} -\sqrt{n} \frac{(\Norm{X}^2/n-1)^2}{8} + \sqrt{n}\frac{(\Norm{X}^2/n-1)^3}{16 \xi(X)^{5/2}} \]

for \(\xi(X) \in [ 1 \wedge \Norm{X}^2 / n, 1 \vee \Norm{X}^2 / n]\).

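We first establish a bound on the third moment. By the triangle inequality in \(L^3\) and the assumption \(\E X_i^6 \le \beta\), \[(\E \vert X_i^2-1\vert^3)^{1/3} \leq (\E X_i^6)^{1/3} + 1 \leq \beta^{1/3} + 1,\] so \[\E \vert X_i^2-1\vert^3\leq (\beta^{1/3} + 1)^3.\]

Since the \(X_i^2 - 1\) are independent with mean zero, the Marcinkiewicz–Zygmund inequality gives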

\[ \E \left|\|X\|^2_2 /n - 1\right|^3 = \E \left| \frac{1}{n} \sum_{i=1}^n (X_i^2 - 1)\right|^3 \leq \frac{B_3}{n^3} \E\left[ \left(\sum_{i=1}^n |X_i^2 - 1|^2\right)^{3/2}\right], \]

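for some constant \(B_3>0\). Also, Jensen’s inequality gives the pointwise bound

\[ \begin{aligned} \left(\sum_{i=1}^n |X_i^2 - 1|^2\right)^{3/2} &= n^{3/2}\left(\frac{1}{n}\sum_{i=1}^n |X_i^2 - 1|^2\right)^{3/2} \\ & \leq \frac{n^{3/2}}{n}\sum_{i=1}^n |X_i^2 - 1|^3 = n^{1/2}\sum_{i=1}^n |X_i^2 - 1|^3, \end{aligned} \]

and taking expectations yields

\[ \E\left[\left(\sum_{i=1}^n |X_i^2 - 1|^2\right)^{3/2}\right] \leq n^{1/2}\sum_{i=1}^n \E|X_i^2 - 1|^3 \leq n^{3/2}(\beta^{1/3}+1)^3. \]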

So

\[ \E \left|\|X\|^2_2 /n - 1\right|^3 \leq \frac{B_3}{n^{3/2}} (\beta^{1/3}+1)^3. \]

By the assumption that \(\Var(X_i^2) > \alpha\), we also have

\[ \Var(\Norm{X}^2/n) = \frac{1}{n^2} \sum \Var (X_i^2) \ge \frac{\alpha}{n}. \]

We now bound the expectation of the remainder term. Take \(\varepsilon \in (0,1/2)\), to be determined later, and consider the event \[A=\{\Norm{X}^2 / n > \varepsilon\}.\] On \(A\), we have \(\xi(X) \ge 1 \wedge \Norm{X}^2/n > \varepsilon\). So

\[ \frac{ \Big|\Norm{X}^2/n-1\Big|^3}{16 \xi(X)^{5/2}} \leq \frac{ \Big|\Norm{X}^2/n-1\Big|^3}{16 \varepsilon^{5/2}}. \]

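On \(A^c\), we have \(\Norm{X}^2/n \le \varepsilon < 1\), and \(\xi(X)\) may be arbitrarily close to \(0\), so the denominator cannot be controlled directly. Instead, recall that the remainder term is exactly the difference between \(\sqrt{z}\) and its second-order Taylor polynomial at \(z = \Norm{X}^2/n \in [0,1]\). The triangle inequality then gives

\[ \frac{ \Big|\Norm{X}^2/n-1\Big|^3}{16 \xi(X)^{5/2}} = \left| \sqrt{z} - 1 - \frac{z-1}{2} + \frac{(z-1)^2}{8} \right| \leq 1 + \frac{1}{2} + \frac{1}{8} \leq \frac{1}{16 \varepsilon^{5/2}}, \]

where the last inequality holds whenever \(\varepsilon \le 1/4\) (as will be the case for our choice of \(\varepsilon\)), since then \(1/(16\varepsilon^{5/2}) \ge 2\).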

Also, by Markov’s inequality and the fact that \(\varepsilon < 1/2\),

\[ \P \left[ A^c\ \right] = \P \left[-\Norm{X}^2/n \ge -\varepsilon \right] \leq \P \left[\left\vert\Norm{X}^2/n - 1\right\vert \ge 1 - \varepsilon\right] \leq \frac{\E \left\vert \Norm{X}^2/n-1\right\vert^3}{\vert 1-\varepsilon \vert^3} \leq \frac{\E \left\vert \Norm{X}^2/n-1\right\vert^3}{\varepsilon^3}. \]

Combining the previous results, and writing \(C_1 := B_3(\beta^{1/3}+1)^3\), we have

\[ \begin{aligned} \left\vert\E\left[\frac{ (\Norm{X}^2/n-1)^3}{16 \xi(X)^{5/2}}\right] \right\vert & \leq \E\left[\frac{ \Big|\Norm{X}^2/n-1\Big|^3}{16 |\xi(X)|^{5/2}}\right] \\ & = \E\left[\frac{ \Big|\Norm{X}^2/n-1\Big|^3}{16 |\xi(X)|^{5/2}}\mathbf{1}_A\right] + \E\left[\frac{ \Big|\Norm{X}^2/n-1\Big|^3}{16 |\xi(X)|^{5/2}}\mathbf{1}_{A^c}\right] \\ & \leq \frac{1}{16 \varepsilon^{5/2}} \E\Big|\Norm{X}^2/n-1\Big|^3 + \frac{1}{16 \varepsilon^{5/2}} \P\left[\Norm{X}^2/n \leq \varepsilon\right] \\ & \leq \frac{1}{16 \varepsilon^{5/2}} \E\Big|\Norm{X}^2/n-1\Big|^3 + \frac{1}{16 \varepsilon^{11/2}} \E\Big|\Norm{X}^2/n-1\Big|^3 \\ &\leq \frac{1}{16 \varepsilon^{5/2}} \frac{C_1}{n^{3/2}} + \frac{1}{16 \varepsilon^{11/2}} \frac{C_1}{n^{3/2}}. \end{aligned} \]

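Pick \(\varepsilon = 1/4\). Then the right-hand side above is \(O(n^{-3/2})\), so after multiplying by \(\sqrt{n}\) we have

\[ \sqrt{n}\,\E\left[\frac{ (\Norm{X}^2/n-1)^3}{16 \xi(X)^{5/2}}\right] = O\left(\frac{1}{n}\right) = o\left(\frac{1}{\sqrt{n}}\right). \]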

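Taking expectations in the expansion of \(\Norm{X}\), and using \(\E[\Norm{X}^2/n-1] = 0\) and \(\E(\Norm{X}^2/n-1)^2 = \Var(\Norm{X}^2/n) \ge \alpha/n\), it then follows that

\[ \begin{aligned} \E\Norm{X} &= \sqrt{n} - \frac{\sqrt{n}}{8}\Var(\Norm{X}^2/n) + o\left(\frac{1}{\sqrt{n}}\right) \\ &\leq \sqrt{n} - \frac{\alpha}{8\sqrt{n}} + o\left(\frac{1}{\sqrt{n}}\right) \leq \sqrt{n} - \frac{\alpha}{16\sqrt{n}} \end{aligned} \]

for \(n\) large enough, which is the second claim with \(c = 1/16\). Since \(0 \le \E\Norm{X} \le \sqrt{n} - c\alpha/\sqrt{n}\), we also get

\[ \Var(\Norm{X}) = \E\Norm{X}^2-(\E \Norm{X})^2 \geq n- \left(\sqrt{n} - \frac{c\alpha }{\sqrt{n}}\right)^2 = 2c\alpha - \frac{c^2\alpha^2}{n} \geq c\alpha \]

for \(n \ge c\alpha\).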
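For a concrete instance, consider standard Gaussian coordinates (an illustrative assumption, not part of the exercise). Then \(\Var(X_i^2) = 2\), \(\Norm{X}\) follows a chi distribution with \(n\) degrees of freedom, and \(\E\Norm{X} = \sqrt{2}\,\Gamma((n+1)/2)/\Gamma(n/2)\) is available in closed form. The expansion above predicts \(\sqrt{n}(\sqrt{n} - \E\Norm{X}) \approx \Var(X_i^2)/8 = 1/4\). The sketch below, with a hypothetical helper `chi_mean`, evaluates the exact quantities using log-gamma to avoid overflow.

```python
import math

def chi_mean(n):
    """Exact E||X||_2 for X ~ N(0, I_n): sqrt(2) * Gamma((n+1)/2) / Gamma(n/2)."""
    return math.sqrt(2.0) * math.exp(math.lgamma((n + 1) / 2) - math.lgamma(n / 2))

for n in (10, 100, 1000, 10000):
    mean = chi_mean(n)
    var = n - mean ** 2                         # Var(||X||) = E||X||^2 - (E||X||)^2
    gap = math.sqrt(n) * (math.sqrt(n) - mean)  # compare with Var(X_i^2) / 8 = 1/4
    print(f"n={n:6d}  sqrt(n)*(sqrt(n) - E||X||) = {gap:.5f}  Var(||X||) = {var:.5f}")
```

In this Gaussian case the gap approaches \(1/4\) and the variance approaches \(1/2\), consistent with both a strictly positive variance and a mean strictly below \(\sqrt{n}\) by an amount of order \(1/\sqrt{n}\).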

Remark

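Remark 1 (Why a lower bound on the variance is an essential assumption). Some lower bound on \(\Var(X_i^2)\) is needed: if \(\alpha\) is allowed to be arbitrarily small, say \(\alpha(n)\to 0\), then the lower bound \(c\alpha\) on the variance degenerates as \(n\) grows. In the extreme case of Rademacher coordinates (\(X_i = \pm 1\) with probability \(1/2\) each), \(\Var(X_i^2) = 0\) and \(\Norm{X} = \sqrt{n}\) almost surely, so \(\Var(\Norm{X}) = 0\).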

References

Vershynin, Roman. 2018. High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press. https://doi.org/10.1017/9781108231596.