Empirical Bayes Confidence Intervals

Constructing confidence intervals for EB estimates
empirical bayes
bayesian statistics
statistics
math
Author
Affiliation
Published

April 22, 2025

In progess. This note summarizes the methods proposed by (Ignatiadis and Wager 2022).

Setup

Suppose we have following model

\[ \begin{equation} Z_i = \mu_i + \epsilon_i, \hspace{1cm} \mu_i \stackrel{iid}{\sim} G\in \mathcal{G}, \hspace{1cm} \epsilon_i \stackrel{iid}{\sim} \mathcal{N}(0,1), \end{equation} \]

for some prior class \(\mathcal{G}\). We are interested in estimating \(\theta_G(z)=\mathbb{E}_G[h(\mu)\mid Z=z]\) for some function \(h\) and forming confidence intervals for the estimates. Ignadiatis and Wager propose two approaches for this task: \(F\)-localization and AMARI.

\(F\)-localization

Idea: find a level-\(\alpha\) set of distribution functions \(\mathcal{F}_n(\alpha)\) for the true marginal distribution function \(F_G\), i.e.,

\[ \lim \inf_{n\to \infty} \mathbb{P}_G\Big[ F_G \in \mathcal{F}_n(\alpha)\Big] \geq 1-\alpha. \]

For \(z\) of interest, identify all \(\hat{G}\) such that \(F_{\widehat{G}}\) lies inside \[\mathcal{F}_n(\alpha)\]. Now that we have a set of \(\theta_{\widehat{G}}(z)\)’s, we can simply take and maximum and minimum to form a confidence band.

AMARI

Prerequisite: bias-aware confidence intervals

Suppose we have some estimate \(\hat{m}\) of some functional \(m \in \mathcal{M}\) with standard error \({se}(\hat{m}(x_0))\). Denote the worse case bias by \(B = \text{sup}_{m \in \mathcal{M}}\|\hat{m}(x_0) - m(x_0)\|\). Then a naive confidence interval for \(m(x_0)\) is

\[ \hat{m}(x_0) \pm (B + t_{1-\alpha/2} se(\hat{m}(x_0))). \]

To make the length of the CI as short as possible, a better one is given by

\[ \hat{m}(x_0) \pm t_\alpha (se(\hat{m}(x_0)), B), \]

where

\[ t_\alpha (se(\hat{m}(x_0)), B) = \inf \big\{ t > 0: \forall |b| \leq B: \mathbb{P}[|Z_b| \leq t] \geq \alpha\big\}, \]

and \(Z_b \sim \mathcal{N}(b, se(\hat{m}(x_0))^2)\).

Non-identifiability in the Bernoulli Model

Consider the model \(Z_i \mid \mu_i \sim \text{Bernoulli}(\mu_i)\), and \(\mu_i\sim G\in\mathcal{P}([0,1])\). Then we have \(p(Z_i \mid \mu_i) = \mu_i^{z_i}(1-\mu_i)^{1-z_i}\), \(f_G(1) = \int \mu dG(\mu)\), and \(f_G(0) = 1 - f_G(1)\). So the marginal distribution \(f_G\) of \(Z\) is determined by \(f_G(1)\).

The second moment \(L(G)=\int \mu^2 dG(\mu)\) is not point-identified. Different priors \(G\) with the same \(f_G(1)\), $\(L(G)\) could have different \(L(G)\). The maximum is attained at \(G\) such that \(\PP_G[\\{1\\}]=f_G(1)\) and \(\PP_G[\{0\}] = f_G(0)\). In this case, \(L(G) = \text{Var}_G(\mu)+(f_G(1))^2 = f_G(1)\). The minimum is attained at \(\PP_G[\\{f_G(1)\\}]=1\), with \(L(G) = f_G(1)^2\).

References

Ignatiadis, Nikolaos, and Stefan Wager. 2022. “Confidence Intervals for Nonparametric Empirical Bayes Analysis.” J. Am. Stat. Assoc. 117 (539): 1149–66.