2026
arXiv preprint arXiv:2605.08069
Ling, Wanyi; Li, Sida; Guan, Junming; Ignatiadis, Nikolaos
Abstract
We study methods for simultaneous analysis of many noisy and biased estimates, each paired with an even noisier estimate of its own bias. The analyst's goal is to construct short calibrated intervals for each parameter. The standard debiasing approach, which subtracts the bias estimate from each biased estimate, inflates variance and yields long intervals. In this paper, we propose an empirical Bayes rebiasing strategy that starts from the fully debiased estimates and learns from data how much bias to reintroduce by estimating the unknown bias distribution. We provide convergence rates for the coverage of our intervals when the bias distribution is estimated using nonparametric maximum likelihood. Furthermore, we demonstrate substantial precision gains in prediction-powered inference, including pairwise LLM win-rate evaluations, as well as for inference of direct genetic effects in family-based GWAS.
DOI
10.48550/arXiv.2605.08069
BibTeX
@article{Ling2026-eb-rebiasing,
title = {Empirical {Bayes} Rebiasing},
author = {Ling, Wanyi and Li, Sida and Guan, Junming and Ignatiadis, Nikolaos},
journal = {arXiv preprint arXiv:2605.08069},
year = {2026},
doi = {10.48550/arXiv.2605.08069},
url = {https://arxiv.org/abs/2605.08069},
abstract = {We study methods for simultaneous analysis of many noisy and biased estimates, each paired with an even noisier estimate of its own bias. The analyst's goal is to construct short calibrated intervals for each parameter. The standard debiasing approach, which subtracts the bias estimate from each biased estimate, inflates variance and yields long intervals. In this paper, we propose an empirical Bayes rebiasing strategy that starts from the fully debiased estimates and learns from data how much bias to reintroduce by estimating the unknown bias distribution. We provide convergence rates for the coverage of our intervals when the bias distribution is estimated using nonparametric maximum likelihood. Furthermore, we demonstrate substantial precision gains in prediction-powered inference, including pairwise LLM win-rate evaluations, as well as for inference of direct genetic effects in family-based GWAS.}
}
2025
arXiv preprint arXiv:2512.13622
Chen, Hunter; Guan, Junming; van Zwet, Erik; Ignatiadis, Nikolaos
Abstract
We develop a statistical framework for empirical Bayes learning from selectively reported confidence intervals, and apply it to provide context for interpreting results published in MEDLINE abstracts. We use a collection of 326,060 z-scores from MEDLINE abstracts (2000–2018) as the input for an empirical Bayes analysis, with publication bias as a key methodological challenge. We address publication bias through a selective tilting approach that extends empirical Bayes confidence intervals to truncated sampling. Our framework provides coverage guarantees for functionals including posterior estimands describing idealized replications and the symmetrized posterior mean, which we justify decision-theoretically as optimal among sign-equivariant (odd) estimators.
DOI
10.48550/arXiv.2512.13622
BibTeX
@article{Chen2025-selective-ci,
title = {Empirical {Bayes} learning from selectively reported confidence intervals},
author = {Chen, Hunter and Guan, Junming and van Zwet, Erik and Ignatiadis, Nikolaos},
journal = {arXiv preprint arXiv:2512.13622},
year = {2025},
doi = {10.48550/arXiv.2512.13622},
url = {https://arxiv.org/abs/2512.13622},
abstract = {We develop a statistical framework for empirical Bayes learning from selectively reported confidence intervals, and apply it to provide context for interpreting results published in MEDLINE abstracts. We use a collection of 326,060 z-scores from MEDLINE abstracts (2000--2018) as the input for an empirical Bayes analysis, with publication bias as a key methodological challenge. We address publication bias through a selective tilting approach that extends empirical Bayes confidence intervals to truncated sampling. Our framework provides coverage guarantees for functionals including posterior estimands describing idealized replications and the symmetrized posterior mean, which we justify decision-theoretically as optimal among sign-equivariant (odd) estimators.}
}
2025
Nature Genetics
Guan, Junming; Tan, Tammy; Nehzati, Seyed Moeen; Bennett, Michael; Turley, Patrick; Benjamin, Daniel J.; Young, Alexander Strudwick
Abstract
Family-based GWAS can estimate direct genetic effects while reducing confounding. This paper introduces unified and robust estimators that increase effective sample size by incorporating singletons or genetically diverse samples, evaluates their bias-variance tradeoffs, and implements the methods in snipar.
DOI
10.1038/s41588-025-02118-0
BibTeX
@article{Guan2025-vd,
title = {Family-based genome-wide association study designs for increased power and robustness},
author = {Guan, Junming and Tan, Tammy and Nehzati, Seyed Moeen and Bennett, Michael and Turley, Patrick and Benjamin, Daniel J. and Young, Alexander Strudwick},
journal = {Nature Genetics},
volume = {57},
number = {4},
pages = {1044--1052},
year = {2025},
doi = {10.1038/s41588-025-02118-0},
url = {https://www.nature.com/articles/s41588-025-02118-0},
abstract = {Family-based GWAS can estimate direct genetic effects while reducing confounding. This paper introduces unified and robust estimators that increase effective sample size by incorporating singletons or genetically diverse samples, evaluates their bias-variance tradeoffs, and implements the methods in snipar.}
}
2024
medRxiv
Tan, Tammy; Jayashankar, Hariharan; Guan, Junming; Nehzati, Seyed Moeen; Mir, Mahdi; Bennett, Michael; Agerbo, Esben; Ahlskog, Rafael; Pinto de Andrade Anapaz, Ville; Asvold, Bjorn Olav; Benonisdottir, Stefania; Bhatta, Laxmi; Boomsma, Dorret I.; Brumpton, Ben; Campbell, Archie; Chabris, Christopher F.; Cheesman, Rosa; Chen, Zhengming; et al.
Abstract
This family-based GWAS meta-analysis studies how gene-environment correlation, population stratification, and assortative mating affect genetic associations. Across 34 phenotypes from 17 cohorts, the analysis compares direct genetic effects with population associations and shows that non-direct components can substantially alter heritability and genetic-correlation estimates.
DOI
10.1101/2024.10.01.24314703
BibTeX
@article{Tan2024-jf,
title = {Family-{GWAS} reveals effects of environment and mating on genetic associations},
author = {Tan, Tammy and Jayashankar, Hariharan and Guan, Junming and Nehzati, Seyed Moeen and Mir, Mahdi and Bennett, Michael and Agerbo, Esben and Ahlskog, Rafael and Pinto de Andrade Anapaz, Ville and Asvold, Bjorn Olav and Benonisdottir, Stefania and Bhatta, Laxmi and Boomsma, Dorret I. and Brumpton, Ben and Campbell, Archie and Chabris, Christopher F. and Cheesman, Rosa and Chen, Zhengming and others},
journal = {medRxiv},
year = {2024},
doi = {10.1101/2024.10.01.24314703},
url = {https://www.medrxiv.org/content/10.1101/2024.10.01.24314703v1},
abstract = {This family-based GWAS meta-analysis studies how gene-environment correlation, population stratification, and assortative mating affect genetic associations. Across 34 phenotypes from 17 cohorts, the analysis compares direct genetic effects with population associations and shows that non-direct components can substantially alter heritability and genetic-correlation estimates.}
}