Publications

2026 arXiv preprint arXiv:2605.08069

Empirical Bayes Rebiasing

Wanyi Ling, Sida Li, Junming Guan, Nikolaos Ignatiadis

Abstract

We study methods for simultaneous analysis of many noisy and biased estimates, each paired with an even noisier estimate of its own bias. The analyst's goal is to construct short calibrated intervals for each parameter. The standard debiasing approach, which subtracts the bias estimate from each biased estimate, inflates variance and yields long intervals. In this paper, we propose an empirical Bayes rebiasing strategy that starts from the fully debiased estimates and learns from data how much bias to reintroduce by estimating the unknown bias distribution. We provide convergence rates for the coverage of our intervals when the bias distribution is estimated using nonparametric maximum likelihood. Furthermore, we demonstrate substantial precision gains in prediction-powered inference, including pairwise LLM win-rate evaluations, as well as for inference of direct genetic effects in family-based GWAS.

DOI

10.48550/arXiv.2605.08069

BibTeX
@article{Ling2026-eb-rebiasing,
  title = {Empirical {Bayes} Rebiasing},
  author = {Ling, Wanyi and Li, Sida and Guan, Junming and Ignatiadis, Nikolaos},
  journal = {arXiv preprint arXiv:2605.08069},
  year = {2026},
  doi = {10.48550/arXiv.2605.08069},
  url = {https://arxiv.org/abs/2605.08069},
  abstract = {We study methods for simultaneous analysis of many noisy and biased estimates, each paired with an even noisier estimate of its own bias. The analyst's goal is to construct short calibrated intervals for each parameter. The standard debiasing approach, which subtracts the bias estimate from each biased estimate, inflates variance and yields long intervals. In this paper, we propose an empirical Bayes rebiasing strategy that starts from the fully debiased estimates and learns from data how much bias to reintroduce by estimating the unknown bias distribution. We provide convergence rates for the coverage of our intervals when the bias distribution is estimated using nonparametric maximum likelihood. Furthermore, we demonstrate substantial precision gains in prediction-powered inference, including pairwise LLM win-rate evaluations, as well as for inference of direct genetic effects in family-based GWAS.}
}
2025 arXiv preprint arXiv:2512.13622

Empirical Bayes learning from selectively reported confidence intervals

Hunter Chen, Junming Guan, Erik van Zwet, Nikolaos Ignatiadis

Abstract

We develop a statistical framework for empirical Bayes learning from selectively reported confidence intervals, and apply it to provide context for interpreting results published in MEDLINE abstracts. We use a collection of 326,060 z-scores from MEDLINE abstracts (2000-2018) as the input for an empirical Bayes analysis, with publication bias as a key methodological challenge. We address publication bias through a selective tilting approach that extends empirical Bayes confidence intervals to truncated sampling. Our framework provides coverage guarantees for functionals including posterior estimands describing idealized replications and the symmetrized posterior mean, which we justify decision-theoretically as optimal among sign-equivariant (odd) estimators.

DOI

10.48550/arXiv.2512.13622

BibTeX
@article{Chen2025-selective-ci,
  title = {Empirical {Bayes} learning from selectively reported confidence intervals},
  author = {Chen, Hunter and Guan, Junming and van Zwet, Erik and Ignatiadis, Nikolaos},
  journal = {arXiv preprint arXiv:2512.13622},
  year = {2025},
  doi = {10.48550/arXiv.2512.13622},
  url = {https://arxiv.org/abs/2512.13622},
  abstract = {We develop a statistical framework for empirical Bayes learning from selectively reported confidence intervals, and apply it to provide context for interpreting results published in MEDLINE abstracts. We use a collection of 326,060 z-scores from MEDLINE abstracts (2000-2018) as the input for an empirical Bayes analysis, with publication bias as a key methodological challenge. We address publication bias through a selective tilting approach that extends empirical Bayes confidence intervals to truncated sampling. Our framework provides coverage guarantees for functionals including posterior estimands describing idealized replications and the symmetrized posterior mean, which we justify decision-theoretically as optimal among sign-equivariant (odd) estimators.}
}
2025 Nature Genetics

Family-based genome-wide association study designs for increased power and robustness

Junming Guan, Tammy Tan, Seyed Moeen Nehzati, Michael Bennett, Patrick Turley, Daniel J. Benjamin, Alexander Strudwick Young

Abstract

Family-based GWAS can estimate direct genetic effects while reducing confounding. This paper introduces unified and robust estimators that increase effective sample size by incorporating singletons or genetically diverse samples, evaluates their bias-variance tradeoffs, and implements the methods in snipar.

DOI

10.1038/s41588-025-02118-0

BibTeX
@article{Guan2025-vd,
  title = {Family-based genome-wide association study designs for increased power and robustness},
  author = {Guan, Junming and Tan, Tammy and Nehzati, Seyed Moeen and Bennett, Michael and Turley, Patrick and Benjamin, Daniel J. and Young, Alexander Strudwick},
  journal = {Nature Genetics},
  volume = {57},
  number = {4},
  pages = {1044--1052},
  year = {2025},
  doi = {10.1038/s41588-025-02118-0},
  url = {https://www.nature.com/articles/s41588-025-02118-0},
  abstract = {Family-based GWAS can estimate direct genetic effects while reducing confounding. This paper introduces unified and robust estimators that increase effective sample size by incorporating singletons or genetically diverse samples, evaluates their bias-variance tradeoffs, and implements the methods in snipar.}
}
2024 medRxiv

Family-GWAS reveals effects of environment and mating on genetic associations

Tammy Tan, Hariharan Jayashankar, Junming Guan, Seyed Moeen Nehzati, Mahdi Mir, Michael Bennett, Esben Agerbo, Rafael Ahlskog, Ville Pinto de Andrade Anapaz, Bjorn Olav Asvold, Stefania Benonisdottir, Laxmi Bhatta, Dorret I. Boomsma, Ben Brumpton, Archie Campbell, Christopher F. Chabris, Rosa Cheesman, Zhengming Chen, and others

Abstract

This family-based GWAS meta-analysis studies how gene-environment correlation, population stratification, and assortative mating affect genetic associations. Across 34 phenotypes from 17 cohorts, the analysis compares direct genetic effects with population associations and shows that non-direct components can substantially alter heritability and genetic-correlation estimates.

DOI

10.1101/2024.10.01.24314703

BibTeX
@article{Tan2024-jf,
  title = {{Family-GWAS} reveals effects of environment and mating on genetic associations},
  author = {Tan, Tammy and Jayashankar, Hariharan and Guan, Junming and Nehzati, Seyed Moeen and Mir, Mahdi and Bennett, Michael and Agerbo, Esben and Ahlskog, Rafael and Pinto de Andrade Anapaz, Ville and Asvold, Bjorn Olav and Benonisdottir, Stefania and Bhatta, Laxmi and Boomsma, Dorret I. and Brumpton, Ben and Campbell, Archie and Chabris, Christopher F. and Cheesman, Rosa and Chen, Zhengming and others},
  journal = {medRxiv},
  year = {2024},
  doi = {10.1101/2024.10.01.24314703},
  url = {https://www.medrxiv.org/content/10.1101/2024.10.01.24314703v1},
  abstract = {This family-based GWAS meta-analysis studies how gene-environment correlation, population stratification, and assortative mating affect genetic associations. Across 34 phenotypes from 17 cohorts, the analysis compares direct genetic effects with population associations and shows that non-direct components can substantially alter heritability and genetic-correlation estimates.}
}