Inference and applications of ancestral recombination graphs

Kingman, J. F. C. On the genealogy of large populations. J. Appl. Probab. 19, 27–43 (1982). This paper rigorously derives the standard coalescence process, now known as the Kingman coalescent, and shows that the stochastic process of lines of descent of a population genetic sample converges to a strictly binary tree with exponentially distributed waiting times between coalescence events.
Hudson, R. R. Properties of a neutral allele model with intragenic recombination. Theor. Popul. Biol. 23, 183–201 (1983). This paper describes the CwR and the resulting genealogical structure of ARGs (although it does not use that term).
Fu, Y. X. & Li, W. H. Coalescing into the 21st century: an overview and prospects of coalescent theory. Theor. Popul. Biol. 56, 1–10 (1999).
Rosenberg, N. A. & Nordborg, M. Genealogical trees, coalescent theory and the analysis of genetic polymorphisms. Nat. Rev. Genet. 3, 380–390 (2002).
Wakeley, J. Developments in coalescent theory from single loci to chromosomes. Theor. Popul. Biol. 133, 56–64 (2020).
Hudson, R. R. Testing the constant-rate neutral allele model with protein sequence data. Evolution 37, 203–217 (1983).
Slatkin, M. & Hudson, R. R. Pairwise comparisons of mitochondrial DNA sequences in stable and exponentially growing populations. Genetics 129, 555–562 (1991).
Hudson, R. R., Slatkin, M. & Maddison, W. P. Estimation of levels of gene flow from DNA sequence data. Genetics 132, 583–589 (1992).
Beerli, P. & Felsenstein, J. Maximum likelihood estimation of a migration matrix and effective population sizes in n subpopulations by using a coalescent approach. Proc. Natl Acad. Sci. USA 98, 4563–4568 (2001).
Nielsen, R. & Wakeley, J. Distinguishing migration from isolation: a Markov chain Monte Carlo approach. Genetics 158, 885–896 (2001).
Kaplan, N. L., Hudson, R. R. & Langley, C. H. The “hitchhiking effect” revisited. Genetics 123, 887–899 (1989). This paper derives coalescence models for neutral loci linked to a locus under selection.
Nielsen, R. et al. Genomic scans for selective sweeps using SNP data. Genome Res. 15, 1566–1575 (2005).
Griffiths, R. C. & Tavaré, S. Ancestral inference in population genetics. Stat. Sci. 9, 307–319 (1994).
Wilson, I. J. & Balding, D. J. Genealogical inference from microsatellite data. Genetics 150, 499–510 (1998).
Hey, J. The divergence of chimpanzee species and subspecies as revealed in multipopulation isolation-with-migration analyses. Mol. Biol. Evol. 27, 921–933 (2010).
Gronau, I., Hubisz, M. J., Gulko, B., Danko, C. G. & Siepel, A. Bayesian inference of ancient human demography from individual genome sequences. Nat. Genet. 43, 1031–1034 (2011).
Myers, S., Bottolo, L., Freeman, C., McVean, G. & Donnelly, P. A fine-scale map of recombination rates and hotspots across the human genome. Science 310, 321–324 (2005).
Scally, A. & Durbin, R. Revising the human mutation rate: implications for understanding human evolution. Nat. Rev. Genet. 13, 745–753 (2012).
Nielsen, R. Estimation of population parameters and recombination rates from single nucleotide polymorphisms. Genetics 154, 931–942 (2000).
Adams, A. M. & Hudson, R. R. Maximum-likelihood estimation of demographic parameters using the frequency spectrum of unlinked single-nucleotide polymorphisms. Genetics 168, 1699–1712 (2004).
Garrigan, D. Composite likelihood estimation of demographic parameters. BMC Genet. 10, 72 (2009).
Nielsen, R. et al. Darwinian and demographic forces affecting human protein coding genes. Genome Res. 19, 838–849 (2009).
Gutenkunst, R. N., Hernandez, R. D., Williamson, S. H. & Bustamante, C. D. Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data. PLoS Genet. 5, e1000695 (2009).
Excoffier, L., Dupanloup, I., Huerta-Sánchez, E., Sousa, V. C. & Foll, M. Robust demographic inference from genomic and SNP data. PLoS Genet. 9, e1003905 (2013).
Beaumont, M. A., Zhang, W. & Balding, D. J. Approximate Bayesian computation in population genetics. Genetics 162, 2025–2035 (2002).
Li, H. & Durbin, R. Inference of human population history from individual whole-genome sequences. Nature 475, 493–496 (2011).
Rasmussen, M. D., Hubisz, M. J., Gronau, I. & Siepel, A. Genome-wide inference of ancestral recombination graphs. PLoS Genet. 10, e1004342 (2014). This paper describes the first method for full probabilistic inferences of ARGs (ARGweaver).
Griffiths, R. C. & Marjoram, P. An ancestral recombination graph. Inst. Math. Appl. 87, 257 (1997). This paper coins the term ARG and provides a rigorous derivation of the CwR.
Wiuf, C. & Hein, J. Recombination as a point process along sequences. Theor. Popul. Biol. 55, 248–259 (1999).
McVean, G. A. T. & Cardin, N. J. Approximating the coalescent with recombination. Phil. Trans. R. Soc. B 360, 1387–1393 (2005).
Marjoram, P. & Wall, J. D. Fast “coalescent” simulation. BMC Genet. 7, 16 (2006).
Wilton, P. R., Carmi, S. & Hobolth, A. The SMC’ is a highly accurate approximation to the ancestral recombination graph. Genetics 200, 343–355 (2015).
Wong, Y. et al. A general and efficient representation of ancestral recombination graphs. Genetics 228, iyae100 (2024).
Minichiello, M. J. & Durbin, R. Mapping trait loci by use of inferred ancestral recombination graphs. Am. J. Hum. Genet. 79, 910–922 (2006).
Speidel, L., Forest, M., Shi, S. & Myers, S. R. A method for genome-wide genealogy estimation for thousands of samples. Nat. Genet. 51, 1321–1329 (2019). This paper presents the popular ARG inference method Relate.
Mirzaei, S. & Wu, Y. RENT+: an improved method for inferring local genealogical trees from haplotypes with recombination. Bioinformatics 33, 1021–1030 (2017).
Heine, K., Beskos, A., Jasra, A., Balding, D. & De Iorio, M. Bridging trees for posterior inference on ancestral recombination graphs. Proc. R. Soc. A. 474, 20180568 (2018).
Kelleher, J. et al. Inferring whole-genome histories in large population datasets. Nat. Genet. 51, 1330–1338 (2019). This paper presents the popular ARG inference method tsinfer, which is applicable to biobank-scale data.
Wohns, A. W. et al. A unified genealogy of modern and ancient genomes. Science 375, eabi8264 (2022).
Hubisz, M. J., Williams, A. L. & Siepel, A. Mapping gene flow between ancient hominins through demography-aware inference of the ancestral recombination graph. PLoS Genet. 16, e1008895 (2020).
Schaefer, N. K., Shapiro, B. & Green, R. E. An ancestral recombination graph of human, Neanderthal, and Denisovan genomes. Sci. Adv. 7, eabc0776 (2021).
Ignatieva, A., Lyngsø, R. B., Jenkins, P. A. & Hein, J. KwARG: parsimonious reconstruction of ancestral recombination graphs with recurrent mutation. Bioinformatics 37, 3277–3284 (2021).
Mahmoudi, A., Koskela, J., Kelleher, J., Chan, Y.-B. & Balding, D. Bayesian inference of ancestral recombination graphs. PLoS Comput. Biol. 18, e1009960 (2022).
Zhang, B. C., Biddanda, A., Gunnarsson, Á. F., Cooper, F. & Palamara, P. F. Biobank-scale inference of ancestral recombination graphs enables genealogical analysis of complex traits. Nat. Genet. 55, 768–776 (2023).
Deng, Y., Nielsen, R. & Song, Y. S. Robust and accurate Bayesian inference of genome-wide genealogies for large samples. Preprint at bioRxiv https://doi.org/10.1101/2024.03.16.585351 (2024).
Hudson, R. R. Generating samples under a Wright–Fisher neutral model of genetic variation. Bioinformatics 18, 337–338 (2002).
Excoffier, L. & Foll, M. Fastsimcoal: a continuous-time coalescent simulator of genomic diversity under arbitrarily complex evolutionary scenarios. Bioinformatics 27, 1332–1334 (2011).
Kelleher, J., Etheridge, A. M. & McVean, G. Efficient coalescent simulation and genealogical analysis for large sample sizes. PLoS Comput. Biol. 12, e1004842 (2016).
Kelleher, J., Thornton, K. R., Ashander, J. & Ralph, P. L. Efficient pedigree recording for fast population genetics simulation. PLoS Comput. Biol. 14, e1006581 (2018).
Baumdicker, F. et al. Efficient ancestry and mutation simulation with msprime 1.0. Genetics 220, iyab229 (2022).
Brandt, D. Y. C., Wei, X., Deng, Y., Vaughn, A. H. & Nielsen, R. Evaluation of methods for estimating coalescence times using ancestral recombination graphs. Genetics 221, iyac044 (2022).
Peng, D., Mulder, O. J. & Edge, M. D. Evaluating ARG-estimation methods in the context of estimating population-mean polygenic score histories. Preprint at bioRxiv https://doi.org/10.1101/2024.05.24.595829 (2024).
Deng, Y., Song, Y. S. & Nielsen, R. The distribution of waiting distances in ancestral recombination graphs. Theor. Popul. Biol. 141, 34–43 (2021).
1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).
Sudlow, C. et al. UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).
Fan, C. et al. A likelihood-based framework for demographic inference from genealogical trees. Preprint at bioRxiv https://doi.org/10.1101/2023.10.10.561787 (2023).
Pearson, A. & Durbin, R. Local ancestry inference for complex population histories. Preprint at bioRxiv https://doi.org/10.1101/2023.03.06.529121 (2023).
Irving-Pease, E. K. et al. The selection landscape and genetic legacy of ancient Eurasians. Nature 625, 312–320 (2024).
Coop, G. & Griffiths, R. C. Ancestral inference on gene trees under selection. Theor. Popul. Biol. 66, 219–232 (2004).
Hejase, H. A., Dukler, N. & Siepel, A. From summary statistics to gene trees: methods for inferring positive selection. Trends Genet. 36, 243–258 (2020).
Stern, A. J., Wilton, P. R. & Nielsen, R. An approximate full-likelihood method for inferring selection and allele frequency trajectories from DNA sequence data. PLoS Genet. 15, e1008384 (2019). This paper demonstrates how ARGs can be used to infer selection.
Vaughn, A. H. & Nielsen, R. Fast and accurate estimation of selection coefficients and allele histories from ancient and modern DNA. Mol. Biol. Evol. 41, msae156 (2024).
Hejase, H. A., Mo, Z., Campagna, L. & Siepel, A. A deep-learning approach for inference of selective sweeps from the ancestral recombination graph. Mol. Biol. Evol. 39, msab332 (2022).
Mo, Z. & Siepel, A. Domain-adaptive neural networks improve supervised machine learning based on simulated population genetic data. PLoS Genet. 19, e1011032 (2023).
Stern, A. J., Speidel, L., Zaitlen, N. A. & Nielsen, R. Disentangling selection on genetically correlated polygenic traits via whole-genome genealogies. Am. J. Hum. Genet. 108, 219–239 (2021).
Edge, M. D. & Coop, G. Reconstructing the history of polygenic scores using coalescent trees. Genetics 211, 235–262 (2019).
Osmond, M. M. & Coop, G. Estimating dispersal rates and locating genetic ancestors with genome-wide genealogies. Preprint at bioRxiv https://doi.org/10.1101/2021.07.13.452277 (2021).
Grundler, M. C., Terhorst, J. & Bradburd, G. S. A geographic history of human genetic ancestry. Preprint at bioRxiv https://doi.org/10.1101/2024.03.27.586858 (2024).
Deraje, P., Kitchens, J., Coop, G. & Osmond, M. M. Inferring the geographic history of recombinant lineages using the full ancestral recombination graph. Preprint at bioRxiv https://doi.org/10.1101/2024.04.10.588900 (2024).
Gao, Z., Zhang, Y., Cramer, N., Przeworski, M. & Moorjani, P. Limited role of generation time changes in driving the evolution of the mutation spectrum in humans. eLife 12, e81188 (2023).
Albers, P. K. & McVean, G. Dating genomic variants and shared ancestry in population-scale sequencing data. PLoS Biol. 18, e3000586 (2020).
Wang, R. J., Al-Saffar, S. I., Rogers, J. & Hahn, M. W. Human generation times across the past 250,000 years. Sci. Adv. 9, eabm7047 (2023).
Ragsdale, A. P. & Thornton, K. R. Multiple sources of uncertainty confound inference of historical human generation times. Mol. Biol. Evol. 40, msad160 (2023).
Huang, Z., Kelleher, J., Chan, Y.-B. & Balding, D. J. Estimating evolutionary and demographic parameters via ARG-derived IBD. Preprint at bioRxiv https://doi.org/10.1101/2024.03.07.583855 (2024).
Ignatieva, A., Favero, M., Koskela, J., Sant, J. & Myers, S. R. The distribution of branch duration and detection of inversions in ancestral recombination graphs. Preprint at bioRxiv https://doi.org/10.1101/2023.07.11.548567 (2023).
Speidel, L. et al. High-resolution genomic ancestry reveals mobility in early medieval Europe. Preprint at bioRxiv https://doi.org/10.1101/2024.03.15.585102 (2024).
Tagami, D., Bisschop, G. & Kelleher, J. tstrait: a quantitative trait simulator for ancestral recombination graphs. Preprint at bioRxiv https://doi.org/10.1101/2024.03.13.584790 (2024).
Link, V. et al. Tree-based QTL mapping with expected local genetic relatedness matrices. Am. J. Hum. Genet. 110, 2077–2091 (2023).
Salehi Nowbandegani, P. et al. Extremely sparse models of linkage disequilibrium in ancestrally diverse association studies. Nat. Genet. 55, 1494–1502 (2023).
Tsambos, G., Kelleher, J., Ralph, P., Leslie, S. & Vukcevic, D. link-ancestors: fast simulation of local ancestry with tree sequence software. Bioinform. Adv. 3, vbad163 (2023).
Haller, B. C. & Messer, P. W. SLiM 3: forward genetic simulations beyond the Wright–Fisher model. Mol. Biol. Evol. 36, 632–637 (2019).
Tokdar, S. T. & Kass, R. E. Importance sampling: a review. Wiley Interdiscip. Rev. Comput. Stat. 2, 54–60 (2010).
Hammersley, J. M. & Morton, K. W. Poor man’s Monte Carlo. J. R. Stat. Soc. Ser. B Stat. Methodol. 16, 23–38 (1954).
Rosenbluth, M. N. & Rosenbluth, A. W. Monte Carlo calculation of the average extension of molecular chains. J. Chem. Phys. 23, 356–359 (1955).
Kuhner, M. K. LAMARC 2.0: maximum likelihood and Bayesian estimation of population parameters. Bioinformatics 22, 768–770 (2006).
Wang, Y. & Rannala, B. Bayesian inference of fine-scale recombination rates using population genomic data. Phil. Trans. R. Soc. B 363, 3921–3930 (2008).
Wang, Y. & Rannala, B. Population genomic inference of recombination rates and hotspots. Proc. Natl Acad. Sci. USA 106, 6215–6219 (2009).
Vaughan, T. G. et al. Inferring ancestral recombination graphs from bacterial genomic data. Genetics 205, 857–870 (2017).
Ségurel, L. et al. The ABO blood group is a trans-species polymorphism in primates. Proc. Natl Acad. Sci. USA 109, 18493–18498 (2012).
Enattah, N. S. et al. Identification of a variant associated with adult-type hypolactasia. Nat. Genet. 30, 233–237 (2002).
Bersaglieri, T. et al. Genetic signatures of strong recent positive selection at the lactase gene. Am. J. Hum. Genet. 74, 1111–1120 (2004).
Chin, E. L. et al. Association of lactase persistence genotypes (rs4988235) and ethnicity with dairy intake in a healthy U.S. population. Nutrients 11, 1860 (2019).
Fortier, A. L. & Pritchard, J. K. Ancient trans-species polymorphism at the major histocompatibility complex in primates. Preprint at bioRxiv https://doi.org/10.1101/2022.06.28.497781 (2022).
Azevedo, L., Serrano, C., Amorim, A. & Cooper, D. N. Trans-species polymorphism in humans and the great apes is generally maintained by balancing selection that modulates the host immune response. Hum. Genomics 9, 21 (2015).
Byrska-Bishop, M. et al. High-coverage whole-genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios. Cell 185, 3426–3440.e19 (2022).
Patterson, N., Richter, D. J., Gnerre, S., Lander, E. S. & Reich, D. Genetic evidence for complex speciation of humans and chimpanzees. Nature 441, 1103–1108 (2006).
Mallick, S. et al. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature 538, 201–206 (2016).
Amos, W. & Hoffman, J. I. Evidence that two main bottleneck events shaped modern human genetic diversity. Proc. Biol. Sci. 277, 131–137 (2010).
Kittles, R. A. et al. Dual origins of Finns revealed by Y chromosome haplotype variation. Am. J. Hum. Genet. 62, 1171–1179 (1998).
