Strainy: phasing and assembly of strain haplotypes from long-read metagenome sequencing

