Deep learning prediction of glycopeptide tandem mass spectra powers glycoproteomics

