Collaborative weighting in federated graph neural networks for disease classification with the human-in-the-loop

Graph Neural Networks (GNNs) play a crucial role in biology, especially in biomedicine, by effectively capturing complex relationships within biological systems represented as graphs1. Their ability to model interconnected networks is valuable for integrating diverse biological data types and aids the understanding of molecular mechanisms, disease pathways, and potential therapeutic targets. They also facilitate the discovery of new structural classes, particularly in heterogeneous information networks2,3,4, which is highly relevant, for example, in drug repositioning to discover new indications for existing drugs5. Further examples include identifying new structural classes in the development of antibiotics6 and predicting core gene candidates for complex diseases7.

Network Medicine leverages network science to elucidate disease mechanisms. It employs a variety of analytical methods to construct molecular networks, such as protein-protein interaction and gene regulatory networks, and applies them to omics big data to advance the diagnosis, prognosis, and treatment of complex diseases8,9,10,11. GNNs have shown great promise in predicting interactions, identifying disease subtypes, and leveraging large-scale, multi-modal data in biomedicine12,13,14,15.

At the same time, Federated Learning (FL) enables collaborative, decentralized model training without sharing any sensitive data. Beyond protecting diagnostic data, FL has further advantages: it can improve the reproducibility and reliability of AI models outside the domain they were trained on and can potentially improve results16,17.
In combination with GNNs, this holds great promise for advancing personalized medicine, disease understanding, and drug discovery on a large scale18. Because most existing medical data is hoarded in data silos over privacy concerns and is therefore not fully utilized19, federated learning may offer a solution for the future of digital health20.

Incorporating the Human-in-the-Loop (HITL) in federated learning introduces a critical dimension of human oversight and interaction, enhancing the model’s interpretability, reliability, and adaptability. The benefit of bringing the human into the algorithmic loop has been shown in a wide range of studies21,22,23,24. This integration is particularly important in domains requiring nuanced decision-making, such as healthcare and personalized services, where human expertise and ethical considerations play a significant role. Integrating a human expert into federated learning with GNNs adds an extra layer of refinement and interpretability.

In this work we present a novel approach for disease classification that employs federated GNNs augmented with collaborative weighting, which incorporates the knowledge of human domain experts directly into the deep learning process. The foundational work of Pfeifer et al.25,26 utilized a Protein-Protein Interaction (PPI) network as the underlying structure for the GNNs. In this model, each patient is represented as a unique PPI network, with nodes augmented by specific molecular markers derived from gene expression and DNA methylation data. This modeling approach results in a graph classification problem and can be applied to any binary outcome class, such as distinguishing between healthy patients and those with a specific disease.

In a significant advancement, we employed explainable AI (xAI) methods to dissect the PPI knowledge graph into several subgraphs25,26.
These subgraphs then serve as the foundation for constructing an ensemble classifier. The predictive model operates by majority vote over the predictions derived from these subgraphs, thereby enhancing the robustness and accuracy of disease classification. This enables experts to gain deeper insights into the underlying explanatory factors and thus strengthens conceptual understanding and trust27.

Here, we introduce an advanced collaborative and interactive framework that integrates human expertise into this algorithmic process. This integration enables a more nuanced exploration of the ensemble classifier, giving human experts the capability to adjust the predictive model by experimenting with varying weights assigned to the ensemble subgraphs.

Crucially, this framework empowers experts to examine the significance of specific proteins and genes within the PPI knowledge graph. Leveraging the principles of explainable AI, it facilitates a deeper understanding of molecular interactions and their implications in disease pathology. This approach is particularly valuable because it allows external expert knowledge, which may not be present in the training dataset, to be infused into the network. Including such expert insight, especially regarding the nuances of gene functions and interactions, can significantly enhance the model’s predictive accuracy and reliability. This synergy between algorithmic robustness and human expertise stands to make substantial contributions to digital medicine in general and biomedical research in particular.

Our framework is implemented as an application within the FeatureCloud platform28. The platform consists of a global frontend and backend and a local controller.
It employs Docker to isolate local computational components from sensitive data infrastructures, thus maintaining data integrity. FeatureCloud streamlines the complexities inherent in distributed systems, offering a robust and scalable infrastructure for conducting federated learning analyses across multiple institutions. Moreover, it supports efficient algorithm implementation. A significant aspect of FeatureCloud is its integrated AI store, which serves as a hub for the communal exchange of federated algorithms and thereby enhances collaborative scientific efforts. Central to FeatureCloud’s design is its commitment to data privacy. The platform incorporates privacy-enhancing technologies that safeguard the shared local models, ensuring compliance with the European General Data Protection Regulation (GDPR)29 and thereby fortifying trust in its data management practices.
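The expert-adjustable ensemble vote described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `weighted_majority_vote`, the restriction to non-negative weights, the tie-breaking rule, and the example votes are assumptions made purely for illustration. The text states only that subgraph-based classifiers are combined by majority vote and that experts can vary the weights assigned to the subgraphs.

```python
from typing import Sequence


def weighted_majority_vote(votes: Sequence[int], weights: Sequence[float]) -> int:
    """Combine binary votes (0 or 1) from subgraph classifiers into one label.

    Each vote counts in proportion to its weight. Ties resolve to class 0,
    which is an arbitrary choice made for this sketch.
    """
    if len(votes) != len(weights):
        raise ValueError("need exactly one weight per subgraph classifier")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    # Sum the weights of all classifiers that voted for class 1.
    weight_for_class_1 = sum(w for v, w in zip(votes, weights) if v == 1)
    return 1 if weight_for_class_1 > total / 2 else 0


# Hypothetical votes from three subgraph classifiers for a single patient.
votes = [1, 0, 0]

# Equal weights reproduce the plain (unweighted) majority vote.
uniform = weighted_majority_vote(votes, [1.0, 1.0, 1.0])

# An expert who trusts the first subgraph more can up-weight it.
expert = weighted_majority_vote(votes, [3.0, 1.0, 1.0])
```

With equal weights the two classifiers voting for class 0 win, whereas up-weighting the first subgraph lets its vote for class 1 dominate. This is the kind of what-if exploration the framework is meant to offer a domain expert.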
