Adding metabolic tasks to human GEM models to improve the study of gene targets and their associated toxicities

Genome-scale metabolic models

A metabolic network can be represented as a triplet (M, R, S) consisting of a set of metabolites (M), a set of reactions (R), and a stoichiometric matrix (S). If the network has m metabolites and n reactions, the stoichiometric matrix, $S\in M_{m \times n}(\mathbb {R})$, establishes the link between the metabolites and the reactions in the network.

Each network state is represented by a vector $v\in \mathbb{R}^n$, where $v_i$ denotes the activity level of reaction $r_i$. The variation in the metabolite concentrations, x, is summarized in Equation (1):

$$\begin{aligned} \frac{dx}{dt}=S\cdot v \end{aligned}\qquad (1)$$
The network is assumed to be in a steady state, which means that the concentrations of internal metabolites remain constant. This constraint is expressed in Equation (2), where the stoichiometric matrix S includes only internal metabolites as rows:

$$\begin{aligned} S\cdot v=0 \end{aligned}\qquad (2)$$
Each reaction $r_i$ in R has a lower bound ($l_i$) and an upper bound ($u_i$) on its rate $v_i$, which constrain any flux vector (Equation 3):

$$\begin{aligned} l_i\le v_i \le u_i; \quad \forall r_i \in R \end{aligned}\qquad (3)$$
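As an illustration of Equations (2) and (3), the following minimal sketch checks whether a candidate flux vector satisfies the steady-state condition and the flux bounds. The three-reaction network, its bounds, and the function name `satisfies_constraints` are hypothetical and chosen only for this example; they are not taken from Human1.

```python
# Toy linear pathway (hypothetical):  -> A -> B ->
# Rows are internal metabolites (A, B); columns are reactions r1, r2, r3.
S = [
    [1, -1, 0],   # A: produced by r1, consumed by r2
    [0, 1, -1],   # B: produced by r2, consumed by r3
]
lower = [0.0, 0.0, 0.0]      # l_i
upper = [10.0, 10.0, 10.0]   # u_i


def satisfies_constraints(S, v, lower, upper, tol=1e-9):
    """Return True if v fulfils S.v = 0 (Eq. 2) and l_i <= v_i <= u_i (Eq. 3)."""
    steady = all(
        abs(sum(s_ij * v_j for s_ij, v_j in zip(row, v))) <= tol for row in S
    )
    bounded = all(l - tol <= x <= u + tol for l, x, u in zip(lower, v, upper))
    return steady and bounded


print(satisfies_constraints(S, [2.0, 2.0, 2.0], lower, upper))     # True
print(satisfies_constraints(S, [2.0, 1.0, 1.0], lower, upper))     # False: A accumulates
print(satisfies_constraints(S, [12.0, 12.0, 12.0], lower, upper))  # False: above u_i
```

In a genome-scale model the same check applies unchanged, although a sparse matrix representation would normally be used for S.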
A network mode is a flux vector that satisfies Equations (2) and (3). The set of all network modes is the feasible cone, C:

$$\begin{aligned} C=\{ v\in \mathbb{R}^n \, | \; S\cdot v = 0, \; l_i\le v_i\le u_i,\;\forall r_i\in R\} \end{aligned}$$

The support of a mode $v \in C$ is the set of reactions that carry a nonzero flux in v:

$$\begin{aligned} \operatorname{supp}(v)=\{r_i \in R\; | \; v_i\ne 0 \} \end{aligned}$$

Human1 model and metabolic tasks

This paper uses the most recent reconstruction of human metabolism, Human1 version 1.17. This reconstruction consists of 13417 reactions, 10138 metabolites (4164 of which are unique), and 3625 genes. Human-GEM is maintained in a version-controlled GitHub repository managed by the Systems and Synthetic Biology department at Chalmers University of Technology in Gothenburg, Sweden [8].

Generating context-specific models is a complex process influenced by various factors, including the algorithms and reference models used. This variation can significantly affect the biological interpretation of the resulting models. It is therefore essential to limit this variation by ensuring that each model can perform a range of metabolic tasks, that is, that it can produce specific output products from a given set of input substrates. The ability to perform such tasks is crucial for generating accurate models [9].

Several methods generate tissue- or cell-specific models while verifying metabolic tasks [7,10,11]. The capacity of a model to accomplish a task can be tested by restricting the input or output of the metabolites involved; a constraint on an internal reaction can also be included. The model can perform the task if and only if at least one feasible flux vector exists under these restrictions.

Specialists agree that a human cell can perform 210 metabolic tasks [7].
However, not all metabolic models need to complete all 210 tasks: the authors of [11] identified 57 metabolic tasks that are crucial for the survival and operation of any human cell.

Metabolic tasks are divided into eleven groups: rephosphorylation of nucleoside triphosphates, de novo synthesis of nucleotides, uptake of essential amino acids, de novo synthesis of key intermediates, de novo synthesis of other compounds, protein turnover, electron transport chain and TCA, beta-oxidation of fatty acids, de novo synthesis of phospholipids, vitamins and co-factors, and growth. These groups fall into four broad categories: supply of energy and redox, internal conversion processes, utilization of substrate, and synthesis of metabolites. Examples of these tasks include the rephosphorylation of nucleoside triphosphates and the production of key intermediates such as glycerate 3-phosphate or erythrose 4-phosphate. Metabolic tasks also encompass biomass production (BP) under restricted Ham’s medium. A list of metabolic tasks can be found in the Supplementary Material (Table S1).

In this paper, we used the ftINIT algorithm available in the RAVEN Toolbox for MATLAB to create cell-line-specific models representing specific tissues or cell types, as recommended by the Human-GEM team [12].

Genetic minimal cut sets

A cut set is a set of reactions that hits every mode in a given target set T, meaning that it intersects the support of every element of T. A minimal cut set (MCS) is a cut set no proper subset of which is also a cut set for T. Equivalently, an MCS is a minimal set of reactions that, when inactivated simultaneously, prevents the network from entering any state in T. This concept is explained in detail in [6].

Metabolic networks often include gene information in the form of gene-protein-reaction rules (GPRs). A GPR is a Boolean expression that identifies which gene combinations need to be active to allow flux through a reaction r.
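A GPR of the kind described above can be evaluated with a few lines of code. The sketch below assumes a hypothetical encoding in which a rule is either a gene name or a nested tuple `("and", ...)` / `("or", ...)`; the gene names and the function `gpr_active` are illustrative and do not correspond to any toolbox API.

```python
def gpr_active(rule, active_genes):
    """Evaluate a GPR rule against the set of currently active genes.

    A rule is either a gene identifier (str) or a tuple whose first item is
    "and" (all sub-rules required, e.g. subunits of an enzyme complex) or
    "or" (any sub-rule suffices, e.g. isoenzymes).
    """
    if isinstance(rule, str):
        return rule in active_genes
    op, *args = rule
    values = [gpr_active(arg, active_genes) for arg in args]
    if op == "and":
        return all(values)
    if op == "or":
        return any(values)
    raise ValueError(f"unknown operator: {op!r}")


# Hypothetical rule: the reaction needs the complex g1+g2 or the isoenzyme g3.
rule = ("or", ("and", "g1", "g2"), "g3")

print(gpr_active(rule, {"g1", "g2", "g3"}))  # True: nothing knocked out
print(gpr_active(rule, {"g1", "g2"}))        # True: g3 lost, complex remains
print(gpr_active(rule, {"g2"}))              # False: g1 and g3 both lost
```

The last call shows why blocking a reaction can require knocking out several genes at once, which is what motivates moving from reaction-level to gene-level cut sets.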
These GPRs make it possible to extend the concept of minimal cut sets to that of genetic minimal cut sets (gMCSs).

A genetic cut set (gCS) is a set of genes, G, whose deactivation prevents every element t of the target set T from being a valid mode in the altered network [13,14]. A gCS for T is minimal, that is, a gMCS, if no proper subset of G is also a gCS for T.

Several methods have been proposed for calculating gMCSs. They can be summarized as follows: 1) testing gene combinations of increasing length, filtering the minimal gCSs by inclusion and limiting the search space, as in SLFinder [15], fastSL [16] or rapidSL [17]; 2) computing (a base of) the target set and computing MCSs as hitting sets of their supports [18,19]; and 3) exploiting the relationship between cut sets of the network and elementary flux modes (EFMs) of a specific dual network and applying the K-shortest EFM algorithm [20,21]. This last method can also be applied by adding genes to the network [13,22] or by introducing a gene matrix [14]. Recently, Guil and García [23] proposed a new method based on limiting the search space and computing hitting sets by introducing the concept of a k-representative subset of the target set.

The length of a gMCS is defined as the number of genes it contains. A gMCS of length 1 is an essential gene, while gMCSs of greater length correspond to synthetic lethal gene sets.

Toxicities and drug targets

As previously mentioned, altering a cell model’s genes can affect its generic or context-specific behavior.

When we start from a generic model, any gMCS for a metabolic task should be considered a generic toxicity. If all the genes in the gMCS are blocked simultaneously, the metabolic task becomes impossible in this model and in any context-specific model derived from it; that is, every such model becomes infeasible.

As a next step, gMCSs for context-specific healthy models represent toxicities for the corresponding tissues.
These toxicities can be generic or specific to certain tissues. Identifying whether a toxicity affects many tissues or only a single tissue is especially important.

In contrast, gMCSs for an unhealthy model of a cell can be regarded as targets for therapeutic intervention. These targets must be calculated and then evaluated for their potential toxicity to healthy tissues, both generically and in a context-specific manner.

Previous studies have used biomass production as the sole criterion for determining gene essentiality in cancer cells [24], disregarding other critical metabolic functions. We performed these calculations under the more general condition in order to compare the results obtained using biomass production alone with those obtained using all essential metabolic tasks. We then removed the gMCSs involved in biomass production, which allowed us to examine the potential of metabolic tasks for identifying toxicities or genetic targets.
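The definitions in this section can be made concrete with a small brute-force sketch. It follows the first family of methods mentioned earlier (gene combinations of increasing length, filtered by inclusion) and is not the algorithm used in the paper. It assumes that the target set is given explicitly as a short list of mode supports, which is only realistic for toy networks. All reaction names, gene names, GPRs, and supports are hypothetical.

```python
from itertools import combinations


def gpr_active(rule, active_genes):
    """Evaluate a nested ("and"/"or", ...) GPR rule for a set of active genes."""
    if isinstance(rule, str):
        return rule in active_genes
    op, *args = rule
    values = [gpr_active(arg, active_genes) for arg in args]
    return all(values) if op == "and" else any(values)


def genes_in(rule):
    """Collect every gene identifier that appears in a GPR rule."""
    if isinstance(rule, str):
        return {rule}
    return set().union(*(genes_in(arg) for arg in rule[1:]))


def is_gene_cut_set(gprs, target_supports, knockout):
    """True if the knockout blocks at least one reaction of every target support."""
    all_genes = set().union(*(genes_in(rule) for rule in gprs.values()))
    active = all_genes - set(knockout)
    blocked = {r for r, rule in gprs.items() if not gpr_active(rule, active)}
    return all(blocked & support for support in target_supports)


def minimal_gene_cut_sets(gprs, target_supports, max_len=3):
    """Enumerate gMCSs up to max_len genes, in order of increasing length."""
    genes = sorted(set().union(*(genes_in(rule) for rule in gprs.values())))
    found = []
    for k in range(1, max_len + 1):
        for combo in combinations(genes, k):
            knockout = frozenset(combo)
            if any(smaller <= knockout for smaller in found):
                continue  # contains a shorter cut set, so it is not minimal
            if is_gene_cut_set(gprs, target_supports, knockout):
                found.append(knockout)
    return found


# Hypothetical GPRs shared by both toy models.
gprs = {
    "r1": "g1",
    "r2": ("or", "g2", "g3"),
    "r3": ("and", "g3", "g4"),
    "r4": "g2",
}
# Supports of the modes that perform the task in each toy model.
diseased_supports = [{"r1", "r2"}, {"r2", "r3"}]
healthy_supports = [{"r1", "r2"}, {"r2", "r3"}, {"r4"}]

# Candidate targets: gMCSs of the diseased model.
targets = minimal_gene_cut_sets(gprs, diseased_supports)
# A target is flagged as toxic if it is also a gene cut set in the healthy model.
toxic = [t for t in targets if is_gene_cut_set(gprs, healthy_supports, t)]
selective = [t for t in targets if t not in toxic]

print([sorted(t) for t in targets])    # [['g1', 'g3'], ['g1', 'g4'], ['g2', 'g3']]
print([sorted(t) for t in toxic])      # [['g2', 'g3']]
print([sorted(t) for t in selective])  # [['g1', 'g3'], ['g1', 'g4']]
```

In this toy case the healthy model keeps an extra route through r4, so two of the three candidate targets disable the task in the diseased model without disabling it in the healthy one. For genome-scale networks the target set cannot be enumerated explicitly, and the dedicated methods cited above are needed.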
