Functional genetic variants and susceptibility and prediction of gestational diabetes mellitus

Study population

All subjects who met the following inclusion criteria were enrolled at the Affiliated Hospital of Guilin Medical University from September 2014 to April 2016: singleton pregnancy, no family relationship with other subjects, and no metabolic disease such as type 1 or type 2 diabetes mellitus. A routine 75-g oral glucose tolerance test (OGTT) was performed between 24 and 28 weeks of gestation. According to the criteria of the International Association of the Diabetes and Pregnancy Study Groups (IADPSG), women were diagnosed with GDM if their fasting plasma glucose (FPG) was ≥ 5.1 mmol/L, 1-h plasma glucose (1hPG) was ≥ 10.0 mmol/L or 2-h plasma glucose (2hPG) was ≥ 8.5 mmol/L.

At the initial discovery stage, 96 GDM patients and 96 healthy pregnant women from the same period, matched for age and prepregnancy BMI (pre-BMI), were recruited for a genome-wide association study (GWAS) to screen for GDM-associated SNPs (GDM-SNPs) using the Infinium Asian Screening Array (ASA; Illumina) BeadChip. During the validation phase, singleton pregnant women meeting the same criteria were recruited, and candidate SNPs were genotyped in 554 GDM patients and 641 healthy pregnant women. In addition, biological samples, including peripheral whole blood and placental tissue, were collected from another 42 normal pregnant women to investigate the biological functions of the positively associated variants.

The Ethics Committee of Guilin Medical University approved this research (Number: GLMC20131205), and the study was conducted in accordance with the Declaration of Helsinki. All included subjects signed informed consent forms prior to study procedures. The study design is depicted in the flowchart in Fig. 1.

Figure 1. Flowchart of the study design. TFBS, transcription factor-binding site; eQTL, expression quantitative trait locus.

Infinium Asian Screening Array (ASA)

All DNA samples were extracted using DNA-extraction kits (Tiangen Biotech).
The Genotyping module of GenomeStudio v2.1 (Illumina) was used to call genotypes and to obtain high-quality data for the GWAS. We pruned the discovery-stage data set using the following criteria: (1) SNPs were required to have a call rate > 95% and to pass a Hardy–Weinberg equilibrium (HWE) threshold of 0.0001, and SNPs with a minor allele frequency (MAF) < 1% or located on the sex chromosomes were removed; (2) samples were required to have a call rate > 95%. In addition, to exclude closely related individuals, we calculated genome-wide identity by descent (IBD) for each pair of samples and removed samples with PI_HAT > 0.25. For population-level quality control, we used the Northern and Western European Ancestry (CEU), Japanese in Tokyo (JPT) and Han Chinese in Beijing (CHB) populations from the 1000 Genomes database as references to confirm whether the sample grouping met expectations and to detect outlier samples.

Clinical and biochemical characteristics

Clinical and biochemical characteristics, including age, prepregnancy weight (kg), height (m), systolic blood pressure (SBP), diastolic blood pressure (DBP), fasting plasma glucose (FPG), 1-h plasma glucose (1hPG), 2-h plasma glucose (2hPG), triglycerides (TG), total cholesterol (TC), haemoglobin A1c (HbA1c), low-density lipoprotein cholesterol (LDL-c) and high-density lipoprotein cholesterol (HDL-c), were obtained from a unified questionnaire and patient medical records.

Candidate SNP selection and genotyping

Preliminary selection of candidate SNPs was based on the strength of the association with GDM (P < 1.0 × 10^−3) on the Infinium Asian Screening Array (ASA) BeadChip. The SNP function prediction (FuncPred) tool (https://manticore.niehs.nih.gov/snpinfo/snpfunc.html) was subsequently used to screen for potential functional variants with a minor allele frequency (MAF) greater than 0.05 in the Han Chinese in Beijing (CHB) population.

The candidate variants were genotyped via the Sequenom MassARRAY platform.
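As an illustration of the preliminary candidate filter described in this section, the following minimal sketch keeps SNPs with an association P value below 1.0 × 10^−3 and a CHB minor allele frequency above 0.05. Only the two thresholds come from the text; the `SnpResult` record, the `select_candidates` function and the demo identifiers are hypothetical, and the FuncPred functional screening step is not modelled here.

```python
from dataclasses import dataclass

# Thresholds stated in the text; all names below are illustrative assumptions.
P_THRESHOLD = 1.0e-3    # association P value on the discovery array
MAF_THRESHOLD = 0.05    # minor allele frequency in the CHB population


@dataclass(frozen=True)
class SnpResult:
    rsid: str        # hypothetical SNP identifier
    p_value: float   # discovery-stage association P value
    maf_chb: float   # minor allele frequency in CHB


def select_candidates(results):
    """Return SNPs with P < 1.0e-3 and CHB MAF > 0.05 (both strict)."""
    return [
        snp for snp in results
        if snp.p_value < P_THRESHOLD and snp.maf_chb > MAF_THRESHOLD
    ]


if __name__ == "__main__":
    # Made-up records, for demonstration only.
    demo = [
        SnpResult("rs_demo_1", 5.0e-4, 0.12),  # passes both filters
        SnpResult("rs_demo_2", 5.0e-4, 0.03),  # fails the MAF filter
        SnpResult("rs_demo_3", 2.0e-2, 0.30),  # fails the P-value filter
    ]
    for snp in select_candidates(demo):
        print(snp.rsid)
```

Both comparisons are strict, matching the "<" and "greater than" wording of the text; a SNP sitting exactly on either threshold is excluded in this sketch.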
The multiplex PCR master mix was composed of 1.0 μl of template DNA (20–100 ng/μl), 1.850 μl of ddH2O, 0.625 μl of 1.25 × PCR buffer (15 mmol/L MgCl2), 0.325 μl of 25 mmol/L MgCl2, 0.1 μl of 25 mmol/L dNTPs, 1 μl of 0.5 μmol/L primer mix, and 0.1 μl of 5 U/μl HotStar Taq polymerase. The reaction was conducted at 94 °C for 15 min, followed by 45 cycles of 94 °C for 20 s, 56 °C for 30 s and 72 °C for 1 min, with a final incubation at 72 °C for 3 min. The primers used are listed in Supplemental Table S1.

Functional analysis of positively associated SNPs

For positively associated SNPs located in TFBSs, the Alibaba 2.1 tool (http://gene-regulation.com/pub/programs/alibaba2/index.html) was used to explore potential biological functions. In addition, to determine whether a SNP was an expression quantitative trait locus (eQTL), we carried out validation experiments.

Genomic DNA was extracted from the peripheral blood of 42 healthy pregnant women using the Aidlab DNA Extraction Kit (Aidlab Biotechnology Co., Ltd., China) according to the kit instructions, and the optical density of each sample at 260 nm and 280 nm was measured using a NanoDrop spectrophotometer (Thermo Scientific, Waltham, MA, USA) to determine DNA concentration and purity. Next, the genotypes of the candidate SNPs were determined using Kompetitive Allele Specific Polymerase Chain Reaction (KASP) [27] on a StepOnePlus™ real-time PCR system (Thermo Fisher Scientific, Life Technologies Holding Pte Ltd., China). The 10-µl reaction system contained 5 µl of Flu Arms 2 × PCR mix, 0.5 µl of three specific primers (F1: 0.1 µl, F2: 0.1 µl, and R: 0.3 µl), 0.5 µl (25–150 ng) of DNA and 4 µl of ddH2O.
The cycling conditions were as follows: hot-start Taq activation at 95 °C for 3 min, followed by 10 touchdown cycles of 95 °C for 15 s and 61–55 °C for 60 s (starting at 61 °C and decreasing by 0.6 °C per cycle to reach a final annealing and elongation temperature of 55 °C), followed by 30 amplification cycles of 95 °C for 15 s and 55 °C for 60 s, and a post-read step at 30 °C for 60 s. The primer sequences are shown in Supplemental Table S1.

Total RNA was extracted from the placental tissues of 42 normal pregnant women using TRIzol reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer’s instructions. The concentration and purity of the extracted RNA were measured using a Thermo Scientific NanoDrop 2000c microspectrophotometer. Total RNA (2 µg) was reverse transcribed into cDNA according to the instructions for the reverse transcription kit (HaiGene, Harbin, China). Finally, quantitative real-time polymerase chain reaction (RT-qPCR) was performed using the GLPBIO SYBR Green qPCR Mix (2 ×) kit on the StepOnePlus™ real-time PCR system. The 10-µl RT-qPCR system contained 1 µl of cDNA template, 5 µl of 2 × SYBR Green PCR Mastermix, 2 µl of forward and reverse primers and 3.4 µl of DEPC-treated ddH2O. The PCR mixtures were denatured at 95 °C for 10 min, followed by 40 cycles of 95 °C for 15 s and 65 °C for 60 s. The 2^(−ΔΔCt) method was used to quantify gene expression, with GAPDH serving as the internal control [28]. The primer sequences are shown in Supplemental Table S1.

Data processing

In this study, the data were processed with IBM SPSS Statistics 28 for Windows (IBM Corp., Armonk, NY, USA) and R 4.3.1 software. Clinical and biochemical variables are shown as the mean ± SD or percentage and were analysed using independent-sample t tests or chi-square (χ²) tests. Logistic regression analysis was used to evaluate the association between variants and GDM risk, expressed as the odds ratio (OR) and its corresponding 95% confidence interval (CI).
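To make the OR and 95% CI concrete, here is a minimal sketch for the simplest case: an unadjusted comparison with one binary predictor (for example, carriers versus non-carriers of a risk allele). In that case the logistic regression estimate and its Wald interval coincide with the classic 2 × 2 table formula. This is not the study's own analysis code, which may have used other genetic models or covariate adjustment; the function name `odds_ratio_wald_ci` and the counts in the demo are hypothetical.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.959964):
    """Odds ratio and Wald confidence interval from a 2 x 2 table.

    a: exposed cases      b: unexposed cases
    c: exposed controls   d: unexposed controls
    z: normal quantile (1.959964 gives a 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Made-up counts, for demonstration only.
    or_, lo, hi = odds_ratio_wald_ci(20, 10, 10, 20)
    print(f"OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

The interval is symmetric on the log scale, so the product of its two limits equals the squared OR. An interval that excludes 1 corresponds to a two-sided Wald test with P < 0.05.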
One-way ANOVA was used to compare expression levels among samples with different genotypes. Additionally, univariate logistic regression and multivariate logistic regression with forward stepwise selection based on the Akaike information criterion (AIC) were employed to determine the clinical risk factors for GDM.

A predictive nomogram model composed of clinical risk factors and positively associated SNPs was then constructed using the R package “rms”. The area under the receiver operating characteristic curve (AUC) was used to evaluate the model’s performance. A calibration curve was generated by internal validation with 1000 bootstrap resamples to assess the consistency between predicted and observed values. The clinical utility and net benefit were estimated by decision curve analysis (DCA). Finally, a web-based interactive dynamic nomogram was established via the R package “DynNom”. All tests were two-sided, and P values < 0.05 were considered to indicate statistical significance.
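The AUC used to evaluate the model can be read as the probability that a randomly chosen GDM case receives a higher predicted risk than a randomly chosen control. The minimal sketch below computes it directly from that definition, counting ties as one half. It is an illustration only, not the R workflow used in the study; the function name `roc_auc` and the demo values are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the share of (case, control) pairs ranked correctly.

    labels: iterable of 1 (case) or 0 (control)
    scores: predicted risks, in the same order as labels
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0   # case ranked above control
            elif case_score == control_score:
                concordant += 0.5   # tie counts as half
    return concordant / (len(cases) * len(controls))


if __name__ == "__main__":
    # Made-up predictions, for demonstration only.
    print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.2, 0.1]))
```

The pairwise loop is quadratic in sample size, which is fine for a cohort of roughly a thousand subjects; a rank-based formula gives the same value more efficiently for larger data sets.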
