Mapping protein binding sites by photoreactive fragment pharmacophores

Library design

The library was designed as reported in our recent work19. Briefly, pharmacophore models were extracted from PDB structures with the ePharmacophore module of the Schrödinger software suite (version 2017-4), and clustered with scipy (version 1.0.1). Ligand preparation, conformer generation and pharmacophore screening were carried out with the Epik, Macromodel and Phase modules of the Schrödinger software suite, respectively (version 2022-4).

Fragments in the Enamine primary amine collection were filtered for size (10–16 heavy atoms), a primary amine group count of exactly one, and the absence of PAINS moieties25. Candidate fragments were screened against, and annotated with, the full set of 2- and 3-point pharmacophore models, and stored as fingerprints in which the bit of each pharmacophore present is set to 1. We used a diversity picker to select 160 fragments with maximized coverage and diversity across the 117 bit positions (pharmacophores). The final set of 100 successfully synthesized PhP fragments provided coverages of 88% and 75% for the 2- and 3-point pharmacophores, respectively. SMILES codes, images and pharmacophore fingerprints of the PhP fragments are included in Supplementary Data 1.

General procedures for synthesis and compound characterization

PhP fragments were synthesized according to the general procedure outlined in Scheme 1. 3-(3-Methyl-3H-diazirin-3-yl)propanoic acid (0.025 g, 0.195 mmol) was dissolved in N,N-dimethylformamide (DMF), and 1-[bis(dimethylamino)methylene]-1H-1,2,3-triazolo[4,5-b]pyridinium 3-oxid hexafluorophosphate (HATU, 0.071 g, 0.188 mmol) and DIPEA (0.105 mL, 0.600 mmol) were added. The reaction mixture was stirred for 15 min at room temperature. The amine (0.15 mmol) was added, then the mixture was stirred at room temperature for 16–72 h covered in tin foil, and progress was assessed by LCMS.

Scheme 1 General procedure for attaching the diazirine tag onto amine-containing fragments.
Amide coupling with the HATU (hexafluorophosphate azabenzotriazole tetramethyl uronium) coupling reagent was used to afford the photoaffinity-tagged fragments of the PhP library.

After 16–72 h the reactions were processed as follows:

a. DMF in the reaction mixtures was removed in a Genevac. The crude product was dissolved in chloroform (0.4 mL) and loaded onto a 1.0 g NH2-Isolute SPE column, pre-equilibrated with 2 column volumes (CVs) of chloroform. The product was eluted off the column using 2.5 mL of 10% methanol in ethyl acetate. This was repeated with another 2.5 mL of 10% methanol in ethyl acetate. The eluent was collected for each reaction in pre-weighed T-vials and dried under a stream of nitrogen in the Radleys blowdown apparatus to obtain the product.

b. The reaction mixture was loaded onto a prep-HPLC column and purified with a linear gradient from 5 to 100 vol% MeCN in water containing 0.1 vol% formic acid over 15 min. The solvent was then evaporated.
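The fragment selection described under Library design (bit fingerprints over the pharmacophore models, followed by a pick that maximizes coverage) can be illustrated with a small sketch. This is not the diversity picker used in the study, whose algorithm is not specified here; it is a plain greedy set-cover heuristic, and the fragment names, bit sets and the helper names `greedy_coverage_pick` and `coverage` are all hypothetical.

```python
from typing import Dict, List, Set

def greedy_coverage_pick(fingerprints: Dict[str, Set[int]], n_pick: int) -> List[str]:
    """Greedily pick up to n_pick fragments, always taking the fragment
    that adds the largest number of not-yet-covered pharmacophore bits."""
    covered: Set[int] = set()
    picked: List[str] = []
    remaining = dict(fingerprints)
    while remaining and len(picked) < n_pick:
        # Iterate names in sorted order so ties are broken deterministically.
        best = max(sorted(remaining), key=lambda name: len(remaining[name] - covered))
        picked.append(best)
        covered |= remaining.pop(best)
    return picked

def coverage(fingerprints: Dict[str, Set[int]], picked: List[str], n_bits: int) -> float:
    """Fraction of the n_bits pharmacophore bits set by at least one picked fragment."""
    covered: Set[int] = set()
    for name in picked:
        covered |= fingerprints[name]
    return len(covered) / n_bits

# Hypothetical toy library: each fragment maps to the set of "on" bit positions.
TOY_LIBRARY = {
    "frag_a": {0, 1, 2},
    "frag_b": {2, 3},
    "frag_c": {0, 1},
    "frag_d": {4},
}

selection = greedy_coverage_pick(TOY_LIBRARY, n_pick=3)
print(selection, coverage(TOY_LIBRARY, selection, n_bits=6))
```

A greedy pick of this kind favors coverage only; a picker that also balances diversity, as described in the text, would need an additional distance term between fingerprints.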

1H NMR was recorded in DMSO-d6, CDCl3, CD3CN or D2O solution at room temperature, on a Varian Unity Inova 500 (Varian, Palo Alto, CA, USA) (500 MHz) and on a Varian 300 spectrometer (300 MHz), with the deuterium signal of the solvent as the lock. Chemical shifts (δ) and coupling constants (J) are given in ppm and Hz, respectively, and the spectra are collated in Supplementary Data 2.

HPLC-MS measurements were performed using a Shimadzu LC-MS-2020 device equipped with a Reprospher-100 C18 (5 µm; 100 × 3 mm) column and positive-negative double ion source (DUIS±) with a quadrupole MS analyzer in a range of m/z 50–1000. Samples were analyzed with gradient elution using eluent A (0.1% formic acid in water) and eluent B (0.1% formic acid in acetonitrile). Flow rate was set to 1 mL/min. The initial condition was 5% eluent B, followed by a linear gradient to 100% eluent B by 1 min; 100% eluent B was held from 1 to 3.5 min, then the system returned to the initial condition of 5% eluent B from 3.5 to 4.5 min, which was held until 5 min. The column temperature was kept at room temperature and the injection volume was 1–10 µL. Purity of compounds was assessed by HPLC with UV detection at 254 nm; all tested compounds were >95% pure.

High-resolution mass spectrometric measurements were performed using a Q-TOF Premier mass spectrometer (Waters Corporation, Milford, MA, USA) in positive or negative electrospray ionization mode. Reactions were monitored with Merck silica gel 60 F254 TLC plates (Darmstadt, Germany). All chemicals and solvents were used as purchased from commercial suppliers. The column chromatography purifications were performed using Teledyne ISCO CombiFlash Lumen+ Rf.
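The analytical HPLC-MS gradient described above can be written down as a table of time points and interpolated, which makes the program easy to check or transfer. The sketch below is only an illustration: it assumes that the return from 100% to 5% eluent B between 3.5 and 4.5 min is a linear ramp (the text does not say whether it is a ramp or a step), and the names `GRADIENT` and `percent_b` are hypothetical.

```python
from typing import List, Tuple

# (time in min, % eluent B) break points taken from the gradient description.
GRADIENT: List[Tuple[float, float]] = [
    (0.0, 5.0),    # initial condition
    (1.0, 100.0),  # linear gradient to 100% B by 1 min
    (3.5, 100.0),  # hold at 100% B
    (4.5, 5.0),    # return to 5% B (assumed linear)
    (5.0, 5.0),    # hold until the end of the run
]

def percent_b(t_min: float, program: List[Tuple[float, float]] = GRADIENT) -> float:
    """Return % eluent B at time t_min by linear interpolation between break points."""
    if t_min < program[0][0] or t_min > program[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise ValueError("gradient program has no segment covering this time")

for t in (0.0, 0.5, 2.0, 4.0, 5.0):
    print(f"{t:.1f} min: {percent_b(t):.1f}% B")
```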
For buffer media exchange, a GE Healthcare PD SpinTrap™ G-25 desalting column was used. Detailed analytical results are included in Supplementary Note 1 for each compound.

Sample preparation for STAT5B-NTD illumination and photocatalysis experiments

STAT5B protein at 7 µM concentration in pH 7.4 PBS buffer was pre-incubated with 0.17 µL PhP (100 mM solution in DMSO) for 1 h at room temperature in the dark. Irradiation was carried out at 4 °C for 10 min using a 365 nm UV lamp.

The synthesis of short- and long-linker Ir-G2 photocatalysts was started from dMebppy41. First, we synthesized the Gen 2 iridium catalyst (cpd. 13 in ref. 41), which was not effective for labeling STAT5B-NTD, presumably due to the short propionic acid linker. However, the carboxylic acid analog with a longer PEG3 linker (Ir-G2-PEG3-COOH, or cpd. 17 in ref. 41) successfully labeled the protein target in the presence of BOP (but not of other activating agents like HATU, HCTU, TSTU or PyAOP) with a 27% yield (Supplementary Fig. S3).

For the photocatalysis experiments, 1.4 µL photocatalyst (1.4 mM solution in DMSO, 10 eq) was pre-incubated with 1.5 µL coupling reagent (0.75 µL, 5.6 mM EDC (20 eq) + 0.75 µL 4.2 mM NHS and 1.5 µL 2.1 mM (15 eq) BOP) for 30 min in the dark. 30 µL STAT5B-NTD (at 7 µM concentration) was added and incubated further at room temperature overnight or at 37 °C for 1 h in the dark. After addition of 0.14 µL PhP (100 mM in DMSO) the solution was further incubated in the dark for 1 h. Irradiation was carried out using a 450 nm LED lamp (7.6Vx0A) for 10 min at 4 °C.

Intact mass spectrometry (MS) screening

The PhP library was screened against BRD4-BD1 and KRasG12D as follows. Using a Labcyte Echo® 555 Liquid Handler, 150 nL of fragment solution (20 mM) was transferred into a Greiner 384 low-volume plate (white) to prepare the probe plate. The probe plate was placed on ice.
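The reagent equivalents quoted for the photocatalysis experiments follow from the stated volumes and concentrations, relative to the 30 µL of 7 µM STAT5B-NTD. The short calculation below is a sketch of that arithmetic only; the helper name `nmol` and the dictionary layout are hypothetical, and the NHS value is computed here rather than quoted in the text.

```python
def nmol(volume_ul: float, conc_mm: float) -> float:
    """Amount of substance in nmol: µL x mM (mmol/L) equals nmol."""
    return volume_ul * conc_mm

# 30 µL of 7 µM protein; 7 µM is 0.007 mM.
protein_nmol = nmol(30.0, 0.007)

# (volume in µL, concentration in mM) as given in the protocol.
reagents = {
    "photocatalyst": (1.4, 1.4),
    "EDC": (0.75, 5.6),
    "NHS": (0.75, 4.2),
    "BOP": (1.5, 2.1),
}

equivalents = {
    name: nmol(volume, conc) / protein_nmol
    for name, (volume, conc) in reagents.items()
}

for name, eq in equivalents.items():
    print(f"{name}: {eq:.1f} eq relative to protein")
```

With these inputs, EDC and BOP come out at 20 and 15 equivalents, matching the text, while the photocatalyst comes out at about 9.3 equivalents, consistent with the rounded value of 10 eq.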
15 μL of protein stock solution (1 μM BRD4-BD1 (GSK/GenScript) or KRasG12D (GSK/GenScript) in PBS buffer) was dispensed into wells containing fragments. The plate was left on ice for 15 min for incubation, then irradiated at 302 nm for 10 min, and centrifuged (1000 rpm, 1 min) to remove any bubbles. The plate was then analyzed by LCMS-TOF mass spectrometry (Agilent 1200 series liquid chromatography with an Agilent Bio-HPLC PLRP-S (1000 Å, 5 µm × 50 mm × 1.0 mm, PL1312-1502) reverse phase HPLC column at 70 °C, equipped with an Agilent G6224 time-of-flight (ToF) mass spectrometer; see Supplementary Note 2 for the full protocols). The deconvoluted spectra were analyzed using R Studio software. Spectra of the hits are reported in Supplementary Note 2.

For screening against STAT5B-NTD, we used a UHPLC-MS system that consisted of a Waters ACQUITY UPLC I-Class setup coupled with a Waters ACQUITY UPLC Peptide BEH C18 Column (130 Å, 1.7 µm, 2.1 mm × 100 mm), connected to a Waters Xevo G2-XS QT-ToF instrument equipped with a Waters Z-spray ESI source. During the analysis, the column temperature was maintained at a constant 60 °C, and a sample volume of 3 µL was injected for each analysis. Data acquisition was conducted in positive ion mode within the m/z 100–2000 (mass-to-charge ratio) range. For the full protocol and spectra of the hits, see Supplementary Note 2.

Binding site identification by LC-MS/MS peptide mapping: BRD4-BD1 and STAT5B-NTD

In follow-up of the intact MS screening, the fragment hits were further analyzed by a Triple TOF 5600+ hybrid Quadrupole-TOF LC/MS/MS system, after digesting the resulting fragment-protein complexes with Trypsin/Lys C mix. Data acquisition and processing were performed using Analyst TF software version 1.7.1 (AB Sciex Instruments, CA, USA). Chromatographic separation was achieved on the Discovery® BIO Wide Pore C-18-5 (250 mm × 2.1 mm, 5 μm, 300 Å) HPLC column.
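In both the intact MS screen and the peptide mapping, a diazirine-labeled species is expected at the mass of the unmodified protein or peptide plus the probe mass minus N2, because the diazirine loses molecular nitrogen on photoactivation before the resulting carbene inserts. The sketch below illustrates this bookkeeping and a simple labeling percentage from deconvoluted peak intensities. The intensity-ratio estimate is an assumption for illustration, not necessarily the evaluation used in the study, and all masses, intensities and function names are hypothetical.

```python
from typing import Sequence

N2_AVERAGE = 28.0134         # average mass of N2 in Da, for deconvoluted intact masses
N2_MONOISOTOPIC = 28.006148  # monoisotopic mass of N2 in Da, for peptide-level work

def adduct_mass(unmodified_mass: float, probe_mass: float,
                n_labels: int = 1, n2_mass: float = N2_AVERAGE) -> float:
    """Expected mass after n_labels diazirine probes have lost N2 and attached."""
    return unmodified_mass + n_labels * (probe_mass - n2_mass)

def labeling_percent(unmodified_intensity: float,
                     adduct_intensities: Sequence[float]) -> float:
    """Share of total signal found in labeled species, in percent."""
    labeled = sum(adduct_intensities)
    total = unmodified_intensity + labeled
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return 100.0 * labeled / total

# Hypothetical example: a 15,000 Da protein and a 250 Da probe.
print(round(adduct_mass(15000.0, 250.0), 4))
print(labeling_percent(730.0, [270.0]))
```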
MS/MS spectra were obtained on the 8 most abundant parent ions present in the TOF survey scan with the Information Dependent Acquisition (IDA) mode, and peaks were evaluated with PeakView® (version 2.2, Sciex) and Biologics Explorer (version 3.0.3, Sciex). The full sample preparation and data acquisition protocols, as well as the spectra of the reported hits, are available in Supplementary Note 3.

Modeling the binding poses of fragment hits by docking

Docking calculations were performed on BRD4-BD1 (PDB ID: 7A9U17), KRasG12D (PDB ID: 4OBE51) and STAT5B-NTD with the same protocol. Briefly, the relevant X-ray structures were downloaded from the PDB and prepared using Protein Preparation Wizard52. The structure of STAT5B-NTD was homology modeled based on the published structure of STAT3-NTD (PDB ID: 4ZIA39). Ligands were prepared with Ligprep52, and docking was performed with the Induced Fit Docking protocol of Schrödinger. For the grid box generation, the experimental results (MS labeling data and NMR shift perturbations, where available) were used. At most 20 possible binding conformations were generated in the first docking step53. Redocking was done into structures within a 30 kcal/mol energy window from the best structure, and within the top 20 structures overall, using the standard precision (SP) method.

Cell viability measurements

Pancreatic cancer cell lines PANC-1 (cat. no. CRL-1469) and SW1990 (cat. no. CRL-2172) were obtained from ATCC (American Type Culture Collection). Genetically modified isogenic colon cancer cell lines SW48-PAR and SW48-G12D were obtained from Horizon Discovery Ltd.
PANC-1 (human pancreatic cancer, KRasG12D), SW1990 (human pancreatic cancer, KRasG12D), SW48-PAR (human colon cancer; parental cell line KRaswt), and SW48-G12D (human colon cancer; heterozygous knockin of the KRasG12D activating mutation) cancer cell lines were cultured in Roswell Park Memorial Institute Medium (RPMI; Biosera, Nuaille, France), supplemented with 10% heat-inactivated Fetal Bovine Serum (FBS; Biosera), and with 1% Penicillin/Streptomycin (Biosera). Cells were cultured in sterile T75 flasks with ventilation cap (Sarstedt, Nümbrecht, Germany) at 37 °C in a humidified atmosphere with 5% CO2 in an ESCO CelCulture Incubator (ESCO, Friedberg, Germany). Manipulations with the cells were performed in a biosafety cabinet (laminar), ESCO Sentinel Gold class II model AC2-4E8 (ESCO).

For the evaluation of the in vitro antiproliferative activity of fragments, cell viability was determined with the MTT assay (3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyl-tetrazolium bromide from Sigma Aldrich). After standard harvesting of the cells by trypsin-EDTA (Biosera), 7 × 10³ cells per well, depending on the cell line, were seeded in serum-containing growth medium in 96-well plates and incubated. After 24 h, cells were treated with various concentrations of fragments (32 nM–100 μM), dissolved in serum-containing medium, and incubated under standard conditions. Control wells were treated with medium. The final concentration of serum was 2.5%, and the final concentration of DMSO was 0.2%. Treatment was continuous for 72 h.

Afterwards, the MTT assay was performed to determine cell viability: 20 μL of MTT solution (5 mg/mL in PBS) was added to each well and, after 2 h of incubation at 37 °C, the supernatant was removed. The formazan crystals were dissolved in 100 μL of a 1:1 solution of DMSO (Sigma-Aldrich):EtOH (Molar Chemicals) and the absorbance was measured after 15 min at λ = 570 nm using a microplate reader (BioTek 800TS, Agilent, Santa Clara, CA, USA).
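The processing of an MTT readout of this kind can be sketched in a few lines: build the concentration series, normalize the 570 nm absorbance to the control wells, and estimate where viability crosses 50%. Everything specific in the sketch is an assumption: the 5-fold, 6-point dilution scheme is merely one series that spans 100 μM down to 32 nM, the blank subtraction is not mentioned in the protocol, and the log-linear interpolation is a simple stand-in for a nonlinear dose-response fit. All function names are hypothetical.

```python
import math
from typing import List, Optional, Sequence

def dilution_series(top_um: float, factor: float, n_points: int) -> List[float]:
    """Concentrations in µM, starting at top_um and dividing by factor each step."""
    return [top_um / factor ** i for i in range(n_points)]

def percent_viability(a_treated: float, a_control: float, a_blank: float = 0.0) -> float:
    """Viability relative to medium-treated control wells, in percent."""
    return 100.0 * (a_treated - a_blank) / (a_control - a_blank)

def ic50_by_interpolation(concs_um: Sequence[float],
                          viabilities: Sequence[float]) -> Optional[float]:
    """Concentration at 50% viability, interpolated linearly on a log10 axis.

    Returns None if the data never cross 50%."""
    pairs = sorted(zip(concs_um, viabilities))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(pairs, pairs[1:]):
        if (v_lo - 50.0) * (v_hi - 50.0) <= 0 and v_lo != v_hi:
            frac = (v_lo - 50.0) / (v_lo - v_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None

series = dilution_series(100.0, 5.0, 6)  # assumed scheme: 100 µM down to 0.032 µM
print([round(c, 3) for c in series])
print(ic50_by_interpolation([1.0, 10.0], [80.0, 20.0]))
```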
The IC50 values of the fragments were calculated using GraphPad Prism 6 (GraphPad Software, San Diego, CA, USA). The experiments were done in triplicate, and each experiment was repeated two times.

BRD4-BD1 expression, purification, crystallization, and structure

BRD4 (residues 44-171, cloned into pNIC28-Bsa4 using LIC cloning, SGC ID: BRD4A-c001) was produced from E. coli (BL21) RR in TB media (Formedia), with expression induced with 1 mM IPTG once OD600 = 0.6 was reached, and continued at 18 °C overnight. The cells were lysed (50 mM HEPES, 500 mM NaCl, 5% glycerol) by sonication, and the protein was purified by IMAC using Talon resin (50 mM HEPES, 500 mM NaCl, 5% glycerol, 300 mM imidazole), the tag was removed by TEV cleavage overnight at 4 °C, and the protein was polished by gel filtration on a Superdex 75 column. Purified protein (in 50 mM HEPES, 500 mM NaCl, 5% glycerol) was concentrated to 5.6 mg/mL and crystallized in 150 nL drops at a 1:2 protein:reservoir solution ratio in 20% PEG6000, 10% ethylene glycol, 0.1 M HEPES pH 7.0 and 0.2 M sodium chloride at 277 K using sitting drop vapor diffusion. Crystals formed after 28 days. PhP053 (100 mM, dissolved in MeOH, not DMSO) was diluted 10-fold with reservoir solution, added to the crystals, and incubated for a further 1 h at 4 °C. Diffraction data were collected at i04 at Diamond Light Source as part of BAG allocation mx28172, autoprocessed using the autoPROC pipeline54 and phased with BUSTER55 using 4MEN. Manual model rebuilding in COOT57, alternated with structure refinement in REFMAC558, was performed using the CCP4 Cloud56. The resulting structure has been deposited in the PDB database (https://rcsb.org), with the accession code 8Q34. Refinement statistics are provided in Table 1, as well as in Supplementary Data 1.

Table 1 Refinement statistics for the X-ray structure 8Q34

Binding site identification of KRasG12D hits by LC-MS/MS peptide mapping

Modification sites were determined by proteolysis and reversed-phase LC-MS peptide mapping.
Briefly, proteins were enzymatically digested in 25 mM NH4HCO3 solution after buffer exchange using Amicon Ultra-0.5 mL Centrifugal Filter units (10 kDa, Merck Millipore). Trypsin-LysC mixture and ProAlanase (Promega Corporation, Madison, USA) were used for the enzymatic digestion. Protein samples were reduced by dithiothreitol at 37 °C for 30 min. After reduction, proteins were digested using 1:50 enzyme:protein ratio or 4 h at 37 °C, followed by an additional short incubation with dithiothreitol (5 min, 37 °C). Overnight digestion was performed by Trypsin-LysC mixture in 50 mM NH4HCO3 solution at 37 °C. Tryptic digestion was stopped by adding formic acid to a final concentration of 0.2% (V/V). ProAlanase digestion was performed for 4 h at 37 °C in 50 mM HCl and was stopped by heating at 90 °C for 10 min. After the digestions, an additional short incubation with dithiothreitol was repeated (5 min, 37 °C).

Mass spectrometric experiments were performed on a high-resolution hybrid quadrupole-time-of-flight mass spectrometer equipped with a cyclic ion mobility separator (Waters Select Series Cyclic IMS, Waters Corp., Wilmslow, U.K.). Chromatographic separation was performed using a Waters Acquity I-Class UPLC system, coupled directly to the mass spectrometer. A Waters Acquity CSH Peptide C18 UPLC column (2.1 × 150 mm, 1.7 µm) was used for chromatography. Gradient elution was performed under the following parameters: eluent A: 0.1% formic acid in water, eluent B: 0.1% formic acid in acetonitrile; flow rate: 300 µL/min; column temperature: 60 °C; gradient: 2 min: 2%B, 80 min: 45%B, 81 min: 85%B. HDMSE experiments were performed using a single-pass cyclic ion mobility separation and fragmentation in the transfer cell with collision voltage ramping. MS data acquisition was performed with the following parameters: m/z 50–2000, V-mode, scan time: 0.3 s, single Lock Mass: leucine enkephalin; low energy: 6 V, high energy: ramping 19–45 V.
BiopharmaLynx 1.3.5 software (Waters Corp., Wilmslow, U.K.) was used for data analysis. Spectra of the hits are reported in Supplementary Note 3.

Binding site identification of KRasG12D hits by NMR spectroscopy

NMR samples contained 70 μM 15N-labeled GDP-bound KRasG12D protein, 100–600 μM binding partner, 10 mM MgCl2, 3 mM NaN3, in PBS buffer, 5% DMSO-d6, 7% D2O, 0.5% DSS and the pH was set to 7.4. 1H,15N-SOFAST-HMQC (fast version of 1H,15N-HSQC) NMR spectra were acquired at 298 K on a Bruker AVANCE III spectrometer (Bruker Biospin, Rheinstetten, Germany) operating at 700.05 MHz for 1H and 70.94 MHz for 15N, equipped with a 5 mm Prodigy TCI H&F-C/N-D, z-gradient probe head. Temperature was calibrated with a standard methanol solution. The 1H chemical shifts were referenced with respect to the 1H resonance of an internal DSS standard, while 15N chemical shifts were referenced indirectly via the gyromagnetic ratios according to the IUPAC conventions. All NMR data were processed with Bruker TOPSPIN 3.6 and analyzed in POKY software59. The shifted crosspeaks were compared to the free KRasG12D chemical shifts60. Spectra of the hits are reported in Supplementary Note 4.

Kras-SOS exchange assay

MANT-GDP loading assay: KrasG12D protein was first buffer exchanged into low magnesium buffer (20 mM HEPES-NaOH (pH 7.5), 50 mM NaCl, 0.5 mM MgCl2) using a NAP5 column (catalog no.: 17-0583-1, Cytiva). The proteins were then incubated with 20-fold molar excess of N-methylanthraniloyl (MANT)-GDP (catalog no.: 69244, Sigma-Aldrich) in loading buffer (50 mM NaCl, 20 mM HEPES-NaOH [pH 7.5], 0.5 mM MgCl2, 10 mM EDTA, and 1 mM dithiothreitol (DTT)) in a total volume of 200 μL at 20 °C for 90 min. The reaction was stopped by adding MgCl2 to a final concentration of 10 mM, then incubated at 20 °C for 30 min.
The unbound MANT-GDP was removed using the NAP-5 column equilibrated with nucleotide exchange buffer (40 mM HEPES-NaOH (pH 7.5), 50 mM NaCl, 10 mM MgCl2, 2 mM DTT).

MANT-GDP exchange assay: First, the MANT-GDP-bound Kras G12D protein (at a final concentration of 1 μM) was preincubated with the inhibitor molecules for 60 min. After the preincubation, the samples were exposed to UV light with a wavelength of 366 nm for 10 min. The UV-treated MANT-GDP Kras-inhibitor mix was loaded into a black 384-well microplate. The nucleotide exchange reaction was initiated by adding a 100-fold molar excess of GppNHp (catalog no.: G0635, Sigma-Aldrich), a non-hydrolyzable GTP analog, and the SOS1 exchange domain (catalog no.: GE02, Cytoskeleton, Inc.) protein at 0.5 μM final concentration. The change in fluorescence intensity was measured every 30 s at room temperature for 60 min on an EnSpire plate reader (PerkinElmer, Inc.). The measured fluorescence values were fitted to a single exponential function using GraphPad Prism 8 software.

STAT5B-NTD expression, purification, and MST measurements

Recombinant protein expression was performed similarly to our recently published protocol for expressing full-length STAT5B61. STAT5B N-terminal domain (1-123, NCBI Accession Number NP_036580.2) was codon optimized and synthesized by Genscript and cloned into pET28b+ plasmid using NheI and XhoI cloning sites with an N-terminal His-SUMO tag. The plasmid was used to transform BL21 RILP cells and single colonies were selected and used to inoculate 3 mL cultures in Super Broth (with 34 µg/mL chloramphenicol and 50 µg/mL kanamycin). Once cultures reached an OD600 of ~1.0, they were transferred to 1 L Super Broth (supplemented with 10 mM MgSO4, 0.1% [v/v] glucose, 34 µg/mL chloramphenicol and 50 µg/mL kanamycin). At OD600 = 1.5, the temperature was reduced to 16 °C and the media was supplemented with 1 mM IPTG and 3% (v/v) ethanol.
The cells were harvested following 20 h induction and frozen at −80 °C.

For protein purification, the cell pellets were thawed in lysis buffer and ruptured through sonication. The cell lysate was cleared by centrifugation and loaded onto a 3 mL Ni2+-NTA resin (GE Healthcare). The column was washed and the protein was eluted and loaded onto a Superdex 200 Increase 10/300 GL column (GE Healthcare). The fractions were treated with His-Ulp1 protease to cleave the His-SUMO tag and passed through a 1 mL Ni2+-NTA resin to remove any residual tag. The flow through was collected and assessed for purity via SDS-PAGE and the protein was dialyzed into 100 mM HEPES pH 7.4, 2% (v/v) glycerol. Protein concentration was determined by BCA assay (Thermo Fisher Scientific) and aliquots of N-terminal domain were flash-frozen in liquid nitrogen and stored at −80 °C. The compositions of all purification buffers are listed in additional detail in ref. 61.

For the microscale thermophoresis (MST) studies, we prepared 16 two-fold serial dilutions of compounds starting from 500 μM. Titration series were prepared that contained 10 μL of compound solutions of varying concentrations and 10 μL RED-NHS 2nd Generation labeled STAT5B NTD with a concentration of 168 nM for compound PhP097, and a concentration of 84 nM for compound PhP065. Final buffer composition included 1X PBS containing 0.5% DMSO. All measurements were taken in Premium Coated Capillaries on a Monolith NT.115 instrument (NanoTemper Technologies, Munich, Germany) using 80% infrared laser power for compound PhP097, 60% infrared laser power for compound PhP065, and an LED excitation source with λ = 650 nm at a temperature of 25 °C. Results were expressed as the mean of two separate experiments, with three technical replicates each.
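The MST titration design and the kind of binding model commonly fitted to such data can be sketched as follows. The 16-point two-fold series starting at 500 μM is taken from the text; the fraction-bound expression is the standard 1:1 binding equation that accounts for ligand depletion, offered here as an illustration because the exact model fitted in the study is not specified. Function names and the example KD are hypothetical.

```python
import math
from typing import List

def twofold_series(top_um: float, n_points: int = 16) -> List[float]:
    """Two-fold serial dilution in µM, highest concentration first."""
    return [top_um / 2 ** i for i in range(n_points)]

def fraction_bound(ligand_um: float, protein_um: float, kd_um: float) -> float:
    """Fraction of protein bound for 1:1 binding, from the quadratic solution
    of the mass-action equation with total ligand and total protein."""
    s = ligand_um + protein_um + kd_um
    return (s - math.sqrt(s * s - 4.0 * ligand_um * protein_um)) / (2.0 * protein_um)

concentrations = twofold_series(500.0)
protein_um = 0.168       # 168 nM labeled protein, as stated for PhP097
example_kd_um = 50.0     # hypothetical KD, for illustration only

for c in concentrations[:4]:
    print(f"{c:8.3f} µM -> fraction bound {fraction_bound(c, protein_um, example_kd_um):.3f}")
```

When the protein concentration is far below KD, as in this example, the curve reduces to the familiar hyperbola with half-saturation at a ligand concentration close to KD.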
GraphPad Prism 9.5.1 software was used to fit the data and to determine the KD values.

MV4-11 and MOLM-13 cell lines were purchased from DSMZ (Braunschweig, Germany), and grown at 37 °C and 5% CO2 in RPMI 1640 medium (GibcoTM, Thermo Fisher Scientific). Media were supplemented with 10% fetal calf serum (FCS), 10 U/mL penicillin, 10 µg/mL streptomycin and 2 mM L-glutamine (all GibcoTM, Thermo Fisher Scientific).

To determine the IC50 of the selected compounds on the cell lines, the CellTiter-Blue® cell viability assay (Promega) was performed. For this, cells were seeded in 96-well flat bottom plates at a cell density of 10,000 cells/well. Cells were treated in triplicate with the compound of interest at various concentrations or with 10 μM Bortezomib (S1013; Selleck Chemicals, Houston, TX, USA) as a positive control. Cell viability of treated cell lines was measured using CellTiter-Blue® after 72 h incubation. Plates were measured using a GloMax® plate reader (Promega), IC50 values were determined by non-linear regression using GraphPad Prism 9.1.1 (GraphPad Software, Inc.), and the data are reported as mean values ± SEM.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
