Frequent CHD1 deletions in prostate cancers of African American men is associated with rapid disease progression

Institutional Review Board: Center for Prostate Disease Research (CPDR)

The Uniformed Services University of the Health Sciences’ (Department of Defense) Institutional Review Board (OHRP #IRB00000968; FWA #FWA00005897) reviewed the work in this study, which was “determined to be considered research not involving human subjects as defined by 32 CFR 219.102(e) because the research involves the use of de-identified specimens and data not collected specifically for this study” (IRB protocol #910230). 32 CFR 219 is the Department of Defense’s adoption of the Common Rule (45 CFR 46) and also adheres to DoD Instruction 3216.02, titled “Protection of human subjects and adherence to ethical standards in DoD-conducted and -supported research”. An informed consent form was not utilized for this study. A full HIPAA Waiver was granted for the use of the data in the Center for Prostate Disease Research (CPDR) Database Repository (IRB protocol #GT90CM). This study used already banked specimens and data from consented participants who agreed to the future use of their specimens and data from the CPDR’s repositories:

CPDR Biospecimen Bank (IRB protocol number #393738) at the Walter Reed National Military Medical Center IRB (OHRP #IRB00008418; FWA #FWA00017749).

CPDR Database Repository (IRB protocol number #GT90CM) at the Uniformed Services University of the Health Sciences IRB (OHRP #IRB00000968; FWA #FWA00005897).

Cohort selection and tissue microarray (TMA) generation

The aggregate cohort was composed of 2 independently selected cohort samples from the biospecimen bank of the Center for Prostate Disease Research and the Joint Pathology Center. Whole-mount prostates were collected from 1996 to 2008, with a minimum follow-up time of 10 years. Self-reported race was validated by genomic ancestry analysis, which showed a 95% accuracy43. The first cohort of 42 AA and 59 EA cases was described before7,43. Similarly, the second cohort of 50 AA and 50 EA cases was selected based on tissue availability (>1.0 cm tumor tissue) and tissue differentiation status (1/3 well differentiated, 1/3 moderately differentiated and 1/3 poorly differentiated). Patients who donated tissue for this study also contributed to the long-term follow-up data (the mean follow-up time was 14.5 years). TMA blocks were arranged with 10 cases per slide; each case was represented by 2 benign tissue cores, 2 prostatic intraepithelial neoplasia (PIN) cores if available, and 4–10 tumor cores covering the index and non-index focal tumors from formalin-fixed paraffin-embedded (FFPE) whole-mount blocks. The numbers of patients, tumors and tumor cores of the combined cohort are given in Supplementary Table 1d. All the blocks were sectioned into 8 µm tissue slides for FISH staining.

Fluorescence in situ hybridization (FISH) assay

A gene-specific FISH probe for CHD1 was generated by selecting a combination of bacterial artificial chromosome (BAC) clones (Thermo Fisher Scientific, Waltham, MA) within the region of observed deletions near 5q15-q21.1, resulting in a probe matching ca. 430 kbp covering the CHD1 gene as well as some upstream and downstream adjacent genomic sequences, including the complete repulsive guidance molecule B (RGMB) gene.
Due to the high degree of homology of chromosome 5-specific alpha satellite centromeric DNA to the centromere repeat sequences on other chromosomes, and the resulting potential for cross-hybridization to other centromere sequences, particularly on human chromosomes 1 and 19, a control probe matching a stable genomic region on the short arm of chromosome 5 (instead of a centromere 5 probe) was used for chromosome 5 counting (Supplementary Fig. 1e). The FISH assay of CHD1 was performed on the TMA as previously described7. The green signal came from the probe detecting the control chromosome 5 short arm, and the red signal came from the probe detecting the CHD1 gene copy. The FISH-stained TMA slides were scanned with a Leica Aperio VERSA digital pathology scanner for further evaluation. The criterion for CHD1 deletion was that in over 50% of counted cancer cells (with at least 2 copies of the chromosome 5 short arm detected in one tumor cell) more than one copy of the CHD1 gene had to be undetected. When examining tumor cores, deletions were called when more than 75% of evaluable tumor cells showed loss of an allele. Focal deletions were called when more than 25% of evaluable tumor cells showed loss of an allele, or when more than 50% of evaluable tumor cells in each gland of a cluster of two or three tumor glands showed loss of an allele. Benign prostatic glands and stroma served as built-in controls.

The sub-clonality of CHD1 deletion was presented as a heatmap showing the CHD1 deletion status in all the tumors sampled from the whole-mount sections of each patient.
The color designations were as follows: red (full deletion), meaning that all the tumor cores within a given tumor carried the CHD1 deletion; yellow (subclonal deletion), meaning that only some of the tumor cores within a given tumor carried the CHD1 deletion; and green (no deletion), meaning that no tumor core carried the CHD1 deletion (Supplementary Table 1b).

Statistical analysis

The correlations between CHD1 deletion and clinico-pathological features, including pathological stages, Gleason score sums, Grade groups, margin status, and therapy status, were calculated using an unpaired t-test or chi-square test. Gleason Grade Groups were derived from the Gleason patterns for the cohort, from Grade group 1 to Grade group 5. Due to the small sample sizes within each Grade group, Grade groups 1 through 3 were combined into one level, as were Grade groups 4 and 5. A BCR was defined as either two successive post-RP PSAs of ≥0.2 ng/mL or the initiation of salvage therapy after a rising PSA of ≥0.1 ng/mL. A metastatic event was defined by a review of each patient’s radiographic scan history, with a positive metastatic event defined as the date of a positive CT scan, bone scan, or MRI in their record. The associations of CHD1 deletion with time-to-event outcomes, including BCR and metastasis, were analyzed using Kaplan–Meier survival curves and tested using a log-rank test. Multivariable Cox proportional hazards models were used to estimate hazard ratios (HR) and 95% confidence intervals (CIs), adjusting for age at diagnosis, PSA at diagnosis, race, pathological tumor stage, grade group, and surgical margins. We checked the proportional hazards assumption by plotting the log-log survival curves. A p value < 0.05 was considered statistically significant. Analyses were performed in R version 4.0.2.

Immunohistochemistry for ERG

ERG immunohistochemistry was performed as previously described44.
Briefly, 4 μm TMA sections were dehydrated and blocked in 0.6% hydrogen peroxide in methanol for 20 min, and were processed for antigen retrieval in EDTA (pH 9.0) for 30 min in a microwave, followed by 30 min of cooling in EDTA buffer. Sections were then blocked in 1% horse serum for 40 min and were incubated with the ERG-MAb mouse monoclonal antibody developed at CPDR (9FY, Biocare Medical Inc.) at a dilution of 1:1280 for 60 min at room temperature. Sections were incubated with the biotinylated horse anti-mouse antibody at a dilution of 1:200 (Vector Laboratories) for 30 min, followed by treatment with the ABC Kit (Vector Laboratories) for 30 min. The color was developed by VIP (Vector Laboratories) treatment for 5 min, and the sections were counterstained with hematoxylin. ERG expression was reported as positive or negative. ERG protein expression was correlated with clinico-pathologic features.

TCGA SNP-array data

We analyzed Affymetrix SNP Array 6.0 data from 495 TCGA patients and preprocessed it with the aroma.affymetrix R package. We calculated principal components from the B-allele frequencies and found that PC2 and PC3 distinguished samples by ancestry. The DBSCAN algorithm identified 251 Caucasian and 46 African American patients, excluding outliers (Supplementary Fig. 1). The analysis revealed a notable depletion near the centromere of chromosome 5, with a more significant loss in African American patients, particularly around the CHD1 gene (Supplementary Fig. 2). This discovery prompted further investigation into the observed genetic differences.

Next-generation sequencing data

Whole exomes from 498 patients were downloaded from the GDC data portal and aligned to the GRCh38 reference genome. The samples included the following self-declared ancestries: 52 African American (nAA = 52), 387 European American (nEA = 387), 12 Asian American (nAS = 12), 1 American Native, and 46 not reported (Supplementary Fig. 6).
Additionally, whole genome normal-tumor pairs from 63 patients were obtained from various sources. We acquired 20 sample pairs (nAA = 2, nEA = 18) from the ICGC data portal (TCGA PRAD-US cohort), 19 sample pairs (nAA = 9, nEA = 10) from the Dana Farber Cancer Institute, 14 sample pairs (nAA = 7, nEA = 7) from the Center for Prostate Disease Research (CPDR), and 10 sample pairs (nAA = 0, nEA = 10) from the Decker et al. study45.

Evaluation of the self-declared ancestries

To identify the ancestries of the 46 unreported cases in the TCGA whole exome cohort, we determined the genotypes at key genomic SNP coordinates that are significantly more prevalent in one of the three common ancestry groups (European American, EA; African American, AA; and Asian, AS)46. A Bayes classifier was used to identify the most probable ancestries of the “not reported” cases and to detect outliers among cases with self-declared ancestries. Variants more prevalent in each of the three ancestries were chosen from the Exome Aggregation Consortium (ExAC) database, emphasizing those supported by at least 4000 African American donors and 10,000 Asian and European American donors. The top 1000 most common variants in each ancestry group, which were nearly absent in the other two groups, were selected as predictors (Supplementary Figs. 7–9).

The collected 3000 SNPs were used to create a single genotype matrix (G) with 498 rows (patients) and 3000 columns (genotypes). In this matrix, an element (G[i, j]) was set to 0 for REF/REF genotypes, 1 for heterozygous ALT/REF variants, and 2 for ALT/ALT homozygotes. Singular value decomposition was performed on matrix G to determine its singular values and their corresponding singular vectors, representing the principal components (PCs).
The projections onto 2-dimensional planes formed by the first few principal components showed that the first principal component accounted for the largest proportion of variance and best separated African American patients from European American patients. The second principal component, while representing a smaller fraction of the variance, differentiated the Asian samples from the other two ancestries (Supplementary Figs. 10 and 11). We identified and filtered outliers based on their distances in the PC1-PC2 space, focusing on the mean distance from their 10 closest neighbors of the same ancestry. These outliers were reclassified and treated similarly to samples with “not reported” ancestries (Supplementary Figs. 12 and 13).

Our approach involved training a model to learn the distribution of ancestry points in the PC1-PC2 space. We used these learned distributions to predict the likely ancestry of “not reported” and “outlier” cases based on their genotypes. Ancestry classes were encoded as follows: European American: 0, African American: 1, Asian American: 2.

The columns of the G matrix were standardized according to:$$G{\left[i,j\right]}^{* }=\frac{G\left[i,j\right]-{E}_{k}G\left[k,j\right]}{{\sigma }_{k}\left(G\left[k,j\right]\right)}$$
(1)
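The genotype encoding and the column standardization of Eq. (1) can be sketched in a few lines. This is a minimal illustration only, not the authors' R code: the function names, the string genotype labels, the use of the population standard deviation, and the choice to map zero-variance columns to 0 are all assumptions made for the example.

```python
import math

# Genotype coding described in the text: REF/REF -> 0, ALT/REF -> 1, ALT/ALT -> 2.
GENOTYPE_CODES = {"REF/REF": 0, "ALT/REF": 1, "ALT/ALT": 2}


def encode_genotypes(calls):
    """Turn a patients x SNPs table of genotype labels into the integer matrix G."""
    return [[GENOTYPE_CODES[g] for g in row] for row in calls]


def standardize_columns(G):
    """Apply Eq. (1) column by column: subtract the column mean, divide by its SD.

    Assumptions of this sketch: the population SD (divide by n) is used, and a
    column with zero variance is mapped to 0.0 instead of dividing by zero.
    """
    n_rows, n_cols = len(G), len(G[0])
    Z = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        column = [G[i][j] for i in range(n_rows)]
        mean = sum(column) / n_rows
        sd = math.sqrt(sum((v - mean) ** 2 for v in column) / n_rows)
        for i in range(n_rows):
            Z[i][j] = (column[i] - mean) / sd if sd > 0 else 0.0
    return Z


# Hypothetical toy input: 3 patients, 2 SNPs.
calls = [
    ["REF/REF", "ALT/ALT"],
    ["ALT/REF", "ALT/ALT"],
    ["ALT/ALT", "ALT/ALT"],
]
G = encode_genotypes(calls)
Z = standardize_columns(G)
```

In the toy input the second SNP is identical in all patients, so it carries no ancestry information and its standardized column is all zeros.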
The probability that sample \({{\boldsymbol{x}}}_{{\boldsymbol{i}}}={\boldsymbol{G}}[i,\bullet ]\) belongs to ancestry group \({h}_{a}\) was calculated as follows:$$P\left({y}_{i}={h}_{a}{\rm{|}}{{\boldsymbol{x}}}_{{\boldsymbol{i}}}\right)=\frac{P\left(\left.{{\boldsymbol{x}}}_{{\boldsymbol{i}}}\right|{y}_{i}={h}_{a}\right)P\left({y}_{i}={h}_{a}\right)}{{\sum }_{{h}^{{\prime} }\in H}P\left({{\boldsymbol{x}}}_{{\boldsymbol{i}}}{\rm{|}}{y}_{i}={h}^{{\prime} }\right)P\left({y}_{i}={h}^{{\prime} }\right)}$$
(2)
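Eq. (2) can be illustrated with a small Bayes classifier in the PC1-PC2 plane. The sketch assumes multivariate normal class-conditional densities and priors proportional to class size. It fits each class with the plain sample mean and covariance, which is a simplification and not necessarily how the original R classifier estimated its parameters. The function names and the toy coordinates are hypothetical.

```python
import math


def fit_gaussian(points):
    """Sample mean and 2x2 covariance (divide by n) of a list of (pc1, pc2) points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)


def gaussian_pdf(x, mean, cov):
    """Bivariate normal density; assumes a non-singular covariance matrix."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu), written out for the 2x2 case.
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def ancestry_posterior(x, training):
    """Eq. (2): posterior over ancestry classes for one sample x."""
    total = sum(len(points) for points in training.values())
    joint = {}
    for label, points in training.items():
        mean, cov = fit_gaussian(points)
        prior = len(points) / total  # relative sample size of the class
        joint[label] = gaussian_pdf(x, mean, cov) * prior
    evidence = sum(joint.values())
    return {label: value / evidence for label, value in joint.items()}


# Hypothetical training points, keyed by the encoding in the text:
# 0 = European American, 1 = African American, 2 = Asian American.
training = {
    0: [(0.0, 0.0), (1.0, 0.2), (0.2, 1.0), (1.1, 1.0), (0.5, 0.4)],
    1: [(5.0, 0.0), (6.0, 0.3), (5.2, 1.0), (6.1, 0.9), (5.5, 0.5)],
    2: [(0.0, 5.0), (1.0, 5.2), (0.3, 6.0), (1.2, 6.1), (0.6, 5.5)],
}
posterior = ancestry_posterior((5.4, 0.5), training)
predicted = max(posterior, key=posterior.get)
```

A "not reported" case would be assigned to the class with the highest posterior; in the toy data the query point falls inside the cluster labeled 1.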
The likelihood \(P({{\boldsymbol{x}}}_{{\boldsymbol{i}}}|{y}_{i}={h}_{a})\) is modeled using a multivariate normal density \({\mathscr{N}}({{\boldsymbol{x}}}_{{\boldsymbol{i}}}|{{\boldsymbol{\mu }}}_{{{\boldsymbol{h}}}_{{\boldsymbol{a}}}},{{\boldsymbol{\Sigma }}}_{{{\boldsymbol{h}}}_{{\boldsymbol{a}}}})\), with the maximum a posteriori (MAP) estimates of the parameters obtained from the classifier algorithm. The prior probability \(P({y}_{i}={h}_{a})\) is determined by the relative sample size of ancestry \({h}_{a}\), calculated as \(P({y}_{i}={h}_{a})={n}_{{h}_{a}}/{\sum }_{k}{n}_{{h}_{k}}\), where \({n}_{{h}_{k}}\) represents the number of patients from ancestry \({h}_{k}\). The 441 samples from AA, EA, and AS patients were randomly split into training (ntraining = 352, 80%) and test (ntest = 89, 20%) sets. The classifier was implemented in R, and its accuracy was estimated to be within the range of [0.949, 0.999] (Supplementary Figs. 14–19). Additionally, the model trained on TCGA whole exomes was evaluated on the full cohort of whole genomes, achieving 100% agreement between the predicted and the self-reported ancestries (Supplementary Figs. 20–22).

Identification of local subclonal loss of CHD1 in prostate adenocarcinoma

The paired germline and tumor BAM files were analyzed to determine their mean sequencing depths using bedtools genomecov47 and samtools48. The coverage data around the CHD1 gene (chr5:98,853,485–98,930,272 in GRCh38 and chr5:98,190,408–98,262,740 in GRCh37) was collected in 50 bp wide bins, resulting in m-dimensional vectors (m_GRCh37 = 1447, m_GRCh38 = 1536). These vectors were normalized based on their respective mean sequencing depths. The linear relationship between the paired germline-tumor coverages was determined in the following form:$${d}_{t}={\rm{\alpha }}+{{\rm{\beta }}}_{0}{d}_{n},$$
(3)
where \({d}_{n}\) is the normalized depth of the germline sample and \({d}_{t}\) is the normalized depth of its corresponding tumor pair. The intercept (\(\alpha\)) was used to ensure that the data was free of outliers, and the slope (\({\beta }_{0}\)) was used as a raw measure of the observable loss in the tumor. Similar slopes were calculated for 14 housekeeping genes (G6PD, IPO8, PGK1, PP1A, HMBS, GUSB, UBC, YWHAZ, GAPDH, HPRT1, ACTB, B2M, TBP, and TFRC) in each sample pair to assess the significance of the loss. The 14 estimated slopes were standardized into z-scores using their mean and standard deviation. The estimated slopes for CHD1 were also converted into z-scores using the mean and standard deviation determined above for the same donor, and p values were calculated. Samples with p values greater than 0.1 for whole genomes or 0.05 for whole exomes were labeled as “CHD1 intact,” while those with lower p values were classified as “CHD1 loss”.

The cellularity (c) of the tumors was estimated using sequenza49, with the most reliable cellularity-ploidy pair selected from the tool’s alternative solutions. To account for the uncertainty in the reported cellularity values, a beta distribution was fitted on the grid-approximated marginal posterior densities of c. These were used to simulate random variables to determine the proportion of the approximate loss of CHD1 in the tumors using the following formula:$${\beta }_{t} \sim \frac{{\beta }_{0}-1+c}{c},$$
(4)
which was derived from:$${{\rm{\beta }}}_{0}=\underbrace{1\cdot (1-c)}_{\mathrm{normal}\,\mathrm{contamination}}+\underbrace{c\cdot {{\rm{\beta }}}_{t}}_{\mathrm{contribution}\,\mathrm{from}\,\mathrm{the}\,\mathrm{tumor}}.$$
(5)
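The purity correction in Eqs. (4) and (5) can be sketched as a small Monte Carlo simulation. This is an illustration under stated assumptions, not the authors' implementation: the slope is drawn from a normal distribution with a user-supplied mean and standard deviation (standing in for its fitted uncertainty), the cellularity is drawn from a beta distribution, the corrected level is clipped to [0, 1], and all function names and numeric inputs are hypothetical.

```python
import random
import statistics


def tumor_chd1_level(beta0, c):
    """Eq. (4): CHD1 level in tumor cells after removing normal contamination."""
    return (beta0 - 1.0 + c) / c


def simulate_chd1_loss(beta0_mean, beta0_sd, alpha_c, beta_c, n_draws=10000, seed=0):
    """Draw beta0 and c repeatedly and return samples of the loss, 1 - beta_t.

    Assumptions of this sketch: beta0 ~ Normal(beta0_mean, beta0_sd),
    c ~ Beta(alpha_c, beta_c), and beta_t is clipped to the interval [0, 1].
    """
    rng = random.Random(seed)
    losses = []
    for _ in range(n_draws):
        beta0 = rng.gauss(beta0_mean, beta0_sd)
        c = rng.betavariate(alpha_c, beta_c)
        beta_t = min(1.0, max(0.0, tumor_chd1_level(beta0, c)))
        losses.append(1.0 - beta_t)
    return losses


# Point check against Eq. (5): 1 * (1 - 0.6) + 0.6 * 0.5 = 0.7.
point_estimate = tumor_chd1_level(0.7, 0.6)

# Hypothetical inputs: observed slope 0.7 +/- 0.02, cellularity centered on 0.6.
losses = simulate_chd1_loss(0.7, 0.02, alpha_c=60, beta_c=40)
mean_loss = statistics.mean(losses)
```

With these inputs roughly half of the tumor cells are estimated to have lost CHD1; the spread of `losses` reflects the combined uncertainty in the slope and the cellularity.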
The ∼ operator in Eq. (4) indicates that the true level of CHD1 can only be determined with a certain degree of accuracy, which depends on the uncertainties in β0 and c. The uncertainty in β0 arises from the fitted linear model itself:$$\begin{array}{l}{d}_{t} \sim \text{Normal}\left({\rm{\mu }},{\rm{\sigma }}\right)\\ {\rm{\mu }}={\rm{\alpha }}+{{\rm{\beta }}}_{0}{d}_{n}\\ {\rm{\alpha }} \sim \text{Normal}\left(0,5\right)\\ {{\rm{\beta }}}_{0} \sim {\text{Normal}}^{+}\left(0,5\right)\\ {\rm{\sigma }} \sim {\text{Normal}}^{+}\left(0,5\right)\end{array}$$
(6)
Sequenza provides the joint posterior distribution of ploidy and cellularity using a grid approximation. We sampled from the peak of the cellularity’s discretized marginal posterior, which matched the final copy number segments. To convert these discrete values to a continuous scale, a beta distribution was fitted to the cellularity samples: \(c\sim \text{Beta}\left({{\rm{\alpha }}}_{c},{{\rm{\beta }}}_{c}\right)\). Using the distributions of β0 and cellularity, we estimated the uncertainty in the true level of CHD1 loss, calculated as \(1-{{\rm{\beta }}}_{t}.\)

Genotyping

Variant and copy number calling were conducted in the same manner as described by Sztupinszki et al.31. Genotypes were categorized as follows: wild type (+|+) if no pathogenic or likely pathogenic variants were found in the gene, monoallelic (+|−) if at least one pathogenic germline or somatic variant or a loss of heterozygosity (LOH) was identified, and biallelic (−|−) if a pathogenic variant was present along with an LOH or a deep deletion was observed (Supplementary Fig. 24).

Local subclonal LOH-calling

The SNP variant allele frequencies (VAF) at CHD1 in the tumor were collected with GATK HaplotypeCaller50. The coverage and VAF data were inspected so that the analysis focused strictly on the regions that suffered the most pronounced loss (e.g., if only a part of the gene was lost, the unaffected regions were excluded from the analysis).
Using the tumor cellularity (c) and the estimated level of loss in the tumor (\({{\rm{\beta }}}_{{\rm{t}}}\)), we evaluated whether a heterozygous or homozygous subclonal deletion was more likely responsible for the observed frequency pattern.

The observed SNP ALT allele frequencies in the tumor sample (\(A{F}_{{{\rm{obs}}}}\)) were considered as stochastic variables generated by the following process:$$\begin{array}{ll}A{F}_{obs} \sim \underbrace{(1-c)\cdot A{F}_{\rm{normal}}}_{\mathrm{normal}\,\mathrm{contamination}}\,+\,{\rm{c}}\cdot \left[\underbrace{{{\rm{L}}}_{{\rm{true}}}\cdot {{\rm{AF}}}_{{\rm{normal}}}}_{\mathrm{Tumor}\,\mathrm{cell}\,\mathrm{with}\,\mathrm{normal}\,\mathrm{phenotype}}\right.\\\left.+\underbrace{(1-{L}_{\rm{true}})A{F}_{{\rm{tumor}}\,{\rm{subclone}}}}_{\mathrm{Tumor}\,\mathrm{cells}\,\mathrm{with}\,\mathrm{CHD1}\,\mathrm{loss}}\right]\end{array}$$
(7)
Here, c represents the cellularity of the tumor sample (a random variable approximated by a beta distribution, as described earlier), and \({L}_{{{\rm{true}}}}\) is the proportion of cancer cells with intact CHD1 in the sample, not accounting for normal contamination. \(A{F}_{{{\rm{normal}}}}\) is the distribution of allele frequencies for heterozygous SNPs in the normal sample, which is also modeled using a beta distribution:$$A{F}_{{{\rm{normal}}}} \sim \text{Beta}\left({{{\alpha }}}_{n},{{{\beta }}}_{n}\right),$$
(8)
centered on 0.5, i.e., \({{\rm{\alpha }}}_{n}\approx {{\rm{\beta }}}_{n}.\)

When the loss in the tumor is homozygous, all the reads come from either the normal cells or the tumor cells that still have the normal phenotype, meaning they have intact CHD1. To ensure that \(\left(1-c\right)+c=1\), we assume that in this case, \(A{F}_{{{\rm{tumor}}\; {\rm{subclone}}}}=A{F}_{{{\rm{normal}}}}\). This means the observed allele frequency in the homozygous loss scenario is the same as the normal allele frequency (specifically in the vicinity of the target gene), expressed as:$$A{F}_{{{\rm{obs}}}}^{{{\rm{homozygous}}}} \sim A{F}_{{{\rm{normal}}}}.$$
(9)
The same allele frequency distribution is observed when there is no deletion in the targeted gene, so in the homozygous scenario the only indication of a loss is a decrease in coverage in the tumor.

A heterozygous deletion (LOH) can occur through the loss of either the ALT allele (resulting in \(A{F}_{{{\rm{tumor}}\,{\rm{subclone}}}}=0\)) or the REF allele (resulting in \(A{F}_{{{\rm{tumor}}\; {\rm{subclone}}}}=1\)), and the distribution of the observable allele frequencies becomes bimodal. Equation (7) can be simplified to the following formula:$$A{F}_{{{\rm{obs}}}}={w}_{1}\cdot A{F}_{{{\rm{normal}}}}+{w}_{2}\cdot A{F}_{{{\rm{tumor}}\; {\rm{subclone}}}}$$
(10)
where \({w}_{1}=\left(1-c+c{L}_{{{\rm{true}}}}\right)\) and \({w}_{2}=c\left(1-{L}_{{{\rm{true}}}}\right)\) are stochastic variables that depend only on the cellularity and the estimated level of CHD1 loss, subject to the constraint \({\sum }_{i=1}^{2}{w}_{i}=1.\) In a heterozygous model, the observable allele frequencies will be generated by the following stochastic process:$$AF_{obs}^{LOH}\sim\frac{1}{2}\left(\underbrace{w_{1}\cdot AF_{\rm{normal}}+w_{2}}_{{\mathrm{in}}\,{\mathrm{case}}\,{\mathrm{the}}\,{\mathrm{REF}}\,{\mathrm{allele}}\,{\mathrm{is}}\,{\mathrm{lost}}} \right)+\frac{1}{2}\left(\underbrace{w_{1}\cdot AF_{\rm{normal}}}_{{\mathrm{in}}\,{\mathrm{case}}\,{\mathrm{the}}\,{\mathrm{ALT}}\,{\mathrm{allele}}\,{\mathrm{is}}\,{\mathrm{lost}}} \right).$$
(11)
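As a numerical illustration of Eqs. (10) and (11), take the hypothetical values \(c=0.6\) and \({L}_{{\rm{true}}}=0.5\) (chosen only for this example, not taken from the study) and evaluate \(A{F}_{{\rm{normal}}}\) at its center, 0.5:$${w}_{1}=1-c+c{L}_{{\rm{true}}}=1-0.6+0.6\cdot 0.5=0.7,\qquad {w}_{2}=c\left(1-{L}_{{\rm{true}}}\right)=0.6\cdot 0.5=0.3$$$$\text{REF allele lost: }{w}_{1}\cdot 0.5+{w}_{2}=0.65,\qquad \text{ALT allele lost: }{w}_{1}\cdot 0.5=0.35$$Under these assumptions the two modes sit symmetrically around 0.5 and are separated by \({w}_{2}=0.3\).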
The first term will produce variants with higher AFs, while the second term will produce lower AFs. The distance between the two modes is determined by \({w}_{2}\): the larger \({w}_{2}\) is, the closer the modes will be to \({AF}=0\) and \({AF}=1\).

The likelihoods that the data were produced by either a heterozygous or a homozygous process are:$${\mathcal{L}}\left(A{F}_{{{\rm{obs}}}}|\text{het. deletion}\right)=\mathop{\prod }\limits_{i=1}^{N}P\left(A{F}_{{{\rm{obs}}}_{i}}|A{F}_{{{\rm{obs}}}}^{{{\rm{LOH}}}}\right),$$
(12)
and$${\mathcal{L}}\left(A{F}_{{{\rm{obs}}}}|\text{hom. deletion}\right)=\mathop{\prod }\limits_{i=1}^{N}P\left(A{F}_{{{\rm{obs}}}_{i}}|A{F}_{{{\rm{obs}}}}^{{{\rm{homozygous}}}}\right).$$
(13)
The probability that the deletion affects only one of the alleles (i.e., it is heterozygous) can be calculated from the likelihoods:$$P\left(\text{het. deletion}\right)=\frac{{\mathcal{L}}\left(A{F}_{{{\rm{obs}}}}|\text{het. deletion}\right)}{{\mathcal{L}}\left(A{F}_{{{\rm{obs}}}}|\text{het. deletion}\right)+{\mathcal{L}}\left(A{F}_{{{\rm{obs}}}}|\text{hom. deletion}\right)}$$
(14)
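The comparison in Eqs. (9) through (14) can be sketched as follows. This is a simplified illustration, not the authors' code: the cellularity and \({L}_{{\rm{true}}}\) are treated as fixed point values instead of random variables, the densities are floored at a tiny positive number to avoid taking the logarithm of zero, and all function names and example inputs are hypothetical.

```python
import math


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x; zero outside the open interval (0, 1)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1) * math.log(x) + (b - 1) * math.log(1 - x) - log_norm)


def hom_density(af, a_n, b_n):
    """Eq. (9): under a homozygous loss, AF_obs follows AF_normal."""
    return beta_pdf(af, a_n, b_n)


def het_density(af, a_n, b_n, w1, w2):
    """Eq. (11): equal-weight mixture of the REF-lost and ALT-lost modes."""
    ref_lost = beta_pdf((af - w2) / w1, a_n, b_n) / w1  # AF = w1 * AF_normal + w2
    alt_lost = beta_pdf(af / w1, a_n, b_n) / w1         # AF = w1 * AF_normal
    return 0.5 * ref_lost + 0.5 * alt_lost


def prob_het_deletion(afs, a_n, b_n, c, l_true, floor=1e-300):
    """Eq. (14), computed from log-likelihoods for numerical stability."""
    w1 = 1.0 - c + c * l_true
    w2 = c * (1.0 - l_true)
    if w1 <= 0.0:
        raise ValueError("w1 must be positive (needs c < 1 or l_true > 0)")
    ll_het = sum(math.log(max(het_density(a, a_n, b_n, w1, w2), floor)) for a in afs)
    ll_hom = sum(math.log(max(hom_density(a, a_n, b_n), floor)) for a in afs)
    diff = ll_hom - ll_het
    if diff > 700.0:  # exp() would overflow; the probability is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(diff))


# Hypothetical allele frequencies: one bimodal set and one centered on 0.5.
bimodal = [0.33, 0.36, 0.34, 0.66, 0.64, 0.67]
unimodal = [0.48, 0.50, 0.52, 0.49, 0.51, 0.50]
p_bimodal = prob_het_deletion(bimodal, a_n=20, b_n=20, c=0.6, l_true=0.5)
p_unimodal = prob_het_deletion(unimodal, a_n=20, b_n=20, c=0.6, l_true=0.5)
```

The bimodal set favors a heterozygous deletion, while frequencies clustered around 0.5 favor the homozygous model. Propagating the uncertainty in c and \({L}_{{\rm{true}}}\), as the text describes, would require averaging this calculation over draws of both quantities.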
This process is illustrated in Supplementary Fig. 25.

Mutational signatures

Somatic point-mutational signatures were estimated with the deconstructSigs R package51. The mutational processes whose signatures, in linear combination, could produce the final mutational catalogs (a.k.a. mutational spectra) were selected in a dynamic process: signature components were investigated one by one in an iterative manner, and only those that improved the cosine similarity between the reconstructed and original spectra by a considerable margin (>0.001) were kept.

HRD-scores

The genomic scar scores (loss-of-heterozygosity: LOH, large-scale transitions: LST and number of telomeric allelic imbalances: ntAI) were calculated using the scarHRD R package52. The allele-specific segmentation data of the samples were provided by sequenza49.

Cell culture models

PC-3, 22Rv1, C4-2B and DU-145 prostate cell lines were purchased from ATCC® and grown in RPMI 1640 (Gibco) supplemented with 10% FBS (Gibco). MDA-PCa-2b cells were grown in BRFF-HPC1 media (Athena Enzyme Systems #0403) supplemented with 20% FBS (Gibco), and the growing surface was coated with FNC coating mix (Athena Enzyme Systems #0407). All the cell lines were grown at 37 °C in 5% CO2 and regularly tested negative for Mycoplasma spp. contamination. The CRISPR-edited CHD1-deficient LNCaP cell lines were generously shared by the authors13.

Stable CRISPR-Cas9 expressing isogenic PC-3 cell line generation

The full-length SpCas9 ORF was introduced into the PC-3 cell population by lentiviral transduction using the lentiCas9-Blast (Addgene #52962) construct.
After antibiotic (blasticidin) selection, surviving populations were single-cell cloned, and isogenic cell lines were generated and tested for Cas9 activity by cleavage assay.

Gene knock-out induction

CHD1 was targeted in the CRISPR-Cas9 expressing PC-3 cell line using guide RNA CHD1_ex2_g1 (gCTGACTGCCTGATTCAGATC), resulting in the PC-3 CHD1 ko 1 and CHD1 ko 2 homozygous knockout cell lines. The same guide RNA was used to transiently knock out the CHD1 gene in the 22Rv1 parental cell line.

Transfection

Cells were transiently transfected with the Nucleofector® 4D device (Lonza) using supplemented Nucleofector® SF solution and 20 μl Nucleocuvette® strips, following the manufacturer’s instructions. Following transfection, cells were resuspended in 100 μl culturing media and plated in 1.5 ml pre-warmed culturing media in a 24-well tissue culture plate. Cells were subjected to further assays 72 h post transfection.

In vitro T7 endonuclease I (T7E1) assay

Templates used for T7E1 were amplified by PCR using the CGTCAACGATGTCACTAGGC forward and ATGATTTGGGGCTTTCTGCT reverse oligos, generating a 946 bp amplicon. In total, 500 ng of PCR products were denatured and reannealed in 1x NEBuffer 2.1 (New England Biolabs) using the following protocol: 95 °C, 5 min; 95–85 °C at −2 °C/s; 85–25 °C at −0.1 °C/s; hold at 4 °C. Hybridized PCR products were then treated with 10 U of T7E1 enzyme (New England Biolabs) for 30 min in a reaction volume of 30 μl. Reactions were stopped by adding 2 μl of 0.5 M EDTA, and fragments were visualized by agarose gel electrophoresis.

Generation of SPOPF102C mutant overexpressing PC cell lines

The SPOPF102C ORF was previously cloned into the pInducer20 (Addgene #44012)53 vector and overexpressed in PC-3 and 22Rv1 wt and CHD1 knockout cells by lentiviral transduction. After G418 (500 µg/ml) antibiotic selection, surviving populations were propagated and utilized for further assays. The olaparib sensitivity assay was performed following 48 h of doxycycline (0.5 µg/µl) induction.
Endogenous wt SPOP and mutant SPOPF102C protein levels were determined using SPOP-specific (Abcam) and HA-tag (Sigma-Aldrich) antibodies, respectively.

Immunoblot analysis

Freshly harvested cells were lysed in RIPA buffer. Protein concentrations were determined with the Pierce BCA™ Protein Assay Kit (Pierce). Proteins were separated on Mini Protean TGX stain-free 4–15% gels (BioRad) and transferred to polyvinylidene difluoride membranes using iBlot 2 PVDF Regular Stacks (Invitrogen) and the iBlot transfer system (Life Technologies). Membranes were blocked in 5% BSA solution (Sigma). Primary antibodies were diluted following the manufacturer’s instructions: anti-Vinculin antibody (Cell Signaling) (1:1000) and anti-CHD1 (Novus Biologicals) (1:2000). Signals were developed using Clarity Western ECL Substrate (BioRad) and the Image Quant LAS4000 System (GE HealthCare).

Proximity ligation assay (PLA)

Cells were seeded in μ-slide 8-well chambers (Ibidi GmbH, Germany) and incubated overnight. The next day, cells were subjected to irradiation (4 Gy). Irradiated and control cells (0 Gy) were allowed to recover for 3 h, then fixed with 4% PFA and permeabilized with 0.3% Triton X-100. The Duolink® Proximity Ligation Assay (Sigma) was carried out using antibodies against γH2AX and RAD51 (Cell Signaling) according to the manufacturer’s instructions. Signals were detected by fluorescence microscopy (Nikon Ti2-e Live Cell Imaging System). Quantification of fluorescent signals was carried out using the Fiji-ImageJ software.

Sample preparation for whole genome sequencing (WGS)

DNA was extracted from 22Rv1 and PC-3 CHD1 knockout isogenic cell lines at a low passage number (22Rv1_1, PC-3_1). Following 45 passages, each CHD1 knockout isogenic cell line was single-cell cloned, and two colonies per cell line (22Rv1_2, 22Rv1_3, PC-3_2, PC-3_3) were propagated for DNA isolation. DNA was extracted using the QIAamp DNA Mini Kit (QIAGEN).
Whole genome sequencing of the DNA samples was carried out by Novogene.

Viability cell proliferation assays

Exponentially growing PC-3 cell lines (WT, CHD1 ko1 and CHD1 ko2) and 22Rv1 cell lines (WT and CHD1 ko) were seeded in 96-well plates (1500 PC-3 cells/well and 3000 22Rv1 cells/well) and incubated for 36 h to allow cell attachment. Identical cell numbers of the seeded parallel isogenic lines were verified with the Celigo Imaging Cytometer after attachment. C4-2B, MDA-PCa-2b and DU145 cells were transiently transfected with Ctrl siRNA (5’-CGUACGCGGAAUACUUCGAUUUU-3’) and CHD1 siRNA (5’-CACAAGAGCUGGAGGUCUAUU-3’) using RNAiMAX (Invitrogen, 13778-150) according to the manufacturer’s instructions. Cells were exposed to talazoparib (Selleckchem) and olaparib (MedChemExpress) for 24 h, then kept in drug-free fresh media for 5 days, after which cell growth was determined either by adding PrestoBlue™ (Invitrogen) and incubating for 2.5 h or with CellTiter-Glo (Promega, #G7572). Cell viability was determined using the BioTek plate reader system. Fluorescence was recorded at 560 nm/590 nm, and values were calculated based on the fluorescence intensity. IC50 values were determined using the AAT Bioquest IC50 calculator tool. p values were calculated using Student’s t test. p values < 0.05 were considered statistically significant.

NGS analysis of the PC-3 and 22Rv1 whole genome sequences

The reads of the six WGS samples (3 PC-3 and 3 22Rv1) were aligned to the GRCh37 reference genome using the bwa-mem54 aligner. The resulting BAM files were post-processed according to the GATK best-practices guidelines. Novel variants were called using Mutect2 (v4.1.0), with CHD1-intact WGS references downloaded from the Sequence Read Archive (SRA, accession IDs PC-3: SRX5466646, 22Rv1: SRX5437595) as “normal” and the CHD1 ko clones as “tumor” specimens50. The resulting VCF files were converted into tab-delimited files and further analyzed in R. Annotation was performed with InterVar55.
