repDilPCR: a tool for automated analysis of qPCR assays by the dilution-replicate method | BMC Bioinformatics

The workflow is summarized in Fig. 1B. repDilPCR was designed to automate and speed up data analysis. Once the input data have been uploaded in the correct format, the user can achieve all of the following with just a few clicks and within 1–2 min:

Impute missing Cq values for reference genes (using the weighted predictive mean matching method from the R package mice [10]),

Perform multiple linear regressions to get standard curves and Cq-Cq plots for all amplicons (based on Eq. 3 and 5 from the original article describing the dilution-replicate approach [8]),

Identify possible outliers,

Calculate relative quantities of the templates,

Perform statistical tests to compare experimental groups,

Prepare publication-ready plots (as in Fig. 1C),

And download the results in a suitable format: Comma-Separated Values (CSV), Portable Document Format (PDF) or Portable Network Graphics (PNG).
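The regression step listed above can be illustrated with a minimal sketch in base R. It is not the repDilPCR implementation and does not reproduce Eq. 3 and 5 of the original article; it only assumes the general idea of one multiple linear regression with a separate intercept per sample and a single slope shared by all samples. The sample names, the constructed Cq values and the variable names (toy, fit, efficiency, rel_quantity) are hypothetical.

```r
# Hypothetical toy data: two samples, each measured at dilution factors 1, 5 and 25.
# The Cq values are constructed (not measured) with a shared slope of 3.4 cycles
# per tenfold dilution and sample-specific intercepts of 20 and 23 cycles.
toy <- data.frame(
  Sample   = rep(c("ctrl_1", "treated_1"), each = 3),
  Dilution = rep(c(1, 5, 25), times = 2)
)
toy$Cq <- ifelse(toy$Sample == "ctrl_1", 20, 23) + 3.4 * log10(toy$Dilution)

# One regression for all samples: one intercept per sample, one common slope.
fit   <- lm(Cq ~ 0 + Sample + log10(Dilution), data = toy)
coefs <- coef(fit)

# The common slope gives the amplification factor per cycle
# (assumption: Cq rises linearly with log10 of the dilution factor).
slope      <- coefs[["log10(Dilution)"]]
efficiency <- 10^(1 / slope)

# The difference between the intercepts (in cycles) is converted into a
# relative quantity of "treated_1" with respect to "ctrl_1".
delta        <- coefs[["Sampletreated_1"]] - coefs[["Samplectrl_1"]]
rel_quantity <- efficiency^(-delta)
```

Because every sample contributes its dilution series to the same slope estimate, no separate standard curve is needed in this sketch; a higher intercept corresponds to less template.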

Preparation of input data

This preparatory step is the same whether one intends to use the R script or the Shiny app.

Input data have to be arranged in a CSV file following a specific format that depends on the experimental setup and type of data: (a) unprocessed Cq values obtained from an experiment performed according to the dilution-replicate approach, or (b) already calculated relative expression values. Exemplary input data tables for these two use cases are provided in the files Test_data.csv and Test_data_precalc.csv, respectively, which are available in the installation directory or can be downloaded using the buttons on the “About/Help” tab of the repDilPCR program. In the exemplary files, points are used as decimal separators and commas as field separators (to separate values in each row). It is also possible to use commas as decimal separators and semicolons as field separators, which is the default regional setting in most European countries. The program will recognize the format automatically.

Input data consisting of unprocessed Cq values (dilution-replicate approach). It is crucial that a common threshold be set for all genes that are being compared in an experiment before the Cq values are exported from the software of the qPCR machine. This is necessary because of the assumptions of the mathematical model derived in the original article and implemented in repDilPCR (see Additional file 1 for a brief summary). Depending on the manufacturer of the machine and the respective software, Cq values might be referred to as Ct (“cycle threshold”) or Cp (“crossing point”) values, but these different names stand for the same concept. Here, we adhere to the Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines and the respective terminology (Cq = quantification cycle) [1].

The CSV file needs to have the following layout. The first row contains column titles. The first three columns have predetermined names that must not be changed. The first column is called “Replicates” and should contain the names of the samples with a suffix that identifies the biological replicate. The suffix consists of an underscore (“_”) plus additional numbers and/or letters. The second column is called “Pairs” and can contain optional information about grouping of samples in pairs. The third column is called “Dilution” and contains the dilution factors according to the dilution-replicate design. For example, if the experiment was performed with fivefold serial dilutions, one can use the numbers 1, 5 and 25 as factors. The following columns should contain the Cq values for the assessed genes, first the reference genes (RG) and then the genes of interest (GOI). The titles of these columns should be the names of the respective genes/amplicons. See the Supplementary Information (Additional file 1) for further details.

Input data consisting of relative expression values. In this case, the CSV file has a simpler layout. Again, the first row contains column titles, but now only the first two columns are obligatory and have predetermined names that must not be changed: “Replicates” and “Pairs”. Their specification is the same as when Cq values are used (see above). The next columns should contain the relative expression levels (linearly scaled) of the evaluated genes of interest in each biological replicate. Accordingly, the titles of these columns should be the respective gene/amplicon names. See the Supplementary Information (Additional file 1) for further details.

Usage of the Shiny app

The Shiny app can be used via any modern web browser. Users can:

Access a publicly available Shiny server with repDilPCR installed on it, for example the installation hosted at the German Cancer Research Center (DKFZ) in Heidelberg (https://repdilpcr.eu), or

Issue the following commands in the R environment:

    library(shiny)
    runApp("~/repDilPCR/app.R", launch.browser = TRUE)

replacing the “~/repDilPCR” part with the actual path to their installation if it differs. This will launch the program and automatically open a new browser window or tab to access it.

The workflow includes the following steps:

1. Upload of properly formatted data.

2. Selection of reference genes and (optionally) imputation of missing Cq values. This whole step is only relevant when working with Cq values. Users of the imputation function should read chapter 3.2.2 of Additional file 1 and keep in mind that imputing too many missing values may lead to erroneous results.

3. Data analysis.

4. Checking the results of the regression analysis (only relevant when working with Cq values, see Additional file 1: Figs. S2 and S3).

5. Visualization of the results. Different types of plots will be available in the graphical interface depending on the chosen settings (Additional file 1: Figs. S4–S8). Possible choices are “Dot plots (all points)”, “Dot plots (means and standard deviations)”, “Bar graphs (means and standard deviations)” and “Box plots”. Graphical parameters such as font size, colour scheme, significance symbols, spacing of significance bars, and size and resolution of images can be adjusted from the control panel.

6. Statistical tests. repDilPCR aims to make testing statistical hypotheses easy even for users without much knowledge of statistics by automatically selecting appropriate statistical tests depending on the context and properties of the data. The user can choose the broad type of statistical test (parametric or non-parametric) and the comparisons to be tested for statistically significant differences (“all to one (all to reference)”, “all pairs” and “selected pairs”) by clicking on the respective radio buttons in the control panel. The significance level (α) can be freely selected. To make the use of parametric tests possible, all statistical tests are performed on logarithmically transformed data, even when the user chooses to display plots in linear scale (qPCR data are not normally distributed on a linear scale [11]). Comparisons for which the expression of a given gene of interest differs significantly between the groups will be automatically marked with p-values or asterisks, depending on the user’s choice. The statistical tests that were performed in each particular case will be listed below the respective plot.

7. Downloading results. All plots and tables that repDilPCR produces can be downloaded from the “Download results” tab. It has three subtabs: “Plots”, “Tables” and “Intermediate data”. Plots can be downloaded in the PDF or PNG file format. PDF files will be multi-page, meaning that the plots for all genes of interest will be put together in a single file on separate pages. Conversely, each PNG file will contain a single plot (gene), but all plots of a particular type will be grouped together and downloaded as a single ZIP archive. In all cases, downloaded files will have automatically created informative file names that include the name of the dataset (uploaded data file) and the plot type. Additionally, plots in logarithmic scale will have “log” in their file names. Tables will be downloaded as CSV files.
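The reasoning behind the statistical tests in step 6 (testing on logarithmically transformed data, with a choice between parametric and non-parametric tests) can be sketched in base R as follows. This is only an illustration under stated assumptions: the relative quantities are made up, and Welch's t-test and the Wilcoxon rank-sum test are stand-ins chosen for the example. The tests that repDilPCR actually selects are those listed below each plot and may differ.

```r
# Hypothetical relative quantities (linear scale) for two groups of four
# biological replicates each; the numbers are made up for illustration.
rq <- data.frame(
  Group = rep(c("control", "treated"), each = 4),
  RQ    = c(1.00, 1.30, 0.80, 1.10, 2.10, 3.40, 1.90, 2.60)
)

# Tests are run on log-transformed values, even if plots use a linear scale.
rq$logRQ <- log2(rq$RQ)

alpha <- 0.05  # significance level, freely selectable

# Parametric choice (assumption for this sketch): Welch's t-test.
parametric <- t.test(logRQ ~ Group, data = rq)

# Non-parametric choice (assumption for this sketch): Wilcoxon rank-sum test.
nonparametric <- wilcox.test(logRQ ~ Group, data = rq)

significant_parametric    <- parametric$p.value < alpha
significant_nonparametric <- nonparametric$p.value < alpha
```

The rank-based test gives the same result on linear and logarithmic scales because the transformation preserves the ordering of the values, whereas the parametric test depends on the scale, which is why the transformation matters there.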

Further details on each of the steps are given in the Supplementary Information (Additional file 1: Chapter 3.2).

Usage of the R script

Users with experience in R might prefer to use the script because of its more streamlined workflow: one just has to specify the path to the input data, set preferences for the analysis and then execute the script. All results will be automatically saved in the same directory as the raw data, without the need to click around in a graphical interface and download result files one by one. A detailed description is available in the Supplementary Information (Additional file 1: Chapter 3.3).
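Before running the script, the input layout described in the section on the preparation of input data can be checked in R. The sketch below is a minimal illustration and not code from repDilPCR: the helper read_qpcr_csv is hypothetical, the gene names RG1 and GOI1 and all Cq values are placeholders, and guessing the format from the header line is only one possible way to handle the two supported separator conventions. It also assumes that each sample name is repeated once per dilution.

```r
# Hypothetical helper (not part of repDilPCR): guess the CSV convention from
# the header line and read the file with the matching base R function.
read_qpcr_csv <- function(path) {
  header <- readLines(path, n = 1)
  if (grepl(";", header, fixed = TRUE)) {
    read.csv2(path, check.names = FALSE)  # semicolon fields, comma decimals
  } else {
    read.csv(path, check.names = FALSE)   # comma fields, point decimals
  }
}

# Made-up example file in the layout for unprocessed Cq values:
# "Replicates", "Pairs", "Dilution", then reference genes, then genes of interest.
example_file <- tempfile(fileext = ".csv")
writeLines(c(
  "Replicates;Pairs;Dilution;RG1;GOI1",
  "ctrl_1;;1;18,2;24,9",
  "ctrl_1;;5;20,6;27,3",
  "ctrl_1;;25;22,9;29,6"
), example_file)

cq <- read_qpcr_csv(example_file)

# Basic layout checks: the three predetermined column names must be present
# and the dilution factors and Cq values must have been parsed as numbers.
required <- c("Replicates", "Pairs", "Dilution")
layout_ok <- identical(names(cq)[1:3], required) &&
  is.numeric(cq$Dilution) && is.numeric(cq$RG1) && is.numeric(cq$GOI1)
```

A check of this kind catches the most common formatting problems, such as renamed obligatory columns or decimal separators that do not match the field separator.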
