altAFplotter: a web app for reliable UPD detection in NGS diagnostics | BMC Bioinformatics

The identification of UPDs and their classification as isodisomy, heterodisomy, mixed or segmental iso- and heterodisomy can be achieved by examining runs of homozygosity (ROHs) and inheritance patterns per chromosome. A batch evaluation of positive controls from previously described cases [4] and our patient cohort (largely Western Europeans) of ca. 9000 large panel and exome sequencing samples [5] was used to determine cutoffs for chromosome flagging. Most of these samples were processed according to the GATK best practices current at the time. Cutoffs for whole genome sequencing data might differ and will be adjusted in a future version. The cutoffs for flagging were selected to ensure highly sensitive detection (27/27 positive controls are detected, see Fig. 1A–C) at the cost of increased false positives (2% in our exome cohort analysis, excluding consanguineous individuals). As the tool is designed for case-by-case evaluation with manual inspection, we reason that high sensitivity is the appropriate approach for a diagnostic setting. For the same reason, we recommend using unfiltered or only moderately quality-filtered VCF files. Overly stringent quality filtering can favor homozygous variants, which would decrease the sensitivity of ROH detection. The cutoffs for consanguinity detection are also based on our cohort and were chosen to separate out those families for which consanguinity was reported.
Fig. 1 Determination of cutoffs. Each data point represents the respective metric (ROH coverage or inheritance ratio) on one chromosome. “Cohort” refers to an evaluation of ~6200 whole exomes (a subset of the entire cohort: only whole exomes, no consanguineous patients); “iso- and heterodisomy”, “isodisomy” and “heterodisomy” refer to the positive controls used to define the cutoffs. (A) ROH coverage cutoffs as defined for isodisomies (0.7, orange line) and mixed UPDs (0.2, red line). Heterodisomies cannot be identified based on ROHs. (B) Inheritance ratios for trio analyses; shown here is the ratio of maternal over paternal variants. The cutoff (2, red line) includes all positive controls and can be used to identify all three types of UPDs. (C) For duo analyses, the ratio of maternal over non-maternal or paternal over non-paternal variants can be used to identify UPDs. The cutoff (5, orange line) is chosen to include all positive controls.

Chromosome flagging is informed by the applied method, ROH detection and inheritance ratio:

Runs of homozygosity: flags are applied if more than 70% (roh_high) or between 20% and 70% (roh_mixed) of the chromosome is covered by ROHs. Applicable to both single- and multi-sample analyses.

Inheritance ratio: this value describes the ratio between maternal and paternal variants and vice versa in trio setups. For duos (index and one parent) it describes the ratio between maternal and non-maternal or paternal and non-paternal variants. The cutoffs were chosen as follows: > 2 for trios and > 5 for duos. Chromosomes exceeding the applicable cutoff receive the flag inh_ratio_high. This allows the detection of iso- and heterodisomies and the identification of the parental origin.

Consanguinity: if more than three chromosomes exceed a ROH coverage of 10% per patient, the flag consanguinity_likely is applied. Such cases cannot be reliably evaluated by this approach and require careful manual inspection.

Insufficient SNVs: if fewer than 200 SNVs are present on a chromosome, the chromosome is excluded from analysis and the flag insufficient_snvs is applied.
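The flagging rules listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the web app: the function and constant names are hypothetical, the ROH coverage and inheritance ratio are assumed to be precomputed per chromosome, and the handling of values lying exactly on a cutoff is a guess where the text does not specify it.

```python
# Cutoffs as stated in the text; the constant names are illustrative.
ROH_HIGH = 0.70          # > 70% ROH coverage -> roh_high
ROH_MIXED = 0.20         # 20% to 70% ROH coverage -> roh_mixed
INH_RATIO_TRIO = 2.0     # inheritance ratio cutoff for trios
INH_RATIO_DUO = 5.0      # inheritance ratio cutoff for duos
MIN_SNVS = 200           # chromosomes with fewer SNVs are excluded
CONSANG_ROH = 0.10       # per-chromosome ROH coverage counted for consanguinity
CONSANG_MIN_CHROMS = 3   # flag when MORE than three chromosomes exceed 10%


def flag_chromosome(n_snvs, roh_coverage, inh_ratio=None, setup="single"):
    """Return the list of flags for one chromosome.

    setup is "single", "duo" or "trio" (hypothetical labels);
    inh_ratio is ignored for single-sample analyses.
    """
    # Chromosomes with too few SNVs are excluded from all further checks.
    if n_snvs < MIN_SNVS:
        return ["insufficient_snvs"]

    flags = []
    if roh_coverage > ROH_HIGH:
        flags.append("roh_high")
    elif roh_coverage >= ROH_MIXED:  # boundary handling is an assumption
        flags.append("roh_mixed")

    if inh_ratio is not None and setup in ("trio", "duo"):
        cutoff = INH_RATIO_TRIO if setup == "trio" else INH_RATIO_DUO
        if inh_ratio > cutoff:
            flags.append("inh_ratio_high")
    return flags


def consanguinity_likely(roh_coverages):
    """True if more than three chromosomes exceed 10% ROH coverage."""
    return sum(c > CONSANG_ROH for c in roh_coverages) > CONSANG_MIN_CHROMS
```

For example, `flag_chromosome(1000, 0.85, 3.0, "trio")` yields both `roh_high` and `inh_ratio_high`, the combination one would expect to see on a chromosome with an isodisomy of known parental origin.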

For the interpretation of chromosome flags and their various combinations, Table 1 can be consulted. Large deletions also lead to longer ROHs; the user is therefore given an additional hint to check for them whenever a ROH flag is applied. For this reason, and as a general recommendation, validation of UPDs by a second method is strongly advised. Besides the flagging, detailed and interactive plots are available to allow for investigation of affected chromosomes (see Figure S1).
Table 1 Interpretation guidelines for chromosome-specific UPD flags as shown in the web app

There are some limitations for the analysis of VCF files:

Size: in the current iteration, we limit the size of VCF files to 200 MB per file to ensure rapid on-demand processing. This limits the usage of unfiltered whole genome sequencing data. Future versions will support slimmer data formats such as .baf files.

Panels: ROHs tend to be overestimated if the number of variants is too low. We found that 200 variants per chromosome allow reliable detection of real isodisomies and therefore exclude chromosomes with fewer than 200 SNVs from analysis to prevent ROH-calling artefacts. Panels must therefore be large enough to provide this number of SNVs per chromosome for reliable UPD detection.
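Both limitations can be checked locally before a file is uploaded. The sketch below is a hypothetical pre-check, not part of altAFplotter: it assumes that "200 MB" means 200 × 1024² bytes and that an SNV is a record with a single-base REF and at least one single-base ALT allele, which may differ from how the web app counts variants.

```python
import gzip
import os
from collections import Counter

MAX_BYTES = 200 * 1024 * 1024  # assumed interpretation of the 200 MB limit
MIN_SNVS = 200                 # per-chromosome minimum stated in the text
BASES = set("ACGT")


def open_vcf(path):
    """Open a plain or gzip/bgzip-compressed VCF as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def count_snvs_per_chromosome(lines):
    """Count records per chromosome that look like SNVs."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and empty lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a valid VCF data line
        chrom, ref, alts = fields[0], fields[3].upper(), fields[4].upper().split(",")
        # SNV: single-base REF and at least one single-base ALT allele.
        if ref in BASES and any(alt in BASES for alt in alts):
            counts[chrom] += 1
    return counts


def precheck(path):
    """Return {chromosome: True/False}, True if enough SNVs are present."""
    if os.path.getsize(path) > MAX_BYTES:
        raise ValueError(f"{path} exceeds the assumed 200 MB upload limit")
    with open_vcf(path) as handle:
        counts = count_snvs_per_chromosome(handle)
    return {chrom: n >= MIN_SNVS for chrom, n in counts.items()}
```

Chromosomes that map to `False` in the result of `precheck` would be expected to receive the insufficient_snvs flag; chromosomes without any SNV do not appear in the result at all.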
