Best practices for benchmarking germline small-variant calls in human genomes.

Overview

abstract

Standardized benchmarking approaches are required to assess the accuracy of variants called from sequence data. Although variant-calling tools and the metrics used to assess their performance continue to improve, important challenges remain. Here, as part of the Global Alliance for Genomics and Health (GA4GH), we present a benchmarking framework for variant calling. We provide guidance on how to match variant calls with different representations, define standard performance metrics, and stratify performance by variant type and genome context. We describe limitations of high-confidence calls and regions that can be used as truth sets (for example, single-nucleotide variant concordance of two methods is 99.7% inside versus 76.5% outside high-confidence regions). Our web-based app enables comparison of variant calls against truth sets to obtain a standardized performance report. Our approach has been piloted in the PrecisionFDA variant-calling challenges to identify the best-in-class variant-calling methods within high-confidence regions. Finally, we recommend a set of best practices for using our tools and evaluating the results.

authors

Krusche, Peter
Trigg, Len
Boutros, Paul C
Mason, Christopher E
De La Vega, Francisco M
Moore, Benjamin L
Gonzalez-Porta, Mar
Eberle, Michael A
Tezak, Zivana
Lababidi, Samir
Truty, Rebecca
Asimenos, George
Funke, Birgit
Fleharty, Mark
Chapman, Brad A
Salit, Marc
Zook, Justin M

publication date

March 11, 2019

published in

Nature Biotechnology

Research

keywords

Benchmarking
Exome
Genome, Human
High-Throughput Nucleotide Sequencing

Identity

PubMed Central ID

PMC6699627

Scopus Document Identifier

85062854104

Digital Object Identifier (DOI)

10.1101/101378

PubMed ID

30858580

Additional Document Info

has global citation frequency

130

volume

37

issue

5

VIVO Weill Cornell Medical College

Best practices for benchmarking germline small-variant calls in human genomes. Academic Article
