Xiaoke Robot

A robust benchmark for detecting large germline deletions and insertions
2020-06-17 20:37

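Justin M. Zook's research group at the U.S. National Institute of Standards and Technology (NIST) recently reported a new result: a robust benchmark for detecting large germline deletions and insertions. The work was published in Nature Biotechnology on June 15, 2020.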

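To help translate genomic structural variant (SV) detection methods into routine research and clinical practice, the team developed a sequence-resolved benchmark set for identifying both false-negative and false-positive germline large insertions and deletions. To create this benchmark for a broadly consented son in a Personal Genome Project trio, for whom cells and DNA are broadly available, the Genome in a Bottle Consortium integrated 19 sequence-resolved variant calling methods from diverse technologies.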

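The final benchmark set contains 12,745 isolated, sequence-resolved insertion (7,281) and deletion (5,464) calls of ≥50 base pairs (bp). The Tier 1 benchmark regions, within which any extra calls are putative false positives, cover 2.51 Gbp and contain 5,262 insertions and 4,095 deletions supported by ≥1 diploid assembly. The authors show that the benchmark set reliably identifies false negatives and false positives in high-quality SV callsets from short-, linked- and long-read sequencing and optical mapping.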
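To make the roles of the benchmark calls and the benchmark regions concrete, here is a minimal Python sketch of how a callset might be scored against such a benchmark. It is an illustration under stated assumptions, not the consortium's comparison method: the `SV` record, the `in_regions`, `matches` and `compare` helpers, and the matching thresholds (`max_dist`, `min_size_ratio`) are all hypothetical. Only the ≥50 bp size cutoff and the rule that extra calls inside the benchmark regions count as putative false positives come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    chrom: str
    pos: int      # start position of the variant
    svtype: str   # "INS" or "DEL"
    size: int     # variant length in bp


def in_regions(sv, regions):
    """True if the SV start lies inside any (chrom, start, end) benchmark region."""
    return any(c == sv.chrom and s <= sv.pos <= e for c, s, e in regions)


def matches(a, b, max_dist=1000, min_size_ratio=0.7):
    """Hypothetical matching rule: same chromosome and type, nearby, similar size."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return False
    if abs(a.pos - b.pos) > max_dist:
        return False
    return min(a.size, b.size) / max(a.size, b.size) >= min_size_ratio


def compare(calls, benchmark, regions):
    """Score a callset against benchmark calls, restricted to benchmark regions."""
    # Only calls of at least 50 bp that fall inside the benchmark regions are scored.
    calls = [c for c in calls if c.size >= 50 and in_regions(c, regions)]
    truth = [b for b in benchmark if in_regions(b, regions)]
    # Extra calls inside the regions are putative false positives.
    false_pos = [c for c in calls if not any(matches(c, b) for b in truth)]
    # Benchmark calls with no matching query call are false negatives.
    false_neg = [b for b in truth if not any(matches(c, b) for c in calls)]
    precision = (len(calls) - len(false_pos)) / len(calls) if calls else 0.0
    recall = (len(truth) - len(false_neg)) / len(truth) if truth else 0.0
    return false_pos, false_neg, precision, recall


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    regions = [("chr1", 1, 100000)]
    benchmark = [SV("chr1", 1000, "DEL", 300), SV("chr1", 5000, "INS", 120)]
    calls = [SV("chr1", 1010, "DEL", 310), SV("chr1", 9000, "INS", 80)]
    fp, fn, precision, recall = compare(calls, benchmark, regions)
    print(fp, fn, precision, recall)
```

Restricting both sets to the benchmark regions before scoring is the key design point: outside those regions the benchmark makes no claim, so an unmatched call there cannot be counted as an error.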

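By way of background, new technologies and analysis methods are enabling genomic SVs to be detected with ever-increasing accuracy, resolution and comprehensiveness.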

Appendix: original English text

Title: A robust benchmark for detection of germline large deletions and insertions

Author: Justin M. Zook, Nancy F. Hansen, Nathan D. Olson, Lesley Chapman, James C. Mullikin, Chunlin Xiao, Stephen Sherry, Sergey Koren, Adam M. Phillippy, Paul C. Boutros, Sayed Mohammad E. Sahraeian, Vincent Huang, Alexandre Rouette, Noah Alexander, Christopher E. Mason, Iman Hajirasouliha, Camir Ricketts, Joyce Lee, Rick Tearle, Ian T. Fiddes, Alvaro Martinez Barrio, Jeremiah Wala, Andrew Carroll, Noushin Ghaffari, Oscar L. Rodriguez, Ali Bashir, Shaun Jackman, John J. Farrell, Aaron M. Wenger, Can Alkan, Arda Soylev, Michael C. Schatz, Shilpa Garg, George Church, Tobias Marschall, Ken Chen, Xian Fan, Adam C. English, Jeffrey A. Rosenfeld, Weichen Zhou, Ryan E. Mills, Jay M. Sage, Jennifer R. Davis, Michael D. Kaiser, John S. Oliver, Anthony P. Catalano, Mark J. P. Chaisson, Noah Spies, Fritz J. Sedlazeck, Marc Salit

Issue&Volume: 2020-06-15

Abstract: New technologies and analysis methods are enabling genomic structural variants (SVs) to be detected with ever-increasing accuracy, resolution and comprehensiveness. To help translate these methods to routine research and clinical practice, we developed a sequence-resolved benchmark set for identification of both false-negative and false-positive germline large insertions and deletions. To create this benchmark for a broadly consented son in a Personal Genome Project trio with broadly available cells and DNA, the Genome in a Bottle Consortium integrated 19 sequence-resolved variant calling methods from diverse technologies. The final benchmark set contains 12,745 isolated, sequence-resolved insertion (7,281) and deletion (5,464) calls ≥50 base pairs (bp). The Tier 1 benchmark regions, for which any extra calls are putative false positives, cover 2.51 Gbp and 5,262 insertions and 4,095 deletions supported by ≥1 diploid assembly. We demonstrate that the benchmark set reliably identifies false negatives and false positives in high-quality SV callsets from short-, linked- and long-read sequencing and optical mapping.

DOI: 10.1038/s41587-020-0538-8

Source: https://www.nature.com/articles/s41587-020-0538-8

Nature Biotechnology: launched in 1996 and part of the Springer Nature publishing group. Latest impact factor: 68.164
Official website: https://www.nature.com/nbt/
Submission link: https://mts-nbt.nature.com/cgi-bin/main.plex


Article in this issue: Nature Biotechnology, published online
