
2020-12-09 13:07

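The research groups of Tobias Marschall at Heinrich Heine University Düsseldorf, Germany, and Evan E. Eichler at the University of Washington, USA, have together produced a fully phased human genome assembly without parental data by combining single-cell strand sequencing with long reads. The work was published in Nature Biotechnology on December 7, 2020.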


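The resulting assemblies are accurate (quality value > 40) and highly contiguous (contig N50 > 23 Mbp), with a low switch error rate (0.17%), and they provide fully phased single-nucleotide variants, indels and structural variants. By comparing phased assemblies built from Oxford Nanopore Technologies and Pacific Biosciences data, the researchers identified 154 regions where contigs preferentially break, regardless of sequencing technology or phasing algorithm.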
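For readers unfamiliar with the two headline metrics, here is a minimal sketch of how they are conventionally defined. It is not code from the paper. The function names and the toy contig lengths are hypothetical, and the Phred-style relation QV = -10 * log10(error probability) is the usual convention, assumed here to be what the authors report.

```python
import math


def n50(contig_lengths):
    """Return the contig N50: the largest length L such that contigs of
    length >= L together cover at least half of all assembled bases."""
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate):
    """Convert a per-base error probability to a Phred-scaled quality value."""
    return -10 * math.log10(error_rate)


if __name__ == "__main__":
    toy_contigs = [2, 3, 4, 5, 6]  # hypothetical lengths, not data from the study
    print("N50 of toy contigs:", n50(toy_contigs))
    print("Error rate at QV 40:", qv_to_error_rate(40))
```

Under this convention, a quality value above 40 corresponds to fewer than one base error per 10,000 bases, and a contig N50 above 23 Mbp means that at least half of the assembled sequence lies in contigs longer than 23 Mbp.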



Title: Fully phased human genome assembly without parental data using single-cell strand sequencing and long reads

Author: David Porubsky, Peter Ebert, Peter A. Audano, Mitchell R. Vollger, William T. Harvey, Pierre Marijon, Jana Ebler, Katherine M. Munson, Melanie Sorensen, Arvis Sulovari, Marina Haukness, Maryam Ghareghani, Peter M. Lansdorp, Benedict Paten, Scott E. Devine, Ashley D. Sanders, Charles Lee, Mark J. P. Chaisson, Jan O. Korbel, Evan E. Eichler, Tobias Marschall

Issue&Volume: 2020-12-07

Abstract: Human genomes are typically assembled as consensus sequences that lack information on parental haplotypes. Here we describe a reference-free workflow for diploid de novo genome assembly that combines the chromosome-wide phasing and scaffolding capabilities of single-cell strand sequencing [1,2] with continuous long-read or high-fidelity [3] sequencing data. Employing this strategy, we produced a completely phased de novo genome assembly for each haplotype of an individual of Puerto Rican descent (HG00733) in the absence of parental data. The assemblies are accurate (quality value > 40) and highly contiguous (contig N50 > 23 Mbp) with low switch error rates (0.17%), providing fully phased single-nucleotide variants, indels and structural variants. A comparison of Oxford Nanopore Technologies and Pacific Biosciences phased assemblies identified 154 regions that are preferential sites of contig breaks, irrespective of sequencing technology or phasing algorithms.

DOI: 10.1038/s41587-020-0719-5

Source: https://www.nature.com/articles/s41587-020-0719-5

Nature Biotechnology: founded in 1996 and published by Springer Nature. Latest impact factor (IF): 68.164

