Xiaoke Robot (小柯机器人)

Deep distributed computing can reconstruct large lineage trees
2022-01-09 14:09

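Nozomu Yachie's research team at the University of Tokyo, Japan, has shown that deep distributed computing can be used to reconstruct large lineage trees. The findings were published online on January 6, 2022, in the international journal Nature Biotechnology.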

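The researchers present a deep distributed computing framework to comprehensively trace accurate large lineages (FRACTAL), which substantially improves the scalability of current lineage estimation software tools. FRACTAL first reconstructs only the upstream lineage of the input sequences, then recursively repeats the same procedure on the downstream lineages using independent computing nodes. The researchers demonstrated FRACTAL's utility by reconstructing lineages from more than 235 million simulated sequences and from more than 16 million cells in a simulated experiment with a CRISPR system that accumulates mutations during cell proliferation. They also successfully applied FRACTAL to evolutionary tree reconstruction and to an experiment that used error-prone PCR (EP-PCR) for large-scale sequence diversification.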
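The "upstream first, then recurse downstream" idea can be illustrated with a toy divide-and-conquer sketch in Python. This is not FRACTAL's implementation. The function names (`hamming`, `small_tree`, `reconstruct`, `leaves`), the parameters `threshold` and `sample_size`, and the nearest-seed assignment used to split the input are all assumptions made for illustration. The recursive calls run sequentially here, whereas the brief describes them as running on independent computing nodes.

```python
import random
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def small_tree(seqs):
    """Naive average-linkage clustering (hypothetical stand-in for an
    existing lineage estimation tool). Returns nested 2-tuples of names."""
    clusters = [(name, [name]) for name in sorted(seqs)]
    while len(clusters) > 1:
        def avg_dist(pair):
            (_, m1), (_, m2) = pair
            total = sum(hamming(seqs[a], seqs[b]) for a in m1 for b in m2)
            return total / (len(m1) * len(m2))
        c1, c2 = min(combinations(clusters, 2), key=avg_dist)
        clusters = [c for c in clusters if c is not c1 and c is not c2]
        clusters.append(((c1[0], c2[0]), c1[1] + c2[1]))
    return clusters[0][0]


def reconstruct(seqs, threshold=4, sample_size=8, rng=None):
    """Toy recursion: resolve the top split from a small sample, then
    repeat the same procedure on each downstream group."""
    rng = rng or random.Random(0)
    if len(seqs) <= threshold:
        return small_tree(seqs)
    # Upstream step: inspect only a small random sample of the input.
    sample = rng.sample(sorted(seqs), min(sample_size, len(seqs)))
    a, b = max(combinations(sample, 2),
               key=lambda p: hamming(seqs[p[0]], seqs[p[1]]))
    if hamming(seqs[a], seqs[b]) == 0:
        # The sample carries no signal; fall back to the direct method.
        return small_tree(seqs)
    # Assumption: assign every sequence to the nearer of the two seeds.
    left, right = {}, {}
    for name, s in seqs.items():
        nearer = left if hamming(s, seqs[a]) <= hamming(s, seqs[b]) else right
        nearer[name] = s
    # Downstream step: the two calls are independent of each other, so
    # each could be dispatched to a separate computing node.
    return (reconstruct(left, threshold, sample_size, rng),
            reconstruct(right, threshold, sample_size, rng))


def leaves(tree):
    """Flatten a nested-tuple tree into a list of leaf names."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])
```

Because each seed is always assigned to its own group, both groups are non-empty and strictly smaller than the input, so the recursion terminates. A real pipeline would replace the nearest-seed split with a proper placement of sequences onto the upstream tree.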

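By way of background, phylogeny estimation (the reconstruction of evolutionary trees) has recently been applied to CRISPR-based cell lineage tracing, which allows the developmental history of an individual tissue or organism to be inferred from a large number of mutated sequences in somatic cells. However, current computational methods cannot construct phylogenetic trees from extremely large numbers of input sequences.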

Appendix: original English text

Title: Deep distributed computing to reconstruct extremely large lineage trees

Author: Naoki Konno, Yusuke Kijima, Keito Watano, Soh Ishiguro, Keiichiro Ono, Mamoru Tanaka, Hideto Mori, Nanami Masuyama, Dexter Pratt, Trey Ideker, Wataru Iwasaki, Nozomu Yachie

Issue&Volume: 2022-01-06

Abstract: Phylogeny estimation (the reconstruction of evolutionary trees) has recently been applied to CRISPR-based cell lineage tracing, allowing the developmental history of an individual tissue or organism to be inferred from a large number of mutated sequences in somatic cells. However, current computational methods are not able to construct phylogenetic trees from extremely large numbers of input sequences. Here, we present a deep distributed computing framework to comprehensively trace accurate large lineages (FRACTAL) that substantially enhances the scalability of current lineage estimation software tools. FRACTAL first reconstructs only an upstream lineage of the input sequences and recursively iterates the same procedure for its downstream lineages using independent computing nodes. We demonstrate the utility of FRACTAL by reconstructing lineages from >235 million simulated sequences and from >16 million cells from a simulated experiment with a CRISPR system that accumulates mutations during cell proliferation. We also successfully applied FRACTAL to evolutionary tree reconstructions and to an experiment using error-prone PCR (EP-PCR) for large-scale sequence diversification. Cell lineage tracing is scaled up to hundreds of millions of simulated sequences with distributed computing.

DOI: 10.1038/s41587-021-01111-2

Source: https://www.nature.com/articles/s41587-021-01111-2

Nature Biotechnology: launched in 1996. Published by Springer Nature. Latest IF: 68.164
Official website: https://www.nature.com/nbt/
Submission link: https://mts-nbt.nature.com/cgi-bin/main.plex


Article in this issue: Nature Biotechnology, published online
