
Counting Point Mutations

Read 1774 times | 2015-11-4 20:54 | Category: Research Notes


Evolution as a Sequence of Mistakes

Figure 1. A point mutation in DNA changing a C-G pair to an A-T pair.

A mutation is simply a mistake that occurs during the creation or copying of a nucleic acid, in particular DNA. Because nucleic acids are vital to cellular functions, mutations tend to cause a ripple effect throughout the cell. Although mutations are technically mistakes, a very rare mutation may equip the cell with a beneficial attribute. In fact, the macro effects of evolution are attributable to the accumulated result of beneficial microscopic mutations over many generations.

The simplest and most common type of nucleic acid mutation is a point mutation, which replaces one base with another at a single nucleotide. In the case of DNA, a point mutation must change the complementary base accordingly; see Figure 1.
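
As a rough illustration of the paragraph above, the following Python sketch applies a point mutation to one strand and recomputes the paired bases. The helper names `point_mutate` and `complement` and the example strand are assumptions made for this illustration, not part of the original problem; `complement` pairs bases position by position and does not reverse the strand.

```python
# Watson-Crick base pairing for DNA.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Return the base paired with each position (no reversal)."""
    return "".join(PAIR[base] for base in strand)

def point_mutate(strand, pos, new_base):
    """Return a copy of strand with the base at 0-based index pos replaced."""
    if new_base not in PAIR:
        raise ValueError("not a DNA base: %r" % new_base)
    if not 0 <= pos < len(strand):
        raise IndexError("position out of range")
    return strand[:pos] + new_base + strand[pos + 1:]

strand = "GCAT"                              # hypothetical example strand
mutated = point_mutate(strand, 1, "A")       # C -> A, as in Figure 1
print(strand, complement(strand))            # GCAT CGTA
print(mutated, complement(mutated))          # GAAT CTTA
```

At index 1 the pair changes from C-G to A-T, which is the situation Figure 1 describes.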

Two DNA strands taken from the genomes of different organisms or species are homologous if they share a recent ancestor; thus, counting the number of bases at which homologous strands differ provides us with the minimum number of point mutations that could have occurred on the evolutionary path between the two strands.

We are interested in minimizing the number of (point) mutations separating two species because of the biological principle of parsimony, which demands that evolutionary histories should be explained as simply as possible.

Problem

Figure 2. The Hamming distance between these two strings is 7. Mismatched symbols are colored red.

Given two strings s and t of equal length, the Hamming distance between s and t, denoted dH(s,t), is the number of corresponding symbols that differ in s and t. See Figure 2.
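
The definition above translates almost word for word into Python. This is a minimal sketch, not the only way to write it: the function name `hamming_distance` and the two short example strings are assumptions chosen for illustration, and raising `ValueError` on unequal lengths is one possible way to handle input the definition does not cover.

```python
def hamming_distance(s, t):
    """Count the positions at which s and t hold different symbols."""
    if len(s) != len(t):
        raise ValueError("Hamming distance needs strings of equal length")
    # zip pairs up corresponding symbols; each mismatch contributes 1.
    return sum(1 for a, b in zip(s, t) if a != b)

# Hypothetical example strings: they differ at indices 2 and 5.
print(hamming_distance("GATTACA", "GACTATA"))   # 2
```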

Given: Two DNA strings s and t of equal length (not exceeding 1 kbp).

Return: The Hamming distance dH(s,t).

Sample Dataset

GAGCCTACTAACGGGAT

CATCGTAATGACGGCCT
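
Before computing anything, it can help to check that a dataset actually meets the "Given" conditions. The sketch below assumes the dataset arrives as text with one DNA string per line (blank lines ignored) and that 1 kbp means 1000 bases; the name `parse_dataset` and the constant `MAX_LENGTH` are made up for this example.

```python
MAX_LENGTH = 1000  # assumed reading of "not exceeding 1 kbp"

def parse_dataset(text):
    """Split a two-line dataset into (s, t) and check the stated constraints."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) != 2:
        raise ValueError("expected exactly two DNA strings")
    s, t = lines
    for strand in (s, t):
        if set(strand) - set("ACGT"):
            raise ValueError("unexpected symbol in DNA string")
        if len(strand) > MAX_LENGTH:
            raise ValueError("DNA string longer than 1 kbp")
    if len(s) != len(t):
        raise ValueError("strings differ in length")
    return s, t

sample = """GAGCCTACTAACGGGAT

CATCGTAATGACGGCCT
"""
s, t = parse_dataset(sample)
print(len(s), len(t))   # 17 17
```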

Sample Output

7

For the sample above, the following short script is used:

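    #!/usr/bin/python

    s1 = 'GAGCCTACTAACGGGAT'
    s2 = 'CATCGTAATGACGGCCT'

    # The Hamming distance is only defined for strings of equal length.
    assert len(s1) == len(s2)

    # Walk both strings in step and count the positions that differ.
    c = 0
    for a, b in zip(s1, s2):
        if a != b:
            c += 1

    print(c)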
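
Figure 2 highlights where the two strings disagree, not just how many times. As a small extension, the sketch below returns the mismatch positions themselves; the function name `mismatch_positions` and the choice of 0-based indices are assumptions for this example. The length of the returned list is the Hamming distance.

```python
def mismatch_positions(s, t):
    """Return the 0-based indices at which s and t differ."""
    if len(s) != len(t):
        raise ValueError("strings must have equal length")
    return [i for i, (a, b) in enumerate(zip(s, t)) if a != b]

s1 = "GAGCCTACTAACGGGAT"
s2 = "CATCGTAATGACGGCCT"
positions = mismatch_positions(s1, s2)
print(positions)        # [0, 2, 4, 7, 9, 14, 15]
print(len(positions))   # 7
```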

To repost this article, please contact the original author for permission, and credit 顾海丰's ScienceNet (科学网) blog as the source.
Link: https://wap.sciencenet.cn/blog-2887147-933374.html
