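An original tool for faster, more accurate primer design!
Continuing our planned series for wheat bioinformatics beginners (earlier posts are listed at the end of this article), today we reach episode six and start bringing in some code. The code was developed by Dr. Junli Zhang of the Jorge lab and handles all sorts of tricky primer design problems. I have tested it myself and it works very well; at the very least it is faster and more accurate than PolyMarker!
Most of today's content comes from the slides Junli used to teach primer design at a wheat gene cloning workshop, shared here with his permission. Since primer design software and home-made scripts are mostly in English, today's post keeps the material in its original English. If you have questions about the underlying principles, see our previous post in this series: Wheat Bioinformatics Beginner Returns (5): Series Summary and Specific Primer Design.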
Steps to design genome-specific primers
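The six steps below are the core of specific primer design; essentially all software and methods are built on them. We should not only know how to use them but also understand why they work.
1. Blast the marker sequence against the pseudomolecule and find all the homeologs and potential paralogs: >90% similarity
2. Extract the sequences for all the homeologs and potential paralogs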
3. Multiple Sequence Alignment
4. Find all the variation sites among the homeologs and paralogs
5. Use variation sites or combinations of variation sites that are unique to your targets to design primers
6. Blast all the primers against the pseudomolecule v1.0 with word length 7 to see whether they also hit other chromosomes
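Step 4 can be sketched in a few lines of Python. This is only an illustration, not the pipeline's own code: it assumes the sequences are already aligned to equal length, the function name `target_specific_sites` is made up for this example, and the 7D sequence is hypothetical.

```python
def target_specific_sites(alignment, target):
    """Return 0-based alignment columns where the target base differs
    from every other sequence in the alignment (gaps in the target are skipped)."""
    target_seq = alignment[target]
    others = [seq for name, seq in alignment.items() if name != target]
    sites = []
    for i, base in enumerate(target_seq):
        if base == "-":
            continue
        if all(seq[i] != base for seq in others):
            sites.append(i)
    return sites


# Pre-aligned toy sequences; the 7D row is a hypothetical addition.
aligned = {
    "7A": "CGAGCTTGATGACGAAGAAGGAT",
    "7B": "CGAGCTTGATGACGAAGAAGGAC",
    "7D": "CGAGCTTGATGACGAAGAAGGAC",
}
print(target_specific_sites(aligned, "7A"))  # [22]: only the last base is unique to 7A
```

A column counts only when the target differs from all other copies, which is what makes a site usable as a genome-specific 3' end.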
Common practices of PCR primers
· Length: 18 - 25 nt
· Melting temperature: around 60 °C
· GC clamp: G or C bases within the last five bases at the 3' end help promote specific binding, but more than 3 G/C should be avoided
· No secondary structures
· Avoid template secondary structure or other complex regions, such as retrotransposons
· Amplicon length: KASP and dCAPS are short (<300 bp), other markers usually < 1 kb
· Primer pair Tm difference < 5 °C
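Some of these rules are easy to check programmatically. The sketch below covers length, the GC clamp, and the primer pair Tm difference. The function names are invented for this example, and the Tm estimate uses the simple GC-content formula 64.9 + 41 × (GC − 16.4) / N as a rough stand-in; Primer3 uses a nearest-neighbor model and will give different values.

```python
def gc_clamp_count(primer, window=5):
    """Count G/C bases within the last `window` bases at the 3' end."""
    return sum(base in "GC" for base in primer.upper()[-window:])


def approx_tm(primer):
    """Rough Tm estimate (deg C) from GC content; an approximation only."""
    p = primer.upper()
    gc = p.count("G") + p.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(p)


def check_primer(primer):
    """Return a list of rule violations (an empty list means none found)."""
    issues = []
    if not 18 <= len(primer) <= 25:
        issues.append("length outside 18-25 nt")
    clamp = gc_clamp_count(primer)
    if clamp == 0:
        issues.append("no G/C in the last five bases")
    elif clamp > 3:
        issues.append("more than 3 G/C in the last five bases")
    return issues


def tm_difference_ok(forward, reverse, max_diff=5.0):
    """Check that the estimated Tm values of a primer pair differ by < max_diff."""
    return abs(approx_tm(forward) - approx_tm(reverse)) < max_diff


print(check_primer("CGAGCTTGATGACGAAGAAGGAT"))  # []
```

Secondary structure and repeat content are not checked here; those are better left to Primer3 and BLAST.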
Primer Design Tips
Not sure how many of you know the second tip below. The idea is clever, and everyone who has used it likes it!
1. Use the unique variation site as the primer 3' end
7A CGAGCTTGATGACGAAGAAGGAT
7B CGAGCTTGATGACGAAGAAGGAC
2. Two variation sites in the first 4 nt from the 3' end: we can introduce 1 mutation in the 3rd nt from the 3' end (may need to use touchdown PCR)
7A CGAGCTTGATGACGAAGAAGGAT
7B CGAGCTTGATGACGAAGAAGGAC
CGAGCTTGATGACGAAGAAGAAT
Nucleotide substitution principle:
A → C; T → C; G → A; C → T
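The second tip is mechanical enough to script. A minimal sketch, using the substitution rule above; the names `SUBSTITUTION` and `add_mismatch` are made up for this example:

```python
# Substitution rule from the text: A -> C, T -> C, G -> A, C -> T
SUBSTITUTION = {"A": "C", "T": "C", "G": "A", "C": "T"}


def add_mismatch(primer, pos_from_3prime=3):
    """Replace the base at `pos_from_3prime` (1 = the 3'-terminal base)
    according to the substitution rule."""
    idx = len(primer) - pos_from_3prime
    return primer[:idx] + SUBSTITUTION[primer[idx]] + primer[idx + 1:]


print(add_mismatch("CGAGCTTGATGACGAAGAAGGAT"))
# CGAGCTTGATGACGAAGAAGAAT
```

Applied to the 7A primer, this reproduces the modified primer shown above: the G at the 3rd position from the 3' end becomes A, so the primer now mismatches 7B at two of its last four bases while mismatching 7A at only one.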
Validate Primer Location Using Chinese Spring Nullitetrasomic (NT) Lines
If we test primers designed to target 7A:
N7AT7D (7D7B7D): Absent
N7BT7D (7A7D7D): Present
N7DT7B (7A7B7B): Present
Are our primers 7A-specific?
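The logic of this test can be written down as a small check: a primer pair is specific to a chromosome if the product is missing only in the line that lacks that chromosome. The sketch below assumes line names follow the `N7AT7D` pattern (nullisomic 7A, tetrasomic 7D); the function names are invented for illustration.

```python
import re


def nullisomic_chromosome(line_name):
    """Parse an NT line name such as 'N7AT7D' and return the missing chromosome ('7A')."""
    match = re.fullmatch(r"N(\d[ABD])T(\d[ABD])", line_name)
    if match is None:
        raise ValueError(f"unrecognized NT line name: {line_name}")
    return match.group(1)


def is_chromosome_specific(results, target):
    """`results` maps NT line name -> True if a PCR product was present.
    The primers look target-specific when the only line without a product
    is the one nullisomic for the target chromosome."""
    absent = {nullisomic_chromosome(name)
              for name, present in results.items() if not present}
    return absent == {target}


results = {"N7AT7D": False, "N7BT7D": True, "N7DT7B": True}
print(is_chromosome_specific(results, "7A"))  # True
```

With the pattern above (absent only when 7A is missing), the answer to the question is yes, as far as these three lines can tell.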
Common PCR-based genotyping methods for SNP markers
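Of the three marker types below, you are probably familiar with CAPS and KASP, but perhaps fewer of you know dCAPS, which uses the same idea as the second tip above.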
1. CAPS (Cleaved Amplified Polymorphic Sequences)
One SNP allele creates or removes a naturally occurring restriction site
Codominant
2. dCAPS (Derived CAPS)
For SNPs that do not create a natural restriction site
Uses introduced mismatches in one PCR primer to create a restriction site for one allele
Codominant
3. KASP (Kompetitive Allele Specific PCR)
A homogeneous, fluorescence-based genotyping variant of PCR
Codominant
To help you get familiar with dCAPS, here is an example:
IWB1998:CGAGCTTGATGACGAAGAAGGAGA[T/C]CGGGCAGACCCACGACGT
EcoRV: GAT'ATC
Here is another clever idea: we can add a tail to make the dCAPS primer longer, so the fragments separate better after digestion: GAAGGTGACCAAGTTCATGCTCGAGCTTGATGACGAAGAAGGATA
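A short script can confirm that the mismatched primer creates an EcoRV site for only one allele. This is a sketch for the IWB1998 example only: the "amplicon" is just the primer, the SNP base, and the downstream sequence shown above (a real amplicon would extend to the reverse primer), and the helper names are made up.

```python
ECORV_SITE = "GATATC"   # EcoRV cuts blunt in the middle: GAT'ATC
TAIL = "GAAGGTGACCAAGTTCATGCT"
DCAPS_PRIMER = "CGAGCTTGATGACGAAGAAGGATA"  # G -> T mismatch at the 3rd base from the 3' end
DOWNSTREAM = "CGGGCAGACCCACGACGT"


def amplicon(primer, allele, downstream=DOWNSTREAM):
    """Primer-derived sequence + SNP base + template sequence after the SNP."""
    return primer + allele + downstream


def cut_position(seq, site=ECORV_SITE, cut_offset=3):
    """Return the number of bases left of the cut, or None if the site is absent."""
    i = seq.find(site)
    return None if i < 0 else i + cut_offset


for allele in "TC":
    for name, primer in (("no tail", DCAPS_PRIMER), ("tailed", TAIL + DCAPS_PRIMER)):
        print(allele, name, cut_position(amplicon(primer, allele)))
```

Only the T allele gives GATATC, so only the T product is cut. The fragment removed from the primer end is 23 bases without the tail and 44 bases with it, which is why the tail makes the cut and uncut products easier to tell apart on a gel.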
Primer design software
· Primer3 (http://primer3.ut.ee/)
· PolyMarker for KASP in wheat (https://github.com/TGAC/bioruby-polyploid-tools)
· CAPS Designer (https://solgenomics.net/tools/caps_designer/caps_input.pl)
· dCAPS Finder 2.0 (http://biology4.wustl.edu/dcaps)
· indCAPS (http://indcaps.kieber.cloudapps.unc.edu/)
· GSP (Genome Specific Primers) (https://probes.pw.usda.gov/GSP/)
SNP Primer Design Pipeline
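This is the Python tool Junli wrote himself; the source code on GitHub and its documentation are linked below. I strongly recommend downloading and using it. The documentation is very detailed, and it is not hard to learn.
1. A pipeline to design KASP/CAPS/dCAPS primers for SNPs in wheat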
2. A Python script which incorporates:
· Muscle: Multiple sequence alignment program
(http://www.drive5.com/muscle/)
· Primer3: program for designing PCR primers
(http://primer3.sourceforge.net/)
· blast+: BLAST the wheat genome
(https://blast.ncbi.nlm.nih.gov/Blast.cgi)
3. I have a GitHub repository for this tool
https://github.com/pinbo/SNP_Primer_Pipeline
Here is how the pipeline works:
1. Blast each SNP sequence against the pseudomolecule and get hits that are
· >90% similarity and
· >90% of length of the best hit AND >50 bp
2. Get 500 bp on each side of the SNP for all the hits (SNP is at 501)
3. Multiple Sequence Alignment of the homeolog sequences with MUSCLE
4. Find all the variation sites that can distinguish the target from other homeologs
5. Use these sites as forced 3' end in Primer3 and design homeolog-specific primers
6. Blast all the primers against the pseudomolecule v1.0 with word length 7 to see whether they also hit other chromosomes
· Criterion of matches: < 2 mismatches in the first 4 bp from the 3' end
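The two filters in steps 1 and 6 can be sketched as follows. This is not the pipeline's actual code: the hit tuples `(chromosome, percent identity, alignment length)` are a simplified stand-in for BLAST output, the hits are assumed to be sorted best-first, and the off-target binding site is assumed to be given in the same orientation and length as the primer.

```python
def filter_hits(hits, min_identity=90.0, min_frac_of_best=0.9, min_length=50):
    """Keep hits with >90% identity whose alignment length is >90% of the
    best (first) hit's length and >50 bp."""
    if not hits:
        return []
    best_length = hits[0][2]
    return [h for h in hits
            if h[1] > min_identity
            and h[2] > min_frac_of_best * best_length
            and h[2] > min_length]


def three_prime_mismatches(primer, site, window=4):
    """Count mismatches between primer and binding site in the last `window` bases."""
    return sum(p != s for p, s in zip(primer[-window:], site[-window:]))


def counts_as_match(primer, site, window=4):
    """Step 6 criterion: fewer than 2 mismatches in the first 4 bp from the 3' end."""
    return three_prime_mismatches(primer, site, window) < 2


# Hypothetical hits, best first
hits = [("chr7A", 100.0, 101), ("chr7B", 97.0, 101), ("chr7D", 96.0, 95),
        ("chr2B", 92.0, 60), ("chr5A", 88.0, 101)]
print([h[0] for h in filter_hits(hits)])  # ['chr7A', 'chr7B', 'chr7D']

site_7B = "CGAGCTTGATGACGAAGAAGGAC"
print(counts_as_match("CGAGCTTGATGACGAAGAAGGAT", site_7B))  # True: 1 mismatch
print(counts_as_match("CGAGCTTGATGACGAAGAAGAAT", site_7B))  # False: 2 mismatches
```

The last two lines show why the introduced mismatch from the primer design tips matters: with a single 3' mismatch the 7B site still counts as a potential match, while the modified primer no longer does.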
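Further reading: the wheat bioinformatics series
1. The first post is a preface introducing a commonly used bioinformatics website, GrainGenes.
2. The next three posts mainly covered the wheat physical map and its applications; the summary of wheat genome databases in particular is basic and important knowledge. They also introduced some comparative genomics and its applications, including wild emmer wheat, Aegilops, Arabidopsis, and rice.
3. For the following posts, the editor plans to cover three topics: gene expression, specific primer design, and mutant libraries.
Wheat Bioinformatics Beginner (6): Specific Primer Design (today's post)
Wheat Bioinformatics Beginner (7): Gene Expression Data Analysis (to be continued)
Wheat Bioinformatics Beginner (8): Introduction to and Use of Wheat Mutant Databases (to be continued)
4. Other related posts:
Follow "小麦研究联盟" (Wheat Research Alliance) for the latest progress in wheat research. For submissions, reposts, collaboration, and announcements, please contact: wheatgenome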