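1. CAFE: see https://github.com/younglululu/CAFE
2. AlignmentFree: see https://github.com/GeniusTang/AlignmentFree
This software was written by Tang et al. and is reported to be roughly 40 times faster than CAFE, so I installed it to try it out.
The installation steps below are for Ubuntu 16.04:
(1) AlignmentFree is written in Python 3 and requires three Python 3 packages: numpy, scipy, and scikit-learn. Installing them through Anaconda3 is recommended. If Anaconda3 is not installed yet, install it first; if it is already installed, skip this part.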
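First download Anaconda3. At the time of this installation the latest version was Anaconda3-5.3.0-Linux-x86_64.sh:
wget https://repo.anaconda.com/archive/Anaconda3-5.3.0-Linux-x86_64.sh
When the download finishes, change to the directory that contains Anaconda3-5.3.0-Linux-x86_64.sh and run the installer:
sudo bash Anaconda3-5.3.0-Linux-x86_64.sh
Follow the prompts, pressing Enter to continue (read and accept the license agreement, choose the installation path, and wait for the installation to finish).
Note that at the end the installer asks whether to add the Anaconda installation path to the PATH environment variable; answer yes.
If the path was not added successfully, add it by hand. Put the following export line in ~/.bashrc, adjusting the path to the installation path you chose, then reload the file:
export PATH="/home/anaconda3/bin:$PATH"
source ~/.bashrc
Finally, run python to check the installation. If the version banner that python prints mentions Anaconda, the installation succeeded.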
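The final check can also be made from inside the interpreter. The following is a minimal sketch and not part of the Anaconda tooling: the helper name looks_like_anaconda is made up for illustration, and it assumes that an Anaconda interpreter shows "conda" somewhere in its executable path or version banner, which is common but not guaranteed for every release.

```python
import sys


def looks_like_anaconda(executable: str, version: str) -> bool:
    """Heuristic: True if the interpreter path or version banner mentions conda.

    Hypothetical helper for illustration only; some Anaconda builds do not
    put the word "Anaconda" in the version banner.
    """
    text = f"{executable} {version}".lower()
    return "conda" in text


if __name__ == "__main__":
    print("executable:", sys.executable)
    print("version:   ", sys.version.replace("\n", " "))
    print("looks like Anaconda:", looks_like_anaconda(sys.executable, sys.version))
```

If the executable path points somewhere like /usr/bin/python3, the shell is still picking up the system Python and PATH needs another look.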
(2) Use conda to install the three required packages:
conda install numpy scipy scikit-learn
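To confirm that the three packages are visible to the interpreter you will run AlignmentFree with, a short check like the one below can help. It is a sketch using only the standard library; the function name missing_packages is hypothetical. Note that scikit-learn is imported under the module name sklearn.

```python
import importlib.util

# Package name (as installed) -> module name (as imported).
# scikit-learn is imported as "sklearn".
REQUIRED = {"numpy": "numpy", "scipy": "scipy", "scikit-learn": "sklearn"}


def missing_packages(required=REQUIRED):
    """Return the package names whose import module cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("numpy, scipy and scikit-learn are all importable.")
```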
(3) Copy the AlignmentFree source code into your own directory:
git clone https://github.com/GeniusTang/AlignmentFree.git
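This step uses the git command. If git is not installed, install it with apt install git (run it with sudo if you are not root).
(4) Install the program
The previous step automatically creates a folder named AlignmentFree; change into that directory: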
cd ./AlignmentFree
Then build and install:
CC=g++ python setup.py install --install-platlib=./src/
The program is used as follows:
usage: alignmentfree.py [-h] [-a METHOD] -k K [-m M] [-f FILENAME] [-s SEQUENCE_FILE] [-f1 FILENAME1] [-f2 FILENAME2] [-s1 SEQUENCE_FILE_1] [-s2 SEQUENCE_FILE_2] [-d DIR] [-o OUTPUT] [-t THREADS] [-r] [--BIC]
The parameters have the following meanings:
-h, --help            show this help message and exit
-a METHOD             A list of alignment-free method, separated by comma: d2star,d2shepp,CVtree,Ma,Eu,d2
-k K                  Kmer length
-m M                  Markovian Order, required for d2star, d2shepp and CVtree
-f FILENAME           A file that lists the paths of all samples, cannot be used together with -f1, -f2, -s, -s1, -s2
-s SEQUENCE_FILE      A fasta file that lists the sequences of all samples, cannot be used together with -f, -f1, -f2, -s1, -s2
-f1 FILENAME1         A file that lists the paths of the first group of samples, must be used together with -f2, cannot be used together with -f, -s, -s1, -s2
-f2 FILENAME2         A file that lists the paths of the second group of samples, must be used together with -f1, cannot be used together with -f, -s, -s1, -s2
-s1 SEQUENCE_FILE_1   A fasta file that lists the sequences of the first group of samples, must be used together with -s2, cannot be used together with -f, -f1, -f2, -s
-s2 SEQUENCE_FILE_2   A fasta file that lists the sequences of the second group of samples, must be used together with -s1, cannot be used together with -f, -f1, -f2, -s
-d DIR                A directory that saves kmer count
-o OUTPUT             Prefix of output (default: Current directory)
-t THREADS            Number of threads
-r                    Count the reverse complement of kmers (default: False)
--BIC                 Use BIC to estimate the Markovian orders of sequences
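To see how the options fit together, here is a minimal sketch that assembles a command line from the rules in the help text (-k is always required, and -m is required for d2star, d2shepp and CVtree). The helper name build_command and the file name samples.txt are hypothetical, and the script is assumed to be started as python alignmentfree.py from the cloned directory, which the usage line suggests but does not state. Only the -f input mode is covered; --BIC and the paired -f1/-f2 and -s1/-s2 modes are left out.

```python
import shlex

# Methods that, per the help text, need a Markovian order (-m).
NEEDS_MARKOV_ORDER = {"d2star", "d2shepp", "CVtree"}
KNOWN_METHODS = NEEDS_MARKOV_ORDER | {"Ma", "Eu", "d2"}


def build_command(k, methods, list_file, m=None, output=None,
                  threads=None, reverse_complement=False):
    """Assemble an argument list for alignmentfree.py using the -f input mode."""
    unknown = set(methods) - KNOWN_METHODS
    if unknown:
        raise ValueError(f"unknown method(s): {sorted(unknown)}")
    if m is None and NEEDS_MARKOV_ORDER & set(methods):
        raise ValueError("-m is required for d2star, d2shepp and CVtree")

    # Assumed invocation: the script is run with python from the repo directory.
    cmd = ["python", "alignmentfree.py",
           "-a", ",".join(methods), "-k", str(k), "-f", list_file]
    if m is not None:
        cmd += ["-m", str(m)]
    if output is not None:
        cmd += ["-o", output]
    if threads is not None:
        cmd += ["-t", str(threads)]
    if reverse_complement:
        cmd.append("-r")
    return cmd


if __name__ == "__main__":
    # samples.txt is a hypothetical file listing the paths of all samples.
    cmd = build_command(10, ["d2star", "Eu"], "samples.txt", m=2, threads=4)
    print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be passed directly to subprocess.run once the paths match your own setup.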
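One thing to note is that the program needs a fairly large amount of memory: with k=10, it ran without errors only after I allocated 5 GB of memory to the virtual machine. It is, however, really fast.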
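One reason memory use climbs quickly with k is that the number of possible DNA k-mers is 4^k. The sketch below computes that count and the size of a single dense count vector, assuming 8 bytes per entry. That layout is an assumption for illustration and not a description of how AlignmentFree stores its counts, so real usage (several samples, Markov-model tables, multiple threads) can be much higher than these figures.

```python
def kmer_space(k: int) -> int:
    """Number of distinct k-mers over the 4-letter DNA alphabet."""
    return 4 ** k


def dense_vector_mib(k: int, bytes_per_entry: int = 8) -> float:
    """Size in MiB of one dense count vector with one entry per possible k-mer.

    bytes_per_entry=8 is an assumed storage width, not taken from AlignmentFree.
    """
    return kmer_space(k) * bytes_per_entry / 2 ** 20


if __name__ == "__main__":
    for k in (6, 8, 10, 12):
        print(f"k={k:2d}  k-mers={kmer_space(k):>10,d}  "
              f"one dense vector ~ {dense_vector_mib(k):,.2f} MiB")
```

Each increase of k by one multiplies the k-mer space by four, so a run that fits at k=10 may not fit at k=12 on the same machine.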