lucheng918's personal blog http://blog.sciencenet.cn/u/lucheng918

libsvm

Posted 2014-3-31 11:37 | Category: data mining | System category: research notes

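LibSVM has two parameters that the user must specify: c and gamma. In practice, the user gives LibSVM a range for each of c and gamma, and LibSVM uses cross-validation accuracy to pick the c and gamma that classify best.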

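As an example of cross-validation, suppose the training set is split into three groups. Train a model on groups 1 and 2 and predict group 3 to get an accuracy; then train on groups 2 and 3 and predict group 1; finally train on groups 1 and 3 and predict group 2. The accuracies of the three rounds are averaged for each candidate pair of c and gamma, and the pair with the highest cross-validation accuracy is chosen.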
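The three-group procedure above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code inside LibSVM: the function names are made up for this post, the groups are formed round-robin (an assumption), and `fit_and_score` stands for whatever trains a model on the training indices and returns the accuracy on the held-out indices.

```python
def k_fold_splits(n_samples, k=3):
    """Yield (train_idx, test_idx); each group is held out exactly once."""
    # Deal the samples into k groups round-robin (an assumption of this sketch).
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    for held_out in range(k):
        test_idx = folds[held_out]
        train_idx = [i for f in range(k) if f != held_out for i in folds[f]]
        yield train_idx, test_idx


def cross_val_accuracy(fit_and_score, n_samples, k=3):
    """Average the accuracy of the k train/predict rounds."""
    scores = [fit_and_score(train, test)
              for train, test in k_fold_splits(n_samples, k)]
    return sum(scores) / len(scores)


def pick_best(candidates, cv_accuracy):
    """Return (best_accuracy, best_candidate); the first candidate wins ties."""
    best_acc, best_params = None, None
    for params in candidates:
        acc = cv_accuracy(params)
        if best_acc is None or acc > best_acc:
            best_acc, best_params = acc, params
    return best_acc, best_params
```

Here `candidates` would be the list of (c, gamma) pairs to try, and `cv_accuracy` a function that returns the averaged accuracy for one pair.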

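Test procedure: the libsvm folder contains a test file, heart_scale.

(1) Install Python 2.7.6.

(2) Download libsvm, go to the tools folder, and edit grid.py.

(3) Change the gnuplot path:  gnuplot_exe = r"c:\tmp\gnuplot\bin\pgnuplot.exe"

(4) Open a command line and run:  python grid.py ../heart_scale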


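When the run finishes, it reports the c and gamma with the highest cross-validation accuracy, that is, the parameters at the highest point of the contour plot.

However, further tuning is still needed, for example by searching a finer grid around that point: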

grid.py -log2c 9.25,12.75,0.25 -log2g -11.25,-14.75,-0.25 ../heart_scale
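To see how many parameter pairs such a command asks for, here is a small Python sketch that expands the two "begin,end,step" triples. It assumes the end point is included in the range, which is my understanding of how grid.py treats it; `grid_values` is a name made up for this illustration.

```python
from itertools import product


def grid_values(begin, end, step):
    """Expand a 'begin,end,step' triple into a list, end point included."""
    if step == 0:
        raise ValueError("step must be non-zero")
    values = []
    value = begin
    # Walk up (positive step) or down (negative step) until we pass the end.
    while (step > 0 and value <= end) or (step < 0 and value >= end):
        values.append(value)
        value += step
    return values


log2c = grid_values(9.25, 12.75, 0.25)
log2g = grid_values(-11.25, -14.75, -0.25)
# The grid is searched in log2 space; the actual parameters are powers of two.
candidates = [(2.0 ** lc, 2.0 ** lg) for lc, lg in product(log2c, log2g)]
print(len(log2c), len(log2g), len(candidates))  # 15 15 225
```

Under that assumption the grid holds 15 x 15 = 225 pairs, and one of them is c = 2^11 = 2048 with gamma = 2^-12 = 0.000244140625.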


libsvm/windows

Command to generate a model: svm-train -c 2048 -g 0.000244140625 ../heart_scale  (this produces a prediction model, heart_scale.model)

Command to predict: svm-predict ../heart_scale heart_scale.model heart_scale.out
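If the two commands need to be run from a script, they can be assembled in Python. The sketch below is written for Python 3.7 or later (not the 2.7.6 mentioned earlier) and assumes the svm-train and svm-predict executables are on the PATH; the helper names are made up for this illustration.

```python
import subprocess


def train_command(data_path, c, gamma, svm_train="svm-train"):
    """Build the argument list for: svm-train -c C -g GAMMA DATA."""
    return [svm_train, "-c", str(c), "-g", str(gamma), data_path]


def predict_command(test_path, model_path, output_path, svm_predict="svm-predict"):
    """Build the argument list for: svm-predict TEST MODEL OUTPUT."""
    return [svm_predict, test_path, model_path, output_path]


def run(command):
    """Run one command and return its standard output; raises if it fails."""
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout
```

For example, `run(train_command("../heart_scale", 2048, 0.000244140625))` followed by `run(predict_command("../heart_scale", "heart_scale.model", "heart_scale.out"))` would repeat the two commands above.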


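svm-scale: normalizes the data; with lower bound 0 and upper bound 1, each value v becomes (v-min)/(max-min). The -s option saves the ranges found on the training set, and -r reuses them for the test set.

svm-scale -l 0 -u 1 -s german.range german.train > german.train.scale

svm-scale -l 0 -u 1 -r german.range german.test > german.test.scale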
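The point of saving the ranges and reusing them is that the test set must be scaled with the training set's min and max, not its own. A minimal Python sketch of that idea follows. It works on dense rows of numbers rather than the sparse LibSVM file format, and mapping a constant column to the lower bound is a choice made for this sketch, not necessarily what svm-scale does.

```python
def fit_ranges(rows):
    """Return the (min, max) of every column, computed on the training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]


def scale_rows(rows, ranges, lower=0.0, upper=1.0):
    """Scale each value with the given ranges: lower + (upper-lower)*(v-min)/(max-min)."""
    scaled = []
    for row in rows:
        new_row = []
        for value, (lo, hi) in zip(row, ranges):
            if hi == lo:
                # Constant column: nothing to scale (sketch choice).
                new_row.append(lower)
            else:
                new_row.append(lower + (upper - lower) * (value - lo) / (hi - lo))
        scaled.append(new_row)
    return scaled


train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test = [[4.0, 0.0]]
ranges = fit_ranges(train)               # plays the role of german.range
train_scaled = scale_rows(train, ranges)
test_scaled = scale_rows(test, ranges)   # reuses the training ranges
```

Test values outside the training range end up outside [0, 1], which is expected when the training ranges are reused.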


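-b 1 is an SVM option that enables probability estimates (in the author's experience, predictions with probability estimates were better than those without)

grid.py -b 1 -v 10 ../heart_scale

svm-train -b 1 -c <c> -g 0.03125 ../heart_scale  (<c> is the value of c reported by grid.py)

svm-predict -b 1 ../heart_scale heart_scale.model heart_scale.out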



options:
-s svm_type : set type of SVM (default 0)
    0 -- C-SVC
    1 -- nu-SVC
    2 -- one-class SVM
    3 -- epsilon-SVR
    4 -- nu-SVR
-t kernel_type : set type of kernel function (default 2)
    0 -- linear: u'*v
    1 -- polynomial: (gamma*u'*v + coef0)^degree
    2 -- radial basis function: exp(-gamma*|u-v|^2)
    3 -- sigmoid: tanh(gamma*u'*v + coef0)
-d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/num_features)
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
-m cachesize : set cache memory size in MB (default 100)
-e epsilon : set tolerance of termination criterion (default 0.001)
-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)

The num_features in the -g option means the number of attributes in the input data.
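To make the four kernel formulas of the -t option concrete, here they are written out in plain Python for two dense vectors u and v. This is only a sketch of the formulas as listed above, not LibSVM's implementation; the function names are made up for this illustration.

```python
import math


def dot(u, v):
    """Inner product u'*v of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    # -t 0: u'*v
    return dot(u, v)


def polynomial_kernel(u, v, gamma, coef0=0.0, degree=3):
    # -t 1: (gamma*u'*v + coef0)^degree
    return (gamma * dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma):
    # -t 2: exp(-gamma*|u-v|^2)
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(u, v, gamma, coef0=0.0):
    # -t 3: tanh(gamma*u'*v + coef0)
    return math.tanh(gamma * dot(u, v) + coef0)


def default_gamma(num_features):
    # Default of the -g option: 1/num_features
    return 1.0 / num_features
```

With the default radial basis function kernel, gamma controls how quickly the kernel value drops as two points move apart, which is why it is tuned together with c.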

To install this tool, please read the README file in the package. There are Windows, X, and Java versions in the package.



References:

http://www.csie.ntu.edu.tw/~cjlin/libsvm/

http://www.cnblogs.com/zhangchaoyang/articles/2189606.html

http://baike.baidu.com/link?url=TnWbAaLRe3McQljIQAflAkqyX4ZNZ_B_6X2RoYvJsQrNFKTP7Ts1oVa7xXcoGbxT




https://wap.sciencenet.cn/blog-780964-780705.html
