
libsvm

Read 4339 times | 2014-3-31 11:37 | Personal category: data mining | System category: research notes

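LIBSVM has two parameters that the user must specify (with the default C-SVC and RBF kernel): c and gamma. In practice the user supplies a range for c and a range for gamma, and LIBSVM's grid.py tool uses cross-validation accuracy to find the c and gamma that classify best.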

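To illustrate cross-validation: suppose the training set is split into three groups. Train a model on groups 1 and 2 and predict group 3 to get an accuracy; then train on 2 and 3 and predict 1; finally train on 1 and 3 and predict 2. The three rounds together give the cross-validation accuracy of one (c, gamma) pair. This is repeated for every pair in the range, and the pair with the highest cross-validation accuracy is chosen.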
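The three-group procedure can be sketched in a few lines of Python. This is only an illustration of the idea, not LIBSVM's own implementation: the names k_fold_splits, cross_validation_accuracy, and train_and_predict are made up for this sketch, and assigning every k-th sample to the same fold is an assumption chosen for simplicity.

```python
def k_fold_splits(n_samples, k=3):
    """Yield (train_indices, test_indices) once for each of the k folds."""
    indices = list(range(n_samples))
    # Fold i holds every k-th sample starting at offset i (sketch assumption).
    folds = [indices[i::k] for i in range(k)]
    for i, test_fold in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield sorted(train), test_fold


def cross_validation_accuracy(labels, train_and_predict, k=3):
    """Fraction of samples predicted correctly while they were held out.

    train_and_predict(train_idx, test_idx) is a hypothetical callback that
    trains on train_idx and returns one predicted label per index in test_idx.
    """
    correct = 0
    for train_idx, test_idx in k_fold_splits(len(labels), k):
        predictions = train_and_predict(train_idx, test_idx)
        correct += sum(1 for i, p in zip(test_idx, predictions) if labels[i] == p)
    return correct / len(labels)


# Toy usage: a "classifier" that ignores the training data and always predicts +1.
toy_labels = [1, 1, 1, -1, -1, 1]
always_positive = lambda train_idx, test_idx: [1] * len(test_idx)
# 4 of the 6 labels are +1, so this prints the value of 4/6.
print(cross_validation_accuracy(toy_labels, always_positive, k=3))
```

In a parameter search, this accuracy would be computed once per (c, gamma) pair, with train_and_predict training an SVM using that pair.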

Test procedure: the libsvm folder contains a test file named heart_scale.

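(1) Install Python 2.7.6.

(2) Download libsvm, go to the tools folder, and edit grid.py.

(3) In grid.py, change the gnuplot path to match your installation: gnuplot_exe = r"c:\tmp\gnuplot\bin\pgnuplot.exe"

(4) Open a command line and run: python grid.py ../heart_scale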

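When the run finishes, it reports the c and gamma with the highest cross-validation accuracy, that is, the parameters at the highest point of the contour plot.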

However, the parameters still need further tuning, for example with a finer grid around the best point:

grid.py -log2c 9.25,12.75,0.25 -log2g -11.25,-14.75,-0.25 ../heart_scale
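To see what this finer search covers, the sketch below enumerates the grid, reading each option as begin,end,step in log2 units with the end point included. The helper name frange is made up for this illustration; it is not part of grid.py.

```python
def frange(begin, end, step):
    """Return begin, begin+step, ... up to and including end (step may be negative)."""
    values = []
    n = 0
    while True:
        v = begin + n * step
        if (step > 0 and v > end) or (step < 0 and v < end):
            break
        values.append(v)
        n += 1
    return values


log2c_values = frange(9.25, 12.75, 0.25)      # from -log2c 9.25,12.75,0.25
log2g_values = frange(-11.25, -14.75, -0.25)  # from -log2g -11.25,-14.75,-0.25

# Every (c, gamma) pair that would be cross-validated.
grid = [(2.0 ** lc, 2.0 ** lg) for lc in log2c_values for lg in log2g_values]

print(len(log2c_values), len(log2g_values), len(grid))  # 15 15 225
print((2048.0, 0.000244140625) in grid)                 # True: 2^11 and 2^-12
```

So the finer grid evaluates 225 pairs, and the pair c = 2048, gamma = 0.000244140625 corresponds to log2 c = 11 and log2 gamma = -12 on that grid.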

libsvm/windows

Command to generate a model: svm-train -c 2048 -g 0.000244140625 ../heart_scale. This produces a prediction model, heart_scale.model.

Command to predict: svm-predict ../heart_scale heart_scale.model heart_scale.out

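svm-scale: normalizes the data, mapping each feature value v to (v - min)/(max - min), where min and max are that feature's minimum and maximum. The -s option saves the ranges learned from the training data, and -r applies the saved ranges to the test data: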

svm-scale -l 0 -u 1 -s german.range german.train >german.train.scale

svm-scale -l 0 -u 1 -r german.range german.test > german.test.scale
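The reason for saving the ranges from the training file and reusing them on the test file can be shown with a small sketch of the (v - min)/(max - min) formula, generalized to a target interval [lower, upper]. The function names fit_ranges and scale are invented for this illustration, and mapping a constant feature to lower is a choice made only in this sketch, not a description of what svm-scale does.

```python
def fit_ranges(rows):
    """Per-feature (min, max), learned from the training rows only."""
    return [(min(col), max(col)) for col in zip(*rows)]


def scale(rows, ranges, lower=0.0, upper=1.0):
    """Map each value v to lower + (upper - lower) * (v - min) / (max - min)."""
    scaled = []
    for row in rows:
        new_row = []
        for v, (lo, hi) in zip(row, ranges):
            if hi == lo:
                # Constant feature: this sketch simply maps it to `lower`.
                new_row.append(lower)
            else:
                new_row.append(lower + (upper - lower) * (v - lo) / (hi - lo))
        scaled.append(new_row)
    return scaled


train = [[1.0, 10.0], [3.0, 20.0], [5.0, 40.0]]
test = [[7.0, 25.0]]

ranges = fit_ranges(train)   # plays the role of the saved range file
print(scale(train, ranges))  # training values land in [0, 1]
print(scale(test, ranges))   # [[1.5, 0.5]]: test values may fall outside [0, 1]
```

Scaling the test data with its own minimum and maximum would put the two files on different scales, which is why the training ranges are reused.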

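-b 1 is an SVM option that enables probability estimates (predictions made with probability estimates gave better results than those without).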

grid.py -b 1 -v 10 ../heart_scale

svm-train -b 1 -c C -g 0.03125 ../heart_scale (replace C with the c value reported by grid.py)

svm-predict -b 1 ../heart_scale heart_scale.model heart_scale.out
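With -b 1, the prediction output file carries probabilities as well as labels. The sketch below reads such a file, assuming the layout is a first line that starts with "labels" and lists the class labels, followed by one line per sample holding the predicted label and one probability per class. That layout is an assumption to verify against your own output file; the function name parse_probability_output and the sample text are made up for illustration.

```python
def parse_probability_output(text):
    """Return a list of (predicted_label, {class_label: probability}) tuples."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    if not header or header[0] != "labels":
        raise ValueError("expected a header line starting with 'labels'")
    class_labels = header[1:]
    results = []
    for fields in rows:
        predicted = fields[0]
        probabilities = dict(zip(class_labels, (float(x) for x in fields[1:])))
        results.append((predicted, probabilities))
    return results


# Made-up sample in the assumed layout, not real LIBSVM output.
sample = """labels 1 -1
1 0.9 0.1
-1 0.25 0.75
"""

for predicted, probabilities in parse_probability_output(sample):
    print(predicted, probabilities)
```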

options:
-s svm_type : set type of SVM (default 0)
    0 -- C-SVC
    1 -- nu-SVC
    2 -- one-class SVM
    3 -- epsilon-SVR
    4 -- nu-SVR
-t kernel_type : set type of kernel function (default 2)
    0 -- linear: u'*v
    1 -- polynomial: (gamma*u'*v + coef0)^degree
    2 -- radial basis function: exp(-gamma*|u-v|^2)
    3 -- sigmoid: tanh(gamma*u'*v + coef0)
-d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/num_features)
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
-m cachesize : set cache memory size in MB (default 100)
-e epsilon : set tolerance of termination criterion (default 0.001)
-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)

num_features in the -g option means the number of attributes in the input data.
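The four kernel formulas listed under -t can be written out directly, which makes the roles of gamma, coef0, and degree easier to see. This is a plain-Python sketch of the formulas as stated in the option list, not LIBSVM's code; the function names are chosen for this example only.

```python
import math


def dot(u, v):
    """Inner product u'*v of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, gamma, coef0=0.0, degree=3):
    return (gamma * dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma):
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(u, v, gamma, coef0=0.0):
    return math.tanh(gamma * dot(u, v) + coef0)


u, v = [1.0, 2.0], [3.0, 4.0]
gamma = 1.0 / len(u)  # the default from the option list: 1/num_features

print(linear_kernel(u, v))             # 11.0
print(polynomial_kernel(u, v, gamma))  # (0.5 * 11)^3 = 166.375
print(rbf_kernel(u, v, gamma))         # exp(-0.5 * 8) = exp(-4)
print(rbf_kernel(u, u, gamma))         # 1.0 for identical points
```

A larger gamma makes the RBF value fall off faster with distance, which is why gamma is searched together with c.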

To install this tool, please read the README file in the package. There are Windows, X, and Java versions in the package.

References:

http://www.csie.ntu.edu.tw/~cjlin/libsvm/

http://www.cnblogs.com/zhangchaoyang/articles/2189606.html

http://baike.baidu.com/link?url=TnWbAaLRe3McQljIQAflAkqyX4ZNZ_B_6X2RoYvJsQrNFKTP7Ts1oVa7xXcoGbxT

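To repost this article, please contact the original author for permission, and state that it comes from 吕璐成's ScienceNet (科学网) blog.
Link: https://wap.sciencenet.cn/blog-780964-780705.html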
