yuanhui80's personal blog http://blog.sciencenet.cn/u/yuanhui80


Molecular dynamics simulation with GPU-accelerated Gromacs 4.5.1

12717 views 2011-6-4 11:56 | Category: research notes | Tags: parallel, gromacs, molecular dynamics simulation

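Rapid advances in computer hardware, together with steady improvements in algorithms and software, have made it possible to simulate fairly large biomolecular systems and multimeric assemblies on an ordinary desktop computer, and this trend will only become more pronounced. In particular, multi-core CPUs and better large-scale parallelization of molecular dynamics codes allow biologists to run targeted simulations of the proteins they care about on their own PCs and to combine those simulations closely with wet-lab experiments, which is both interesting and fun.
Even so, many biomolecular systems are very large. Take one of my own: the adenovirus hexon trimer. I want to run temperature-controlled molecular dynamics simulations on it to analyze, dynamically, its overall epitope composition, its thermal denaturation mechanism, and how thermal denaturation affects immunogenicity. The system contains about 940*3 = 2820 amino acid residues; with a cubic water box added, the whole system comes to about 200,000 atoms. On a quad-core QX9650 CPU under 64-bit Linux, Gromacs needs about 27 days for every 10 ns of simulation. That is very slow, and for a planned total of about 500 ns it is simply not tolerable.
GPU-accelerated Gromacs brought very encouraging news: according to the official site, Nvidia's CUDA technology can speed up MD simulation by more than ten times relative to a single CPU. Below is the full record of my molecular dynamics runs on an Nvidia GTX460 2G, shared here for everyone.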
 
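Day 1:
Nvidia GTX460 with 2 GB of video memory
According to the instructions on the gmx website (www.gromacs.org/gpu), this card can simulate around 200,000 atoms; my system happens to have 190,000.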

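Hardware: Dell T3400 workstation, X38 motherboard, CPU: QX9650, 4 GB ECC RAM, GTX460 with 2 GB video memory
Software: Ubuntu 9.10 64-bit, CUDA 3.1, OpenMM 2.0, FFTW 3.2.2, CMake, Gromacs 4.5.1
Following the official instructions, I installed the CPU version of Gromacs 4.5.1 separately (built with CMake), then downloaded the precompiled mdrun-gpu beta2 for Ubuntu 9.10 64-bit.
I set the environment variables and ran it.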

But on running it, I got:
Fatal error:
reading tpx file (md.tpr) version 73 with version 71 program
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
Roughly speaking, this means that the precompiled mdrun-gpu cannot run a tpr file generated by the grompp program from version 4.5.1.
 
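Day 2:
Building from source solved the problem above. The precompiled mdrun-gpu and the programs in 4.5.1 carry different version numbers, hence the incompatibility.
Following the instructions, I built the 4.5.1 version of mdrun-gpu without trouble: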
——————————————————————————

export OPENMM_ROOT_DIR=path_to_custom_openmm_installation
cmake -DGMX_OPENMM=ON [-DCMAKE_INSTALL_PREFIX=desired_install_path]
make mdrun
make install-mdrun
——————————————————————————

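But then a new problem appeared.
Running it gave this error:
mdrun-gpu: error while loading shared libraries: libopenmm_api_wrapper.so: cannot open shared object file: No such file or directory
Very strange!
The environment variables were set correctly,
yet libopenmm_api_wrapper.so was nowhere to be found under the openmm directory.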
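
One way to narrow down an error like this is to check whether the file exists in any directory the dynamic loader is likely to search. The sketch below is a minimal Python helper and is not part of Gromacs or OpenMM: the function name `find_shared_library` and the example install prefix are assumptions for illustration. It only looks in the directories listed in LD_LIBRARY_PATH plus any extra ones passed in, and it does not consult the ldconfig cache.

```python
import os


def find_shared_library(name, extra_dirs=(), env=None):
    """Return every path where `name` exists in LD_LIBRARY_PATH or extra_dirs.

    Hypothetical helper for illustration; it does not consult ld.so.cache.
    """
    env = os.environ if env is None else env
    # LD_LIBRARY_PATH is a list of directories; empty entries are skipped.
    search_dirs = [d for d in env.get("LD_LIBRARY_PATH", "").split(os.pathsep) if d]
    search_dirs.extend(extra_dirs)

    hits = []
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and candidate not in hits:
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    # The directory below is a placeholder; substitute your own install prefixes.
    found = find_shared_library("libopenmm_api_wrapper.so",
                                extra_dirs=["/usr/local/gromacs/lib"])
    print(found if found else "not found in the searched directories")
```

If the file turns up in a directory that is not on LD_LIBRARY_PATH, adding that directory is the obvious next thing to try; if it turns up nowhere, the library was probably never built or installed.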

 
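Day 3:
I switched the operating system to RHEL 5.5.
Using the same installation procedure, the problem above went away.
I do not understand why, though I expect there is a proper fix (leaving that aside for now)!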
 
Day 4:
A summary of the problems I ran into and how I dealt with them:
1. Version mismatch
——————————————————————————
Fatal error:
reading tpx file (md.tpr) version 73 with version 71 program
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
This means the versions are incompatible.

2. OpenMM does not support multiple temperature coupling groups
——————————————————————————
Fatal error:
OpenMM does not support multiple temperature coupling groups.
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors

3. The mdp settings I used before cannot be carried over unchanged
——————————————————————————
Fatal error:
OpenMM uses a single cutoff for both Coulomb and VdW interactions. Please set rcoulomb equal to rvdw.
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors

4. GPU-accelerated gmx currently supports only the amber and charmm force fields
——————————————————————————
Fatal error:
The combination rules of the used force-field do not match the one supported by OpenMM: sigma_ij = (sigma_i + sigma_j)/2, eps_ij = sqrt(eps_i * eps_j). Switch to a force-field that uses these rules in order to simulate this system using OpenMM.

5. GPU-accelerated gmx does not support the G96 interaction types; in practice this is again a force field issue
——————————————————————————
Fatal error:
OpenMM does not support (some) of the provided interaction type(s) (G96Angle)
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors

6. Building gromacs 4.5.1 with cmake on Ubuntu 9.10 runs into a missing libopenmm_api_wrapper.so; switching to RHEL 5.5 resolves it
——————————————————————————
error while loading shared libraries: libopenmm_api_wrapper.so: cannot open shared object file: No such file or directory
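
Two of the restrictions listed above (items 2 and 3) can be checked in the mdp file before launching mdrun-gpu. The following is a minimal Python sketch of such a pre-flight check. The function names `parse_mdp` and `openmm_problems` are hypothetical, the parser handles only simple `key = value` lines with `;` comments, and it checks only values that are written explicitly in the file, not Gromacs defaults.

```python
def parse_mdp(text):
    """Parse 'key = value' lines of an mdp file into a dict (';' starts a comment)."""
    params = {}
    for line in text.splitlines():
        line = line.split(";", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Normalize '-' to '_' so that tc-grps and tc_grps map to the same key.
        params[key.strip().replace("-", "_")] = value.strip()
    return params


def openmm_problems(params):
    """Return the settings that would trigger the errors quoted in items 2 and 3."""
    problems = []
    if len(params.get("tc_grps", "").split()) > 1:
        problems.append("multiple temperature coupling groups")
    rcoulomb, rvdw = params.get("rcoulomb"), params.get("rvdw")
    if rcoulomb is not None and rvdw is not None and float(rcoulomb) != float(rvdw):
        problems.append("rcoulomb must equal rvdw")
    return problems


# Made-up mdp fragment for illustration, not the author's actual input.
example = """
integrator = md
tc-grps    = Protein SOL   ; two coupling groups
rcoulomb   = 0.9
rvdw       = 1.4
"""
print(openmm_problems(parse_mdp(example)))
```

The force field restrictions in items 4 and 5 live in the topology rather than the mdp file, so they are outside the scope of this sketch.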
 
Day 5:
mdrun-gpu is finally running. The mdp file is the one from the bench set provided on the official site, but there are still some warnings:

It is also possible to optimize the transforms for the current problem by performing some
calculations at the start of the run. This is not done by default since it takes a couple of minutes,
but for large runs it will save time. Turn it on by specifying
optimize_fft = yes


WARNING: OpenMM does not support leap-frog, will use velocity-verlet integrator.

WARNING: OpenMM supports only Andersen thermostat with the md/md-vv/md-vv-avek integrators.


Pre-simulation ~15s memtest in progress...done, no errors detected
starting mdrun 'Protein in water'
1000000 steps, 2000.0 ps.

               NODE (s)   Real (s)      (%)
       Time:     33.080     99.577     33.2
               (Mnbf/s)   (MFlops)   (ns/day)  (hour/ns)
Performance:      0.000      0.074     47.609      0.504

gcq#330: "Go back to the rock from under which you came" (Fiona Apple)
————————————————————————————————
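I do not fully understand this final Performance block. Judging by (ns/day) alone, the GPU is five times as fast as the quad-core CPU, but in actual running it is only about 2 times as fast.
The (MFlops) entry is, surprisingly, 0.074,
while the CPU reaches about 12 GFlops.

Overall, GPU-accelerated GMX delivers at least 2 times the performance of a conventional quad-core CPU.
For the implicit solvent model the official site claims more than 10 times. To be continued...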

 
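Day 6:
For the 190,000-atom system I set up a 10 ns run. The performance readout says 5 ns/day, so in theory it should finish in two days.
In reality it will not finish until the 28th (it started at 1 pm on October 18), which clearly does not match the reported performance.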

5000000 steps, 10000.0 ps.
step 417300, will finish Thu Oct 28 10:39:59 2010 / started October 18, shown as finishing October 28

Received the TERM signal, stopping at the next step

step 417378, will finish Thu Oct 28 10:39:46 2010
Post-simulation ~15s memtest in progress...done, no errors detected

               NODE (s)   Real (s)      (%)
       Time:  13633.960  71173.931     19.2
                       3h47:13
               (Mnbf/s)   (MFlops)   (ns/day)  (hour/ns)
Performance:      0.000      0.003      5.290      4.537

gcq#47: "I Am Testing Your Grey Matter" (Red Hot Chili Peppers)
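
The gap between the reported 5.290 ns/day and the projected finish date can be examined with a little arithmetic on the numbers in this log. The sketch below is a Python calculation under two assumptions: that the time step is 2 fs (inferred from "5000000 steps, 10000.0 ps") and that 417378 steps had completed when the run was stopped. The helper name `ns_per_day` is made up for illustration.

```python
SECONDS_PER_DAY = 86400.0


def ns_per_day(steps, dt_ps, elapsed_s):
    """Simulated nanoseconds per day, given steps completed in elapsed_s seconds."""
    simulated_ns = steps * dt_ps / 1000.0
    return simulated_ns / (elapsed_s / SECONDS_PER_DAY)


# Values copied from the log above; dt follows from 5000000 steps = 10000.0 ps.
dt_ps = 10000.0 / 5000000
steps_done = 417378
node_s, real_s = 13633.960, 71173.931

by_node = ns_per_day(steps_done, dt_ps, node_s)
by_real = ns_per_day(steps_done, dt_ps, real_s)
print(f"ns/day from NODE time: {by_node:.3f}")
print(f"ns/day from Real time: {by_real:.3f}")
print(f"days for 10 ns at the Real-time rate: {10.0 / by_real:.1f}")
```

Under these assumptions the NODE-time figure comes out at about 5.29 ns/day, matching the Performance line, while the Real (wall-clock) figure is only about 1 ns/day, which corresponds to roughly ten days for 10 ns. That suggests the reported ns/day in this run was derived from NODE time, which covers only 19.2% of the wall-clock time.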
 
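Day 7:
Same system, same settings; below is the performance on the quad-core QX9650 CPU:

Performance is lower than on the GTX460, but the run only takes about three days longer.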
——————————————————————————————
Back Off! I just backed up md.trr to ./#md.trr.1#

Back Off! I just backed up md.edr to ./#md.edr.1#

WARNING: This run will generate roughly 3924 Mb of data

starting mdrun 'Good gRace! Old Maple Actually Chews Slate in water'
5000000 steps, 10000.0 ps.
step 0
NOTE: Turning on dynamic load balancing

step 500, will finish Mon Nov 1 09:57:48 2010 vol 0.74 imb F 2% / started the morning of October 19, shown as finishing November 1

Received the TERM signal, stopping at the next NS step

step 550, will finish Mon Nov 1 10:08:34 2010
Average load imbalance: 2.4 %
Part of the total run time spent waiting due to load imbalance: 1.1 %
Steps where the load balancing was limited by -rdd, -rcon and/or -dds: Y 0 %


Parallel run - timing based on wallclock.

               NODE (s)   Real (s)      (%)
       Time:    123.856    123.856    100.0
                       2:03
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:    156.395     11.835      0.769     31.220

gcq#358: "Now it's filled with hundreds and hundreds of chemicals" (Midlake)
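
To put the CPU and GPU runs side by side, the sketch below converts both rates into wall-clock days. It is a rough Python estimate: the CPU rate is the 0.769 ns/day reported above, while the GPU rate is recomputed from the Day 6 log using Real time rather than NODE time, assuming a 2 fs time step. The helper name `days_needed` is hypothetical, and a steady rate over the whole run is assumed.

```python
def days_needed(total_ns, rate_ns_per_day):
    """Wall-clock days to simulate total_ns at a steady rate."""
    return total_ns / rate_ns_per_day


# CPU rate: the ns/day figure from the log above (timing based on wallclock).
cpu_rate = 0.769
# GPU rate: 417378 steps of 0.002 ps in 71173.931 s of wall-clock time (Day 6).
gpu_rate = (417378 * 0.002 / 1000.0) / (71173.931 / 86400.0)

for label, total_ns in (("10 ns run", 10.0), ("500 ns plan", 500.0)):
    cpu_days = days_needed(total_ns, cpu_rate)
    gpu_days = days_needed(total_ns, gpu_rate)
    print(f"{label}: CPU {cpu_days:.1f} d, GPU {gpu_days:.1f} d, "
          f"difference {cpu_days - gpu_days:.1f} d")
print(f"wall-clock speedup GPU/CPU: {gpu_rate / cpu_rate:.2f}x")
```

For the 10 ns run this gives a difference of about three days, consistent with the two projected finish dates in the logs.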
 
Day 8:
Running the bench from the official site:
The GTX460 scores 102 ns/day, not as far behind the C2050 as one might expect!

Pre-simulation ~15s memtest in progress...done, no errors detected
starting mdrun 'Protein'
-1 steps, infinite ps.
step 285000 performance: 102.1 ns/day

Received the TERM signal, stopping at the next step

step 285028 performance: 102.1 ns/day
Post-simulation ~15s memtest in progress...done, no errors detected

               NODE (s)   Real (s)      (%)
       Time:    481.290    482.224     99.8
                       8:01
               (Mnbf/s)   (MFlops)   (ns/day)  (hour/ns)
Performance:      0.000      0.002    102.335      0.235
 
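Summary: The new generation of GPU-accelerated Gromacs shows a promising future for GPUs in molecular dynamics, but it is not yet mature. The tests above indicate that in MD simulations with an implicit solvent model, GPU acceleration delivers at least 10 times the performance of a conventional quad-core CPU, whereas with an explicit solvent model the gain is far less pronounced. Most importantly, the current Gromacs 4.5.1 places many restrictions on GPU-accelerated MD: limited force field support, many unsupported features, poor reproducibility (compared with CPU simulations), and so on. Given a long enough simulation, though, it should still produce reasonably reproducible, statistically meaningful trajectories. For further information see www.gromacs.org/GPU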


https://wap.sciencenet.cn/blog-571741-451527.html
