姚程
Fluent: Comparing Parallel Core Counts
2022-11-1 23:18

Test platform: 2 × AMD EPYC 7742, Fluent 2020

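Case: mesh of roughly 120,000 cells, density-based solver, energy equation enabled, 161 iterations per run (the 20-core log below reports 330 iterations).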

Cores    Time per iteration [s/iter]
10       0.124
20       0.068
40       0.023
60       0.018
80       0.016

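More cores means less time per iteration: going from 10 to 80 cores cuts it from 0.124 s to 0.016 s. The gain flattens beyond 40 cores, where doubling to 80 cores only reduces the time from 0.023 s to 0.016 s.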
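To put numbers on the scaling, here is a minimal Python sketch that computes speedup and parallel efficiency from the timings above. It takes the 10-core run as the baseline, which is my own choice since no single-core run was measured; the function name `scaling` is only illustrative.

```python
# Per-iteration wall-clock times from the measurements above (cores -> s/iter).
timings = {10: 0.124, 20: 0.068, 40: 0.023, 60: 0.018, 80: 0.016}


def scaling(timings):
    """Return (cores, speedup, efficiency) relative to the smallest core count.

    Assumption: the smallest run (10 cores here) is used as the baseline,
    because no serial timing is available.
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    rows = []
    for cores in sorted(timings):
        speedup = base_time / timings[cores]
        # Ideal speedup equals the ratio of core counts.
        efficiency = speedup / (cores / base_cores)
        rows.append((cores, speedup, efficiency))
    return rows


for cores, speedup, efficiency in scaling(timings):
    print(f"{cores:3d} cores: speedup {speedup:5.2f}x, efficiency {efficiency:6.1%}")
```

With these numbers the 80-core run is 7.75 times faster than the 10-core run. The 40-core efficiency comes out above 100% relative to the baseline; the measurements do not explain this, and cache effects or run-to-run variation on such a small mesh are possible reasons (an assumption, not something I tested).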

However, if two or more Fluent instances run at the same time, they slow each other down.


Performance Timer for 161 iterations on 10 compute nodes
  Average wall-clock time per iteration:              0.124 sec
  Global reductions per iteration:                      101 ops
  Global reductions time per iteration:               0.000 sec (0.0%)
  Message count per iteration:                         2284 messages
  Data transfer per iteration:                        2.981 MB
  LE solves per iteration:                                3 solves
  LE wall-clock time per iteration:                   0.035 sec (28.5%)
  LE global solves per iteration:                         1 solves
  LE global wall-clock time per iteration:            0.000 sec (0.2%)
  LE global matrix maximum size:                        62
  AMG cycles per iteration:                           3.000 cycles
  Relaxation sweeps per iteration:                      226 sweeps
  Relaxation exchanges per iteration:                     0 exchanges
  LE early protections (stall) per iteration:           0.000 times
  LE early protections (divergence) per iteration:      0.000 times
  Total SVARS touched:                              369
  Time-step updates per iteration:                     0.31 updates
  Time-step wall-clock time per iteration:            0.002 sec (1.9%)

  Total wall-clock time:                             19.916 sec
Performance Timer for 330 iterations on 20 compute nodes
  Average wall-clock time per iteration:              0.068 sec
  Global reductions per iteration:                      101 ops
  Global reductions time per iteration:               0.000 sec (0.0%)
  Message count per iteration:                         7687 messages
  Data transfer per iteration:                        5.265 MB
  LE solves per iteration:                                3 solves
  LE wall-clock time per iteration:                   0.020 sec (28.7%)
  LE global solves per iteration:                         1 solves
  LE global wall-clock time per iteration:            0.000 sec (0.6%)
  LE global matrix maximum size:                       192
  AMG cycles per iteration:                           3.000 cycles
  Relaxation sweeps per iteration:                      212 sweeps
  Relaxation exchanges per iteration:                     0 exchanges
  LE early protections (stall) per iteration:           0.000 times
  LE early protections (divergence) per iteration:      0.000 times
  Total SVARS touched:                              369
  Time-step updates per iteration:                     0.30 updates
  Time-step wall-clock time per iteration:            0.001 sec (2.1%)

  Total wall-clock time:                             22.477 sec
Performance Timer for 161 iterations on 40 compute nodes
  Average wall-clock time per iteration:              0.023 sec
  Global reductions per iteration:                      101 ops
  Global reductions time per iteration:               0.000 sec (0.0%)
  Message count per iteration:                        13744 messages
  Data transfer per iteration:                        7.996 MB
  LE solves per iteration:                                3 solves
  LE wall-clock time per iteration:                   0.008 sec (33.1%)
  LE global solves per iteration:                         1 solves
  LE global wall-clock time per iteration:            0.001 sec (4.0%)
  LE global matrix maximum size:                       602
  AMG cycles per iteration:                           3.000 cycles
  Relaxation sweeps per iteration:                      200 sweeps
  Relaxation exchanges per iteration:                     0 exchanges
  LE early protections (stall) per iteration:           0.000 times
  LE early protections (divergence) per iteration:      0.000 times
  Total SVARS touched:                              369
  Time-step updates per iteration:                     0.31 updates
  Time-step wall-clock time per iteration:            0.001 sec (2.6%)

  Total wall-clock time:                              3.732 sec
Performance Timer for 161 iterations on 60 compute nodes
  Average wall-clock time per iteration:              0.018 sec
  Global reductions per iteration:                      101 ops
  Global reductions time per iteration:               0.000 sec (0.0%)
  Message count per iteration:                        22381 messages
  Data transfer per iteration:                       10.313 MB
  LE solves per iteration:                                3 solves
  LE wall-clock time per iteration:                   0.006 sec (34.2%)
  LE global solves per iteration:                         1 solves
  LE global wall-clock time per iteration:            0.001 sec (5.6%)
  LE global matrix maximum size:                       612
  AMG cycles per iteration:                           3.000 cycles
  Relaxation sweeps per iteration:                      197 sweeps
  Relaxation exchanges per iteration:                     0 exchanges
  LE early protections (stall) per iteration:           0.000 times
  LE early protections (divergence) per iteration:      0.000 times
  Total SVARS touched:                              369
  Time-step updates per iteration:                     0.31 updates
  Time-step wall-clock time per iteration:            0.001 sec (3.1%)

  Total wall-clock time:                              2.843 sec
Performance Timer for 161 iterations on 80 compute nodes
  Average wall-clock time per iteration:              0.016 sec
  Global reductions per iteration:                      101 ops
  Global reductions time per iteration:               0.000 sec (0.0%)
  Message count per iteration:                        31017 messages
  Data transfer per iteration:                       12.356 MB
  LE solves per iteration:                                3 solves
  LE wall-clock time per iteration:                   0.007 sec (41.4%)
  LE global solves per iteration:                         1 solves
  LE global wall-clock time per iteration:            0.001 sec (7.4%)
  LE global matrix maximum size:                       623
  AMG cycles per iteration:                           3.000 cycles
  Relaxation sweeps per iteration:                      199 sweeps
  Relaxation exchanges per iteration:                     0 exchanges
  LE early protections (stall) per iteration:           0.000 times
  LE early protections (divergence) per iteration:      0.000 times
  Total SVARS touched:                              369
  Time-step updates per iteration:                     0.31 updates
  Time-step wall-clock time per iteration:            0.001 sec (3.3%)

  Total wall-clock time:                              2.631 sec
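
The logs above all share the same layout, so the numbers can be pulled out with a short script rather than copied by hand. The following is a minimal sketch that assumes the text format stays exactly as shown; `parse_timer_log` and the dictionary keys `iterations` and `processes` are names I made up for illustration, not part of Fluent. Note that "compute nodes" in the log header is the number of parallel processes, which is what the timing table calls cores.

```python
import re

# Header line, e.g. "Performance Timer for 161 iterations on 40 compute nodes".
# \s* after "for" tolerates a missing space in pasted logs.
HEADER = re.compile(r"Performance Timer for\s*(\d+) iterations on (\d+) compute nodes")
# Field line, e.g. "  Message count per iteration:    13744 messages".
FIELD = re.compile(r"^\s*(.+?):\s+([\d.]+)")


def parse_timer_log(text):
    """Split pasted log text into one dict per 'Performance Timer' block."""
    runs = []
    for line in text.splitlines():
        header = HEADER.search(line)
        if header:
            runs.append({"iterations": int(header.group(1)),
                         "processes": int(header.group(2))})
            continue
        field = FIELD.match(line)
        if field and runs:
            # Store every numeric field under its label as printed in the log.
            runs[-1][field.group(1)] = float(field.group(2))
    return runs


# Excerpt copied from the 40-process log above.
sample = """Performance Timer for 161 iterations on 40 compute nodes
  Average wall-clock time per iteration:              0.023 sec
  Message count per iteration:                        13744 messages
  Total wall-clock time:                              3.732 sec
"""

runs = parse_timer_log(sample)
for run in runs:
    print(run["processes"], run["Average wall-clock time per iteration"])
```

Keeping the labels as dictionary keys means no field list has to be maintained; the trade-off is that the keys are long and would change if the log wording changed.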



To repost this article, please contact the original author for permission and credit 姚程's ScienceNet blog as the source.

Link: https://wap.sciencenet.cn/blog-531760-1361900.html?mobile=1
