How to test CPU performance - CentOS 7

mikexrv

New Member
Feb 28, 2020
15
0
1
Hi,
I would like to verify CPU performance on CentOS 7. I run HPC software on a cluster with four compute nodes (96 cores in total).

I ran the analysis on all cores, which should load the system to 100%, but when I log in to each compute node, the top command gives me strange information.
For example, the maximum CPU usage on 'compute-node-01' is 361%, whereas on 'compute-node-02' it is 702%. I expect 2400% on each node because each node has two Xeon Gold 6136 CPUs with 12 cores each.

Is there any option or setting in CentOS that could cause this kind of problem? I have used the same HPC application on many different clusters and always got maximum performance. Or is there another tool that could stress the CPU?

I would appreciate any suggestions.

Regards,
Michal
 

MBastian

Active Member
Jul 17, 2016
135
32
28
Düsseldorf, Germany
Verify with a synthetic load generator, e.g. stress from the EPEL repository.
Check the relevant BIOS settings.
Check whether tuned is active.
Check which CPU governor is active.
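The last two checks can be scripted. This is a minimal sketch, assuming a Linux node that exposes the standard cpufreq sysfs files and, optionally, has tuned-adm installed. The helper name count_governors is made up for illustration.

```shell
#!/bin/sh
# Summarise the active cpufreq governor across all logical CPUs.
# count_governors is a hypothetical helper; its optional argument only lets
# the sysfs root be swapped out (default: the standard Linux location).
count_governors() {
    root=${1:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        if [ -r "$f" ]; then
            cat "$f"
        fi
    done | sort | uniq -c
}

# Prints one line per distinct governor, prefixed by how many CPUs use it.
count_governors

# Report the active tuned profile, if tuned is installed.
if command -v tuned-adm >/dev/null 2>&1; then
    tuned-adm active || echo "tuned-adm is installed but reported an error"
fi
```

Running this on every node and comparing the output should show quickly whether one node uses a different governor or tuned profile from the others.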
 

mikexrv

Verify with a synthetic load generator, e.g. stress from the EPEL repository.
I tried stress, but this tool uses one process to load one core, so 24 processes are needed to load every core to 100%.
Is there any tool that uses a single process to load all cores?

Check the relevant BIOS settings.
I will verify the settings tomorrow.

Check whether tuned is active.
It is active.

Check which CPU governor is active.
CPUPOWER_START_OPTS="frequency-set -g performance"
CPUPOWER_STOP_OPTS="frequency-set -g ondemand"
 

MBastian

I tried stress, but this tool uses one process to load one core, so 24 processes are needed to load every core to 100%.
Is there any tool that uses a single process to load all cores?
Stress is pretty versatile. Have a look at the manpage.

Code:
stress --cpu $(cat /proc/cpuinfo | grep ^processor| wc -l)
 

EffrafaxOfWug

Radioactive Member
Feb 12, 2015
1,372
488
83
I ran the analysis on all cores, which should load the system to 100%, but when I log in to each compute node, the top command gives me strange information.
For example, the maximum CPU usage on 'compute-node-01' is 361%, whereas on 'compute-node-02' it is 702%. I expect 2400% on each node because each node has two Xeon Gold 6136 CPUs with 12 cores each.
Without knowing how your software behaves, it's difficult for us to make a call on this. As MBastian says, it'd be a good idea to verify CPU performance with software that everyone can compare to. Is the hardware on all nodes the same? Same CPU and motherboard topology, SMT on in both systems, both BIOS and OS set to the same power-saving settings?

Outside of that there's a wealth of synthetic and real-world benches you can try using. I usually use ffmpeg for testing myself but if you've got pxz installed here's a fairly easy method to max out my 16 threads by compressing data from /dev/urandom:
Code:
cat /dev/urandom|pxz -T 16 -cv - > /dev/null
Load looks like this in htop once it's amassed enough data to occupy all 16 threads:
Code:
  0  [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]   8  [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
  1  [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]   9  [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
  2  [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]   10 [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
  3  [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]   11 [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
  4  [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]   12 [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
  5  [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]   13 [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
  6  [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]   14 [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
  7  [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]   15 [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
  Mem[|||||||||||||||||||||||||||||||||||||||||             3.79G/62.8G]   Tasks: 121, 107 thr, 282 kthr; 16 running
  Swp[|                                                     3.25M/2.00G]   Load average: 8.27 2.54 0.88
                                                                           Uptime: 201 days(!), 01:59:42

  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
22282 effrafax   20   0 2522M 1872M  1904 R 1600  2.9 11:51.53 pxz -T 16 -cv -
22287 effrafax   20   0 2522M 1872M  1904 R 104.  2.9  0:44.46 pxz -T 16 -cv -
22295 effrafax   20   0 2522M 1872M  1904 R 102.  2.9  0:44.47 pxz -T 16 -cv -
22297 effrafax   20   0 2522M 1872M  1904 R 102.  2.9  0:44.33 pxz -T 16 -cv -
22289 effrafax   20   0 2522M 1872M  1904 R 102.  2.9  0:44.43 pxz -T 16 -cv -
22293 effrafax   20   0 2522M 1872M  1904 R 102.  2.9  0:44.49 pxz -T 16 -cv -
22286 effrafax   20   0 2522M 1872M  1904 R 102.  2.9  0:44.30 pxz -T 16 -cv -
22290 effrafax   20   0 2522M 1872M  1904 R 100.  2.9  0:44.50 pxz -T 16 -cv -
22292 effrafax   20   0 2522M 1872M  1904 R 100.  2.9  0:44.20 pxz -T 16 -cv -
22298 effrafax   20   0 2522M 1872M  1904 R 100.  2.9  0:44.51 pxz -T 16 -cv -
22284 effrafax   20   0 2522M 1872M  1904 R 100.  2.9  0:44.37 pxz -T 16 -cv -
22285 effrafax   20   0 2522M 1872M  1904 R 100.  2.9  0:44.47 pxz -T 16 -cv -
22296 effrafax   20   0 2522M 1872M  1904 R 100.  2.9  0:44.36 pxz -T 16 -cv -
22288 effrafax   20   0 2522M 1872M  1904 R 100.  2.9  0:44.52 pxz -T 16 -cv -
22291 effrafax   20   0 2522M 1872M  1904 R 100.  2.9  0:44.30 pxz -T 16 -cv -
22294 effrafax   20   0 2522M 1872M  1904 R 98.4  2.9  0:44.15 pxz -T 16 -cv -
22283 root       20   0  8844  4748  3416 R  2.0  0.0  0:01.10 htop
Unlike stress, this will show you the parent process; as you can see from the above, it's able to hit 1600% CPU without issue and htop gives you a nice visual indication of what each CPU is up to, so it's relatively easy to spot whether your system's hitting the limits or not.
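The same pxz load can be sized to the machine and given a time limit. This is a minimal sketch, assuming pxz is installed and that timeout and nproc from coreutils are available. The function name burn_cpus and its two arguments are made up for illustration.

```shell
#!/bin/sh
# burn_cpus is a hypothetical wrapper around the pxz command shown above.
#   $1 = run time in seconds (default 60)
#   $2 = number of pxz threads (default: CPUs available to this process)
burn_cpus() {
    seconds=${1:-60}
    threads=${2:-$(nproc)}
    # Compress random data and discard the result; timeout stops pxz
    # once the requested number of seconds has elapsed.
    timeout "$seconds" pxz -T "$threads" -cv - < /dev/urandom > /dev/null
}

# Example (not run automatically): load every available CPU for 60 seconds.
# burn_cpus 60
```

Calling burn_cpus 120 16 would load 16 threads for two minutes. Because timeout ends the run, a non-zero exit status is expected and does not indicate a failure of the test itself.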

Code:
stress --cpu $(cat /proc/cpuinfo | grep ^processor| wc -l)
Useless use of cat award! ;)
Code:
grep ^processor /proc/cpuinfo|wc -l
...is a cleaner way of writing the command substitution. Since it's just grabbing the total number of CPUs, though, you might want to consider using:
Code:
nproc --all
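That count also gives the ceiling to expect in top, which by default scales %CPU so that one fully busy logical CPU counts as 100%. A minimal sketch follows; the helper name top_ceiling is made up for illustration.

```shell
#!/bin/sh
# top_ceiling is a hypothetical helper: with top's default scaling, a process
# that keeps N logical CPUs fully busy shows roughly N * 100 in %CPU.
top_ceiling() {
    echo "$(( $1 * 100 ))"
}

cpus=$(nproc --all)   # installed logical CPUs, SMT siblings included
echo "logical CPUs: $cpus"
echo "top %CPU ceiling for one process: $(top_ceiling "$cpus")%"
```

Note that nproc counts logical CPUs, so with SMT enabled a node with two 12-core CPUs would report 48 rather than 24.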
 

mikexrv

I did as you suggested. I checked the following on all nodes, and they are identical everywhere:

  • CPU and motherboard topology,
  • SMT on
  • BIOS and OS set to the same power-saving settings
Additionally, I installed pxz and ran the command below, which gave me full CPU load.
Code:
cat /dev/urandom|pxz -T 24 -cv - > /dev/null
After all this, I turned my attention to the software that I use on a daily basis and finally found the cause: the I_MPI_FABRICS setting was wrong.
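For anyone comparing this setting between nodes, here is a minimal sketch that prints the value the current shell sees. The helper name show_fabrics is made up for illustration, and the right value depends on the Intel MPI version and the interconnect, so none is suggested here.

```shell
#!/bin/sh
# show_fabrics is a hypothetical helper: it prints the node name and the
# value of I_MPI_FABRICS in the current environment, or a marker if unset.
show_fabrics() {
    printf '%s: I_MPI_FABRICS=%s\n' "$(uname -n)" "${I_MPI_FABRICS:-<unset>}"
}

show_fabrics
```

The environment of an interactive shell may differ from the one the job launcher passes to the MPI ranks, so the output is only a first hint.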

Thank you for pointing me in the right direction.
 

EffrafaxOfWug

Glad you found the culprit, and even more glad it wasn't a hardware problem and just some software settings :)