Tesla P100 x 8 Linpack testing on SYS-4028GR-TXRT

dhenzjhen · Oct 20, 2017

System: Supermicro SYS-4028GR-TXRT
Motherboard: X10DG0-T
CPU: E5-2699 v4 x 2
MEM: 32GB Micron x 12
BIOS: 5/25/17
GPU: Nvidia Tesla P100 SXM2 x 8
OS: Ubuntu 16.04 x64
Driver: 384.81
CUDA: version 9

PigLover · Oct 20, 2017

Now you are just showing off

Seriously - impressive.

dhenzjhen · Oct 20, 2017

PigLover said:
Now you are just showing off

Seriously - impressive.

Tested the P100s while I was at it, before switching to the V100 tray.

Lukas Goe · Jan 15, 2018

Hello,

I found your result here during my research and I am very interested in how exactly you achieved it. I have similar hardware, but my Gflops can't even get close to yours. I would be very grateful if you could give me some additional information or maybe even post your config files (HPL.dat) here.

Which CUDA-Linpack version are you using? The only one I found and use seems rather old: hpl-2.0_FERMI_v15.

I work on a cluster with 7 GPU nodes; each node has the following hardware:

2x Intel Xeon E5-2640 v4
8x DDR4-2400 8 GB Memory
Intel X10DGQ Board
4x Tesla P100 16GB HBM2

CUDA 9.0

The best I have achieved so far is roughly 3500 Gflops with all 28 GPUs. I suspect the benchmark isn't using the GPUs at all, because nvidia-smi shows barely any usage (~45W/300W, 0% GPU-Util, ~2400 MiB Mem), and as far as I know those 14 Xeons should be able to get close to 3500 Gflops on their own. There are no warnings or errors, and every run ends with PASSED.
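As a rough sanity check on that estimate, here is a minimal sketch of the CPU-only theoretical peak. The per-CPU figures are assumptions that are not stated in the post: 10 cores and a 2.4 GHz base clock for each E5-2640 v4, and 16 double-precision FLOPs per cycle per core (AVX2 with two FMA units). Sustained AVX clocks are usually lower than base, so treat the result as an upper bound.

```python
# Rough theoretical FP64 peak for the CPUs in the 7-node cluster.
# Assumed values (not from the post): 10 cores and 2.4 GHz base clock per
# E5-2640 v4, and 16 FP64 FLOPs per cycle per core (AVX2, two FMA units).

def peak_gflops(sockets, cores, ghz, flops_per_cycle):
    """Theoretical peak in GFLOPS: sockets * cores * GHz * FLOPs per cycle."""
    return sockets * cores * ghz * flops_per_cycle

NODES = 7               # from the post
SOCKETS_PER_NODE = 2    # from the post
MEASURED_GFLOPS = 3500.0  # best result reported in the post

cpu_peak = peak_gflops(NODES * SOCKETS_PER_NODE, cores=10, ghz=2.4, flops_per_cycle=16)
ratio = MEASURED_GFLOPS / cpu_peak

print(f"CPU-only theoretical peak: {cpu_peak:.0f} GFLOPS")
print(f"Measured result as a share of CPU peak: {ratio:.0%}")
```

Under these assumptions the 14 CPUs alone have a theoretical peak of about 5,400 GFLOPS, so a 3500 Gflops result is in a range the CPUs could plausibly reach without any GPU contribution.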

I would be very happy about any advice.

Patrick · Jul 1, 2018

Hey @dhenzjhen, on all of the SXM2 servers, the V100 needs a different tray because NVLink is 300 GB/s instead of 80 GB/s on the P100 variants, correct?

Can the P100 GPUs be used in the V100 300 GB/s tray?
