Titan X Pascal benchmark on SYS-4028GR-TR

dhenzjhen

Member
Sep 14, 2016
38
55
18
San Jose, California


Motherboard: X10DRG-O / Product Name: SYS-4028GR-TR
BIOS: 7/27/2016
IPMI: 3.44
CPU: Xeon E5-2689 v4 3.1GHz x 2
Memory: Samsung 16GB x 24
GPU: NVIDIA Titan X Pascal x 10
OS: Red Hat Enterprise Linux 7.2 x64
Driver: 367.57
Software: HPL binary from NVIDIA (NDA)


# ./run_me_10_gpu
=======================================================================
HPLinpack 2.1 -- High-Performance Linpack benchmark -- October 26, 2012
Written by A. Petitet and R. Clint Whaley, Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
=======================================================================

An explanation of the input/output parameters follows:
T/V : Wall time / encoded variant.
N : The order of the coefficient matrix A.
NB : The partitioning blocking factor.
P : The number of process rows.
Q : The number of process columns.
Time : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N : 117760 0 0 0
NB : 384
PMAP : Row-major process mapping
P : 5
Q : 2
PFACT : Left
NBMIN : 8
NDIV : 2
RFACT : Left
BCAST : 1ring
DEPTH : 0
SWAP : Spread-roll (long)
L1 : no-transposed form
U : transposed form
EQUIL : no
ALIGN : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0

gpu_dgemm_split from environment variable 1.000
test_loops from environment variable 1

******** TESTING SYSTEM PARAMETERS ********
PARAM [UNITS] MIN MAX AVG
----- ------- --- --- ---
CPU :
CPU_BW [GB/s ] 4.1 4.9 4.5
CPU_FP [GFLPS] 60.8 68.6 64.7
PCIE :
H2D_BW [GB/s ] 2.3 2.4 2.3
D2H_BW [GB/s ] 2.4 2.9 2.7
BID_BW [GB/s ] 4.0 4.2 4.1
GPU :
GPU_BW [GB/s ] 382 385 384
GPU_FP [GFLPS]
NB = 128 382 393 388
NB = 256 387 396 391
NB = 384 388 397 393
NB = 512 389 398 394
NB = 640 390 399 394
NB = 768 390 399 394
NB = 896 390 399 394
NB = 1024 389 399 394
NET :
NET_BW [MB/s ]
8 B 3 5 4
64 B 52 108 63
512 B 156 188 171
4 KB 696 743 719
32 KB 1478 1706 1590
256 KB 2579 2708 2631
2048 KB 3099 3452 3199
16384 KB 2990 3304 3074
NET_LAT [ us ] 0.5 1.6 0.8

displaying Prog:%complete, N:columns, Time:seconds
iGF:instantaneous GF, GF:avg GF, GF_per: process GF


Per-Process Host Memory Estimate: 11.50 GB (MAX) 11.26 GB (MIN)

PCOL: 0 GPU_COLS: 59009 CPU_COLS: 0
PCOL: 1 GPU_COLS: 58753 CPU_COLS: 0
test_loop: 1 of 1
2016-11-02 15:52:02.227

Prog= 1.94% N_left= 116992 Time= 5.96 Time_left= 300.80 iGF= 3548.99 GF= 3548.99 iGF_per= 354.90 GF_per= 354.90
Prog= 3.86% N_left= 116224 Time= 11.63 Time_left= 289.42 iGF= 3687.18 GF= 3616.31 iGF_per=
The output above is cut off here. I uploaded the rest of the progress and time output at http://chmod755.sdf-us.org/titanxpascal

================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR00L2L8 117760 384 5 2 316.71 3.438e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0028690 ...... PASSED

Per-Process Host Memory Estimate: 0.00 GB (MAX) 0.00 GB (MIN)

PCOL: 0 GPU_COLS: 1 CPU_COLS: 0
PCOL: 1 GPU_COLS: 1 CPU_COLS: 0
test_loop: 1 of 1
2016-11-02 15:57:33.703
2016-11-02 15:57:33.703
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR00L2L8 0 384 5 2 0.00 0.000e+00
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0000000 ...... PASSED

Per-Process Host Memory Estimate: 0.00 GB (MAX) 0.00 GB (MIN)

PCOL: 1 GPU_COLS: 1 CPU_COLS: 0
PCOL: 0 GPU_COLS: 1 CPU_COLS: 0
test_loop: 1 of 1
2016-11-02 15:57:33.802
2016-11-02 15:57:33.802
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR00L2L8 0 384 5 2 0.00 0.000e+00
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0000000 ...... PASSED

Per-Process Host Memory Estimate: 0.00 GB (MAX) 0.00 GB (MIN)

PCOL: 0 GPU_COLS: 1 CPU_COLS: 0
PCOL: 1 GPU_COLS: 1 CPU_COLS: 0
test_loop: 1 of 1
2016-11-02 15:57:33.898
2016-11-02 15:57:33.898
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR00L2L8 0 384 5 2 0.00 0.000e+00
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0000000 ...... PASSED
================================================================================

Finished 4 tests with the following results:
4 tests completed and passed residual checks,
0 tests completed and failed residual checks,
0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------

End of Tests.
================================================================================
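
As a sanity check on the result line above, the Gflops figure can be recomputed from N and the wall time. This is a minimal sketch in Python, using the operation count that HPL conventionally credits for an order-N solve (2/3 N^3 + 3/2 N^2). The efficiency figure compares against the GPU_FP average at NB = 384 from the system parameter test; treating that DGEMM rate as the per-GPU ceiling is my assumption, not something the log states.

```python
def hpl_flops(n: int) -> float:
    """Operation count HPL credits for solving an order-n system."""
    return (2.0 / 3.0) * n**3 + 1.5 * n**2

n = 117760             # N from the result line
seconds = 316.71       # Time from the result line
gpus = 10              # P x Q = 5 x 2, one process per GPU (assumed)
dgemm_per_gpu = 393.0  # GPU_FP AVG at NB = 384, in GFLOPS

gflops = hpl_flops(n) / seconds / 1e9
efficiency = gflops / (gpus * dgemm_per_gpu)

print(f"HPL rate: {gflops:.1f} GFLOPS")
print(f"Fraction of measured aggregate DGEMM rate: {efficiency:.1%}")
```

The recomputed rate agrees with the reported 3.438e+03 Gflops, and it works out to a bit under 90% of ten times the measured per-GPU DGEMM rate.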


Note: this is just a quick benchmark run.
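
For anyone wondering where the GPU_COLS numbers in the log come from: HPL spreads the matrix columns over the Q process columns in a block-cyclic layout with block size NB. Below is a rough sketch of that count in Python. The helper `numroc` is my own, modeled on the ScaLAPACK routine of the same name with the source process fixed at 0. It gives 59008 and 58752 columns, while the log reports 59009 and 58753, one more per process column. The post does not explain the extra column, so treat this as an approximation of what the binary does.

```python
def numroc(n: int, nb: int, iproc: int, nprocs: int) -> int:
    """Columns (or rows) of an n-long dimension, split into blocks of nb,
    that land on process iproc in a block-cyclic layout starting at process 0."""
    full_blocks, extra = divmod(n, nb)
    count = (full_blocks // nprocs) * nb   # whole rounds of the cycle
    leftover = full_blocks % nprocs        # full blocks in the last partial round
    if iproc < leftover:
        count += nb                        # gets one more full block
    elif iproc == leftover:
        count += extra                     # gets the trailing partial block
    return count

n, nb, q = 117760, 384, 2                  # N, NB, Q from the run parameters
cols = [numroc(n, nb, pcol, q) for pcol in range(q)]
for pcol, c in enumerate(cols):
    print(f"PCOL: {pcol} columns: {c}")
```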
 

dhenzjhen

# nvidia-smi
Wed Nov 2 15:59:56 2016
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57 Driver Version: 367.57 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 TITAN X (Pascal) Off | 0000:04:00.0 Off | N/A |
| 23% 42C P0 56W / 250W | 0MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 TITAN X (Pascal) Off | 0000:05:00.0 Off | N/A |
| 24% 44C P0 59W / 250W | 0MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 TITAN X (Pascal) Off | 0000:06:00.0 Off | N/A |
| 24% 44C P0 59W / 250W | 0MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 TITAN X (Pascal) Off | 0000:07:00.0 Off | N/A |
| 25% 45C P0 57W / 250W | 0MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 4 TITAN X (Pascal) Off | 0000:08:00.0 Off | N/A |
| 27% 49C P0 58W / 250W | 0MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 5 TITAN X (Pascal) Off | 0000:0B:00.0 Off | N/A |
| 23% 42C P0 57W / 250W | 0MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 6 TITAN X (Pascal) Off | 0000:0C:00.0 Off | N/A |
| 24% 45C P0 58W / 250W | 0MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 7 TITAN X (Pascal) Off | 0000:0D:00.0 Off | N/A |
| 24% 44C P0 57W / 250W | 0MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 8 TITAN X (Pascal) Off | 0000:0E:00.0 Off | N/A |
| 24% 44C P0 57W / 250W | 0MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 9 TITAN X (Pascal) Off | 0000:0F:00.0 Off | N/A |
| 23% 43C P0 58W / 250W | 0MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
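
To put the problem size in context with the 12189MiB per card shown above, here is a quick sketch in Python of how big each process's share of the matrix is. It assumes the N x N double-precision matrix is split evenly over the 10 processes and ignores panels and workspace, so it is a lower bound. The post does not say whether the binary keeps the matrix in GPU or host memory, so the comparison with GPU memory is only indicative.

```python
n = 117760                # N from the HPL run
processes = 5 * 2         # P x Q
gpu_mem_mib = 12189       # per-GPU memory reported by nvidia-smi
bytes_per_double = 8

matrix_bytes = bytes_per_double * n * n
per_process_mib = matrix_bytes / processes / 2**20
per_process_gb = matrix_bytes / processes / 1e9

print(f"Whole matrix: {matrix_bytes / 2**30:.1f} GiB")
print(f"Per process:  {per_process_mib:.0f} MiB ({per_process_gb:.2f} GB)")
print(f"Share of one GPU's memory: {per_process_mib / gpu_mem_mib:.0%}")
```

The per-process share comes out a little below the 11.26 to 11.50 GB host memory estimate printed by the benchmark. The difference is presumably workspace and buffers, which this sketch does not model.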