Intel memory latency and bandwidth benchmark tool


BackupProphet

Has anyone tested this?
Intel® Memory Latency Checker v3.1a | Intel® Software

Seems really cool. Here you can also see how much overhead a NUMA setup adds.

Result server: Dual Xeon L5630 6x4GB PC3-10600R

Code:
Intel(R) Memory Latency Checker - v3.1a
Measuring idle latencies (in ns)...
                Numa node
Numa node            0       1
       0          89.4   132.4
       1         131.9    88.4

Measuring Peak Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads  :  25363.8
3:1 Reads-Writes :  25131.9
2:1 Reads-Writes :  25691.8
1:1 Reads-Writes :  26879.7
Stream-triad like:  27770.1

Measuring Memory Bandwidths between nodes within system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
                Numa node
Numa node            0        1
       0       14952.5   9618.8
       1        9602.5  14959.1

Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject  Latency Bandwidth
Delay  (ns)  MB/sec
==========================
00000  108.46  24999.7
00002  108.40  25011.0
00008  108.55  25055.5
00015  108.61  25074.7
00050  108.57  25059.3
00100  103.01  20258.9
00200  94.35  10744.6
00300  92.59  7317.7
00400  91.58  5675.6
00500  91.04  4687.5
00700  90.42  3553.8
01000  90.01  2703.8
01300  89.81  2248.1
01700  89.63  1887.0
02500  89.46  1512.8
03500  89.36  1285.7
05000  89.30  1115.4
09000  89.23  938.9
20000  89.16  817.5

Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT  latency  36.1
Local Socket L2->L2 HITM latency  41.1
Remote Socket LLC->LLC HITM latency (data address homed in writer socket)
                      Reader Numa Node
Writer Numa Node         0       1
            0            -   136.5
            1        136.2       -
Remote Socket LLC->LLC HITM latency (data address homed in reader socket)
                      Reader Numa Node
Writer Numa Node         0       1
            0            -   104.8
            1        104.8       -
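
To put a number on the NUMA overhead, here is a minimal sketch that averages the local (diagonal) and remote (off-diagonal) entries of the two node matrices from the server result above. The helper name `numa_overhead` is made up for illustration, and the values are copied by hand from the output rather than parsed from it. It assumes rows are the requesting node and columns the memory node.

```python
# Idle latencies (ns) and read-only bandwidths (MB/sec) copied from the
# dual Xeon L5630 result above. Assumed layout: row = requesting node,
# column = node that owns the memory.
idle_latency_ns = [[89.4, 132.4], [131.9, 88.4]]
bandwidth_mb_s = [[14952.5, 9618.8], [9602.5, 14959.1]]


def numa_overhead(matrix):
    """Return (mean local, mean remote, remote/local ratio) for a square node matrix.

    Hypothetical helper, not part of Intel MLC.
    """
    n = len(matrix)
    local = [matrix[i][i] for i in range(n)]
    remote = [matrix[i][j] for i in range(n) for j in range(n) if i != j]
    mean_local = sum(local) / len(local)
    mean_remote = sum(remote) / len(remote)
    return mean_local, mean_remote, mean_remote / mean_local


lat_local, lat_remote, lat_ratio = numa_overhead(idle_latency_ns)
bw_local, bw_remote, bw_ratio = numa_overhead(bandwidth_mb_s)

print(f"latency:   local {lat_local:.1f} ns, remote {lat_remote:.1f} ns, ratio {lat_ratio:.2f}")
print(f"bandwidth: local {bw_local:.1f} MB/sec, remote {bw_remote:.1f} MB/sec, ratio {bw_ratio:.2f}")
```

On these numbers a remote access comes out at roughly 1.49x the local idle latency and about 0.64x the local read bandwidth.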

Result workstation: 4670K @ 3.8 GHz 2x4GB PC3-10600

Code:
Intel(R) Memory Latency Checker - v3.1a
Measuring idle latencies (in ns)...
             Memory node
Socket            0
     0         57.5

Measuring Peak Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads  :  19785.9
3:1 Reads-Writes :  19022.1
2:1 Reads-Writes :  18747.3
1:1 Reads-Writes :  18501.3
Stream-triad like:  18768.9

Measuring Memory Bandwidths between nodes within system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
             Memory node
Socket            0
     0      19679.4

Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject  Latency Bandwidth
Delay  (ns)  MB/sec
==========================
00000  174.56  19525.6
00002  177.05  19538.7
00008  163.46  19516.8
00015  158.24  19420.0
00050  135.28  19040.6
00100  114.88  18011.3
00200  86.43  12093.3
00300  80.91  8866.8
00400  75.58  7052.5
00500  71.22  5764.8
00700  73.08  4488.5
01000  67.16  3512.8
01300  68.03  2912.0
01700  67.68  2492.1
02500  65.76  1996.4
03500  71.90  1606.3
05000  70.80  1378.7
09000  65.97  1260.1
20000  66.85  1088.3

Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT  latency  19.8
Local Socket L2->L2 HITM latency  23.1
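
One way to read the loaded-latency tables is through Little's law (concurrency = throughput x latency). The rough sketch below estimates how many cache lines are in flight at the zero-delay row of each run. The 64-byte line size is an assumption about these CPUs, `lines_in_flight` is a hypothetical helper, and the result is a back-of-the-envelope figure rather than anything MLC reports.

```python
CACHE_LINE_BYTES = 64  # assumption: 64-byte cache lines on both systems


def lines_in_flight(bandwidth_mb_s, latency_ns, line_bytes=CACHE_LINE_BYTES):
    """Little's law estimate of outstanding cache lines (hypothetical helper).

    MLC reports MB/sec with 1 MB = 1,000,000 bytes, so MB/sec * 1e6 / 1e9
    gives bytes per nanosecond.
    """
    bytes_per_ns = bandwidth_mb_s * 1e6 / 1e9
    return bytes_per_ns * latency_ns / line_bytes


# Zero-inject-delay rows copied from the two loaded-latency tables above.
server = lines_in_flight(24999.7, 108.46)
workstation = lines_in_flight(19525.6, 174.56)

print(f"server:      ~{server:.1f} cache lines in flight")
print(f"workstation: ~{workstation:.1f} cache lines in flight")
```

Under these assumptions, the server sustains roughly 42 outstanding lines across both sockets at peak load, and the workstation roughly 53.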
 

William

Thank you for the link and post. Might try this one out on a few systems.