Ok - I have tested a bit:
The performance figures I quoted earlier were measured over iSCSI, also from a Windows VM, but with CrystalDiskMark, so those numbers are different and not really comparable.
Setup:
Windows VM with 16 cores, hosted on ESXi, on an NFS datastore backed by my mirror of P4510s on another machine, with a 40Gbps network in between. Windows file cache turned off.
Using: diskspd.exe -c10G -d60 -s -o32 -b128k -L
Queue depth 32, sequential reads, block size 128k, 60 second runs against a 10G test file. AvgLat and LatStdDev in the tables below are in milliseconds.
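The per-thread-count runs below presumably just varied diskspd's -t flag; the command as quoted also omits the target file, so testfile.dat here is a stand-in:
Code:
diskspd.exe -c10G -d60 -s -t1 -o32 -b128k -L testfile.dat
diskspd.exe -c10G -d60 -s -t4 -o32 -b128k -L testfile.dat
diskspd.exe -c10G -d60 -s -t8 -o32 -b128k -L testfile.dat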
One thread
Code:
bytes | I/Os | MiB/s | I/O per s | AvgLat | LatStdDev
-----------------------------------------------------------------------------------------------------
86712254464 | 661562 | 1377.94 | 11023.50 | 2.902 | 0.126
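As a quick sanity check on that row: 86712254464 bytes over the 60 second run works out to 86712254464 / 60 / 2^20 ≈ 1378 MiB/s, in line with the reported 1377.94.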
4 threads
Code:
bytes | I/Os | MiB/s | I/O per s | AvgLat | LatStdDev
-----------------------------------------------------------------------------------------------------
97835679744 | 746427 | 1554.70 | 12437.57 | 8.444 | 108.991
8 threads
Code:
bytes | I/Os | MiB/s | I/O per s | AvgLat | LatStdDev
-----------------------------------------------------------------------------------------------------
93093232640 | 710245 | 1479.34 | 11834.68 | 18.926 | 1.483
Random read/write tests (40% writes via -w40, caching disabled with -Sh):
diskspd.exe -c10G -d60 -r -o32 -Sh -b128k -w40 -L
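As above, the thread counts presumably came from adding -t, e.g. for the 4-thread run (testfile.dat again a stand-in for the unnamed target):
Code:
diskspd.exe -c10G -d60 -r -t4 -o32 -Sh -b128k -w40 -L testfile.dat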
1 thread
Code:
bytes | I/Os | MiB/s | I/O per s | AvgLat | LatStdDev
-----------------------------------------------------------------------------------------------------
77441794048 | 590834 | 1230.62 | 9844.96 | 3.250 | 1.652
4 threads
Code:
bytes | I/Os | MiB/s | I/O per s | AvgLat | LatStdDev
-----------------------------------------------------------------------------------------------------
86493102080 | 659890 | 1374.45 | 10995.60 | 11.638 | 14.581
8 threads
Code:
bytes | I/Os | MiB/s | I/O per s | AvgLat | LatStdDev
-----------------------------------------------------------------------------------------------------
82535120896 | 629693 | 1311.56 | 10492.46 | 24.395 | 15.338
ESXi datastore hosted on iSCSI - sequential reads, block size 128k
One thread, sequential, queue depth 32
Code:
bytes | I/Os | MiB/s | I/O per s | AvgLat | LatStdDev
-----------------------------------------------------------------------------------------------------
97242841088 | 741904 | 1545.28 | 12362.21 | 2.588 | 1.026
One thread, sequential, queue depth 1
Code:
bytes | I/Os | MiB/s | I/O per s | AvgLat | LatStdDev
-----------------------------------------------------------------------------------------------------
100770775040 | 768820 | 1601.33 | 12810.62 | 0.078 | 0.218
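Queue depth 1 here presumably means the -o32 above was swapped for -o1; a sketch of what the single-thread run would look like (testfile.dat again a stand-in):
Code:
diskspd.exe -c10G -d60 -s -t1 -o1 -b128k -L testfile.dat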
4 threads, sequential, queue depth 1
Code:
bytes | I/Os | MiB/s | I/O per s | AvgLat | LatStdDev
-----------------------------------------------------------------------------------------------------
146684903424 | 1119117 | 2330.96 | 18647.69 | 0.214 | 0.209
8 threads, sequential, queue depth 1
Code:
bytes | I/Os | MiB/s | I/O per s | AvgLat | LatStdDev
-----------------------------------------------------------------------------------------------------
155370782720 | 1185385 | 2469.44 | 19755.53 | 0.404 | 0.312
For comparison - a datastore hosted on an Intel Optane 900P local to the ESXi host - I usually only use this one for swap.
4 threads, sequential, queue depth 1
Code:
bytes | I/Os | MiB/s | I/O per s | AvgLat | LatStdDev
-----------------------------------------------------------------------------------------------------
172875579392 | 1318936 | 2747.15 | 21977.18 | 0.181 | 0.066
8 threads, sequential, queue depth 1
Code:
bytes | I/Os | MiB/s | I/O per s | AvgLat | LatStdDev
-----------------------------------------------------------------------------------------------------
166860816384 | 1273047 | 2651.58 | 21212.61 | 0.376 | 0.156
8 threads, sequential, queue depth 32
Code:
bytes | I/Os | MiB/s | I/O per s | AvgLat | LatStdDev
-----------------------------------------------------------------------------------------------------
131612409856 | 1004123 | 2091.44 | 16731.54 | 13.387 | 1.279
All in all decent numbers for my not-so-organized test - not quite the spec-sheet figures, but I think the network eats a fair share of the IOPS.
Hosting the datastore via iSCSI is definitely faster than NFS, both in raw transfer speed and in latency. Local NVMe is naturally faster still, so P4510s installed locally would probably be very fast.
I don't have any SAS SSDs, so I guess I'd need to buy a couple of smaller ones and see how they perform before deciding to pony up for bigger ones.