Hi,
I'm currently trying to get the best read performance I can out of FreeBSD with NVMe drives, in order to figure out how many servers my workflow will require (reading a lot of 60MB DPX files per second).
For testing purposes, I'm using 4x 1TB Samsung 990 Pro drives in a system with a Threadripper 3970X and 128GB of DDR4-3200.
All tests are done with iozone, with the block size matched to the recordsize of the ZFS dataset, and with 5 files of 60GB each so that the working set is larger than the ARC can hold.
Here is an example of the command used, with 1M blocks:
iozone -R -l 5 -u 5 -r 1M -s 60g -F /nvme1/tmp1 /nvme1/tmp2 /nvme1/tmp3 /nvme1/tmp4 /nvme1/tmp5
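For reference, each single-drive pool was created more or less like this (pool name and device node are from my setup, and the exact commands are from memory, so adjust as needed):

zpool create nvme1 nvd0
zfs set recordsize=1M nvme1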
I've stumbled on a bottleneck, and I can't figure out the cause: the system won't go higher than 13GB/s in reads.
Tested separately, with 4 single-drive pools, each NVMe shows the same speed: around 6GB/s in reads.
Running the test on two pools simultaneously, throughput doubles: I get 12GB/s in reads.
The same simultaneous test on 3 pools gives almost the same result, around 13GB/s, as if I had hit a limit somewhere.
That seems confirmed by the test on all 4 pools: still 13GB/s.
If I look at CPU load while the 4 tests run simultaneously, it sits around 40 percent, with peaks at 60 percent.
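For what it's worth, I'm watching the load with top in per-CPU, per-thread mode, plus interrupt counts from vmstat, roughly like this:

top -SHPI
vmstat -i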
I also tested with 2 mirror pools and got the same results: tested separately, each reached 12GB/s in reads; tested simultaneously, the cumulative bandwidth was limited to 13GB/s.
Same limit with a single RAIDZ1 pool of all 4 drives: it's stuck at 13GB/s.
Does anyone have any idea what's happening, or can point me to how I can find what's limiting my system?
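In case it helps, here's what I was planning to check next (commands as I understand them; device names are from my box):

# negotiated PCIe link width/speed for the first drive
pciconf -lc nvme0
# per-disk throughput (physical providers only) while the benchmark runs
gstat -p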