ZFS performance host vs VM


storageNator

New Member
Aug 6, 2023
Hi,

I am currently testing my RaidZ1 setup.

The plan was to create the ZFS pool (4x 1 TB PCIe NVMe SSDs) on Proxmox and then pass it as a disk to a VM, among other things.
In my fio benchmarks, however, I noticed that the performance on the host was significantly higher (approx. 50%) than in the VM.

Host (PVE 7.4):
write_throughput: (g=0): rw=write, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=libaio, iodepth=16

write: IOPS=243, BW=976MiB/s (1023MB/s)(858GiB/900075msec); 0 zone resets
WRITE: bw=976MiB/s (1023MB/s), 976MiB/s-976MiB/s (1023MB/s-1023MB/s), io=858GiB (921GB), run=900075-900075msec

VM (Debian 12):
I can't find my results right now, but I ran the same fio benchmark there and the write throughput was around 500 MB/s.
I then tried a lot of optimizations, such as setting the CPU type to host, disabling aio threads, and disabling Spectre mitigations, but none of that helped much.
The performance only improved slightly when I changed the block size in the PVE UI to 1 MB before creating the VM disk, but even then it was only around 650 MB/s.


This is the script for my fio benchmark.
The benchmark ran for 15 minutes.

Code:
# benchmark parameters
IODEPTH=16
NUMJOBS=1
BLOCKSIZE=4M
RUNTIME=900   # 15 minutes

#TEST_DIR=/mnt/testLVM3/fiotest
TEST_DIR=/testZFS/fiotest

fio --name=write_throughput --directory=$TEST_DIR --numjobs=$NUMJOBS \
--size=1200G --time_based --runtime=$RUNTIME --ramp_time=2s --ioengine=libaio \
--direct=1 --bs=$BLOCKSIZE --iodepth=$IODEPTH --rw=randwrite \
--group_reporting=1 --iodepth_batch_submit=$IODEPTH \
--iodepth_batch_complete_max=$IODEPTH
I created the ZFS pool in Proxmox and the VM with default settings.

Code:
zpool create -fo 'ashift=12' testZFS raidz /dev/disk/by-id/nvme-Lexar_SSD_NM790_1TB_NLD648R000186P2202 /dev/disk/by-id/nvme-KIOXIA-EXCERIA_PLUS_G3_SSD_8DSKF3M9Z0E9 /dev/disk/by-id/nvme-KIOXIA-EXCERIA_PLUS_G3_SSD_8DSKF3SLZ0E9 /dev/disk/by-id/nvme-Lexar_SSD_NM790_1TB_NLD648R000184P2202
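These are the commands I'd use to double-check what actually got applied (the zvol and storage names below are only examples, adjust them to your own setup):

Code:
# pool layout and health
zpool status testZFS

# confirm the ashift that was applied to the pool
zpool get ashift testZFS

# volblocksize of the zvol backing the VM disk (zvol name is an example)
zfs get volblocksize testZFS/vm-100-disk-0

# the "Block Size" setting from the PVE UI can also be changed on the CLI
# (storage name and value are examples)
pvesm set testZFS --blocksize 1M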

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Does anyone have an idea where the big performance difference between host and VM comes from?
I expected some performance degradation with the VM, but I was thinking around 10%, not 30% or even 50%

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

In addition, the performance of the RAIDZ1 pool also seems very poor to me: when testing the individual disks (only tested on the host), I got about 50% more performance in the same test than from the whole pool.

But according to this formula, I should have seen roughly a factor of 3 higher performance than a single drive:
Streaming write speed: (N - p) * streaming write speed of a single drive
https://static.ixsystems.co/uploads/2020/09/ZFS_Storage_Pool_Layout_White_Paper_2020_WEB.pdf
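Plugging in my numbers (4 drives, 1 parity, and a single-drive write of roughly 1.5 GB/s from the host test mentioned above), the expectation would be something like:

Code:
expected pool write ≈ (N - p) * single-drive write
                    = (4 - 1) * ~1.5 GB/s
                    ≈ 4.5 GB/s          # measured: ~1 GB/s (976 MiB/s)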
 

mbosma

Member
Dec 4, 2018
QEMU schedules the disks' I/O on its main thread, so it lands in a queue handled by the CPU.
I had the same results testing with Intel Optane and RAM disks on older CPUs; at a certain point you'll be CPU bound.
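If you want to experiment, you could give the disk its own I/O thread, roughly like this (the VM ID, storage, and disk reference are just examples):

Code:
# use the single-queue virtio-scsi controller so each disk gets its own iothread
qm set 100 --scsihw virtio-scsi-single
# re-attach the VM disk with iothread enabled (disk reference is an example)
qm set 100 --scsi0 local-zfs:vm-100-disk-0,iothread=1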
What cpu are you using?

Also keep in mind you're using consumer drives, which isn't recommended for ZFS.
 

ttabbal

Active Member
Mar 10, 2016
If your application can run in a container, I found I got much better performance that way. I just figured that the VM overhead was causing issues. Proxmox can mount a host path into the container, and it appears to perform near native.
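For reference, a host path can be bind-mounted into an LXC container like this (container ID and paths are just examples):

Code:
# bind-mount a host directory into container 101
pct set 101 -mp0 /testZFS/data,mp=/mnt/data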

As for performance on Z1... I don't know about that paper. I didn't observe that even on their platform. I mean, to write a block you need to write to all the disks. So there are some parallel paths, but I don't think I ever saw that level of performance on any raidz.
 

storageNator

New Member
Aug 6, 2023
What cpu are you using?
Ryzen 5 5600 6-Core Processor

Also keep in mind you're using consumer drives which isn't recommended for ZFS.
I know, but I will rarely need a lot of performance, so I think these will suffice

If your application can run in a container, I found I got much better performance that way
That's probably what it will come down to if I can't fix the big performance differences in the VM

As for performance on Z1... I don't know about that paper. I didn't observe that even on their platform
What would your expectations be?

I've just done some more research and found some other benchmark results.
There the formula also fits approximately:

1x 4TB, single drive, 3.7 TB, w=108MB/s , rw=50MB/s , r=204MB/s
5x 4TB, raidz1 (raid5), 15.0 TB, w=469MB/s , rw=79MB/s , r=598MB/s
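Checking those numbers against the formula (5 drives, 1 parity):

Code:
(N - p) * single-drive write = (5 - 1) * 108 MB/s = 432 MB/s
measured raidz1 write        = 469 MB/s   # roughly in line with the formula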