performance of KVM/QEMU disks on ZFS volumes

Discussion in 'FreeBSD and FreeNAS' started by Bronek, Dec 27, 2016.

  1. Bronek

    Bronek New Member

    Jun 23, 2015
    Apologies if this question does not belong in this forum. I am running a small number of Windows 10 guests on libvirt-2.5 + qemu-2.7 + linux-4.8 with ZFS. The guests' disks are set up on ZFS zvols; for example, disk C: of guest "lublin" ...

        <disk type='block' device='disk'>
          <driver name='qemu' type='raw' cache='writeback'/>
          <source dev='/dev/zvol/zdata/vdis/lublin'/>
          <target dev='sda' bus='scsi'/>
          <boot order='1'/>
          <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    . . .
        <controller type='scsi' index='0' model='virtio-scsi'>
          <driver queues='4'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    ... is mapped to:

    root@gdansk /etc/modprobe.d # zfs get all zdata/vdis/lublin
    NAME               PROPERTY              VALUE                  SOURCE
    zdata/vdis/lublin  type                  volume                 -
    zdata/vdis/lublin  creation              Thu Jul 30 20:35 2015  -
    zdata/vdis/lublin  used                  1.14T                  -
    zdata/vdis/lublin  available             2.28T                  -
    zdata/vdis/lublin  referenced            136G                   -
    zdata/vdis/lublin  compressratio         1.00x                  -
    zdata/vdis/lublin  reservation           none                   default
    zdata/vdis/lublin  volsize               160G                   local
    zdata/vdis/lublin  volblocksize          8K                     -
    zdata/vdis/lublin  checksum              on                     default
    zdata/vdis/lublin  compression           off                    inherited from zdata/vdis
    zdata/vdis/lublin  readonly              off                    default
    zdata/vdis/lublin  copies                1                      default
    zdata/vdis/lublin  refreservation        165G                   local
    zdata/vdis/lublin  primarycache          all                    default
    zdata/vdis/lublin  secondarycache        all                    default
    zdata/vdis/lublin  usedbysnapshots       883G                   -
    zdata/vdis/lublin  usedbydataset         136G                   -
    zdata/vdis/lublin  usedbychildren        0                      -
    zdata/vdis/lublin  usedbyrefreservation  150G                   -
    zdata/vdis/lublin  logbias               latency                default
    zdata/vdis/lublin  dedup                 off                    default
    zdata/vdis/lublin  mlslabel              none                   default
    zdata/vdis/lublin  sync                  standard               default
    zdata/vdis/lublin  refcompressratio      1.00x                  -
    zdata/vdis/lublin  written               14.7G                  -
    zdata/vdis/lublin  logicalused           990G                   -
    zdata/vdis/lublin  logicalreferenced     134G                   -
    zdata/vdis/lublin  snapshot_limit        none                   default
    zdata/vdis/lublin  snapshot_count        none                   default
    zdata/vdis/lublin  snapdev               hidden                 default
    zdata/vdis/lublin  context               none                   default
    zdata/vdis/lublin  fscontext             none                   default
    zdata/vdis/lublin  defcontext            none                   default
    zdata/vdis/lublin  rootcontext           none                   default
    zdata/vdis/lublin  redundant_metadata    all                    default
    The pool is also set up with a dedicated 4GB SLOG device on an NVMe drive and a 250GB L2ARC on the same NVMe DC3700; there is also 16GB of RAM for the ARC to use (out of 128GB total):

    root@gdansk /etc/modprobe.d # cat zfs.conf
    # Enforce max ZFS ARC size to 16GB = 16*1024*1024*1024 = 17179869184
    options zfs zfs_arc_max=17179869184
    # Enforce synchronous scsi scan, to prevent zfs driver loading before disks are available
    options scsi_mod scan=sync
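The byte value in that comment can be double-checked with a few lines of Python. This is only a minimal sketch: the helper name `arc_max_option` is made up for illustration, and it assumes "16GB" means 16 GiB, as the comment's own arithmetic does.

```python
GIB = 1024 ** 3  # bytes in one GiB, matching the 1024*1024*1024 in the comment


def arc_max_option(gib: int) -> str:
    """Build the modprobe line that caps the ZFS ARC at `gib` GiB.

    Hypothetical helper, used here only to verify the arithmetic.
    """
    if gib <= 0:
        raise ValueError("ARC size must be a positive number of GiB")
    return f"options zfs zfs_arc_max={gib * GIB}"


if __name__ == "__main__":
    print(arc_max_option(16))
```

Calling it with a different number of GiB gives the line to put in zfs.conf if the cap is raised later.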
    Despite all this, disk performance inside the guests is very often unsatisfactory. By this I mean that the IO speed reported by Task Manager rarely exceeds 3MB/s and sometimes dips below 1MB/s. This happens during IO-intensive work on C:, such as software updates (e.g. when Adobe cloud is upgrading Photoshop). During these times some Windows processes often become unresponsive for short periods. At the same time the host is fine and responsive.

    Any hints on how to improve that? Or at least a tried-and-tested setup to use when a ZVOL serves as a QEMU guest disk? Or is such a setup generally not recommended, and should I move pronto to QCOW2?
  2. Terry Kennedy

    Terry Kennedy Well-Known Member

    Jun 25, 2015
    You might want to try the Linux forum here. This side is pretty much FreeBSD (where ZFS is natively included). For example, I don't know what version of ZFS that is.
    That's something that needs to be investigated at every layer: host OS, virtualization, and guest OS. It's a lot easier to track these things down when the host and guest OS are the same "brand", since bug reports can't be closed with "oh, it's your other operating system... over there" type things.

    ZFS even on slow disks and CPUs should be able to achieve at least 100Mbyte/sec. Here's a graph from 6+ years ago showing 500MB/sec writes for extended periods (hours):

  3. gigatexal

    gigatexal I'm here to learn

    Nov 25, 2012
    Does the Windows guest have the latest virtio drivers?
  4. Bronek

    Bronek New Member

    Jun 23, 2015
    Ah, that's a good question. No, the Windows guests are using the old version 0.112. I will upgrade to 0.126 and report back. BTW, I cannot find an option to move this thread to a more appropriate group; is this something only a moderator can do? Or should I simply start a new thread in the Linux group, or look more closely again?
  5. Bronek

    Bronek New Member

    Jun 23, 2015
    I do have some guests running the same distribution (Arch) and the same kernel version as the host, with guest disks set up the same way as for the Windows 10 guests, i.e. a block device mapped to a ZVOL. But I have not tried benchmarking IO on those; this is something I need to look into. Thanks for the help!
  6. Markus

    Markus Member

    Oct 25, 2015
    What about the zpool settings?

    I accidentally activated deduplication on my pool and don't have enough RAM for it. Once the buffer is full, my speed is very low...

  7. Bronek

    Bronek New Member

    Jun 23, 2015
    I am certain I do not have deduplication enabled. Also the bad guest performance happens when the host has at least 20GB memory free. Actually I could safely increase zfs_arc_max from 16GB (the host has 128GB of which usually only half is used by guests), is there some way to calculate how much ARC I need for a given pool size and/or L2ARC size?
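One part of that question can at least be roughed out: every record cached in L2ARC keeps a small header in ARC (RAM). The sketch below is only a back-of-the-envelope estimate. The function name is invented, the per-record header size differs between ZFS versions, and the 100-byte figure in the usage line is a placeholder assumption, not a measured value for this setup.

```python
def l2arc_header_bytes(l2arc_bytes: int, record_bytes: int,
                       header_bytes_per_record: int) -> int:
    """Estimate the RAM taken by ARC headers that index an L2ARC device.

    Assumes the device is completely filled with records of one size,
    which is a worst-case simplification.
    """
    if record_bytes <= 0:
        raise ValueError("record size must be positive")
    records = l2arc_bytes // record_bytes
    return records * header_bytes_per_record


if __name__ == "__main__":
    # 250 GiB L2ARC and 8K volblocksize come from the post above;
    # 100 bytes per header is a placeholder, check your ZFS version.
    estimate = l2arc_header_bytes(250 * 1024 ** 3, 8 * 1024, 100)
    print(f"{estimate / 1024 ** 3:.2f} GiB of ARC used for L2ARC headers")
```

The point of the exercise is that small records (such as an 8K volblocksize) multiply the header count, so a large L2ARC can eat a noticeable share of a 16GB ARC.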
    Last edited: Dec 28, 2016
  8. Bronek

    Bronek New Member

    Jun 23, 2015
    I have an idea what this might be (although I am not 100% sure). The performance jumped when I switched on the option "Turn off Windows write-cache buffer flushing" (in the disk policies of the Windows 10 guest). That is, I disabled flushing of the "disk" buffer, where the disk is actually a ZVOL on ZFS. As far as I understand, the buffer flushing is synchronous (i.e. it blocks writes in the guest) and it kicks in every second. I guess this is mapped (via virtio-scsi and qemu) to fsync, which in turn triggers a synchronous flush of the ZIL. When there are plenty of writes in Windows, this turns into even more writes in ZFS, and the result is that under heavy IO load disk performance in the guest eventually drops to an embarrassing level. Disabling flushing of the "disk" buffer means that either 1) ZIL flushing reverts to as-designed, which IIRC is every 5s, or 2) the flush is still issued by the Windows guest but no longer blocks writes in the guest. Either way, this single switch let me regain good performance in the guest.
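The general effect described here, frequent synchronous flushes throttling write throughput, can be observed on any machine with a small Python sketch. It is not a reproduction of the Windows/QEMU/ZFS path: it only times writes to a temporary file with an fsync after every block versus a single fsync at the end. The function names, block size and block count are arbitrary choices for illustration.

```python
import os
import tempfile
import time


def timed_writes(path: str, count: int, block: bytes, sync_every: int) -> float:
    """Write `count` blocks to `path` and return the elapsed seconds.

    fsync is called after every `sync_every` writes (0 means never during
    the loop), plus once at the end so both variants finish durable.
    """
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for i in range(1, count + 1):
            os.write(fd, block)
            if sync_every and i % sync_every == 0:
                os.fsync(fd)  # force the data to stable storage now
        os.fsync(fd)
    finally:
        os.close(fd)
    return time.perf_counter() - start


def compare(count: int = 200, block_size: int = 8192):
    """Return (seconds with fsync per write, seconds with one fsync, file size)."""
    block = b"\0" * block_size
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "probe.bin")
        frequent = timed_writes(path, count, block, sync_every=1)
        rare = timed_writes(path, count, block, sync_every=0)
        size = os.path.getsize(path)
    return frequent, rare, size


if __name__ == "__main__":
    frequent, rare, size = compare()
    print(f"fsync after every write: {frequent:.3f}s")
    print(f"single fsync at the end: {rare:.3f}s ({size} bytes written)")
```

How large the gap is depends entirely on the storage underneath; on a RAM-backed temporary directory the two timings may be nearly identical.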