NFSv3 vs NFSv4 vs iSCSI for ESXi datastores


zxv

The more I C, the less I see.
Sep 10, 2017
Instead of iSCSI, you could try iSER? Btw, does anyone know when ESXi will add an NVMe-oF initiator?
I've started testing iSER, and it does have lower latency, but it adds a bunch of new issues.
vSphere 6 cannot mount iSCSI/iSER LUNs that have a 4K block size.

I've reformatted ioDrive2s to a 512-byte block size, and can mount them via iSER, but on the target side, I/O fails with "unaligned DMA" kernel messages.

iSER-exported zvols work, but performance is low compared to the raw NVMe or SSD device.
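
For reference, here's a minimal LIO/targetcli sketch of the kind of zvol-over-iSER export described above. This is only a sketch under assumptions: the zvol path, IQN, and portal are made-up examples, ACLs/auth are omitted, and the target NIC must be RDMA-capable.

# create a block backstore on top of the zvol (path and name are examples)
targetcli /backstores/block create name=zvol1 dev=/dev/zvol/tank/esx1
# create the iSCSI target and map the backstore as LUN 0
targetcli /iscsi create iqn.2019-01.lab.test:zvol1
targetcli /iscsi/iqn.2019-01.lab.test:zvol1/tpg1/luns create /backstores/block/zvol1
# switch the default portal from plain iSCSI to iSER
targetcli /iscsi/iqn.2019-01.lab.test:zvol1/tpg1/portals/0.0.0.0:3260 enable_iser boolean=true
targetcli saveconfig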

So yeah, NVMe-oF seems the way forward, and I'm looking at avoiding ESXi and using KVM instead, since there are fewer limitations there.
 

zxv

The more I C, the less I see.
Sep 10, 2017
I tried iSER a couple of weeks ago on the latest 6.7, using the latest Mellanox drivers for a 50Gb card. Guests stopped responding. Looked at the ESXi console. Purple screen of death :( I haven't had the time or motivation to take another look at it...
I've seen some purple screens as well for certain combinations of versions. I've not seen purple screens when sticking with supported versions of packages, though.

What's worse, though, is that if an iSER device attaches but I/O does not complete, the system will hang; then, when the system reboots, it re-attaches the device during boot and hangs early in the boot process. The only solution is to reinstall the whole system.

To avoid this, I'd recommend: 1) test using iSCSI first (not iSER) to make sure there are no configuration issues, then 2) test RDMA bandwidth and make sure flow control is working, and finally 3) test iSER.
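
For step 2, the perftest tools give a quick RDMA bandwidth sanity check between the target and any Linux test host; the device name (mlx5_0) and IP below are just examples.

# server side (target box)
ib_write_bw -d mlx5_0 -R --report_gbits
# client side, pointing at the server's IP; -R uses rdma_cm, which RoCE setups generally need
ib_write_bw -d mlx5_0 -R --report_gbits 192.168.10.1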
 

efschu2

Member
Feb 14, 2019
I took your suggestion and did some research this weekend. Conclusion: ZoL has much better performance on CentOS, but is still slow.

For comparison (same hardware, default settings, ZoL 0.7.9, benchmarked with pg_test_fsync):

Ubuntu: 2200 IOPS
Debian: 2000 IOPS
CentOS: 8000 IOPS

FreeBSD: 16000 IOPS

Ubuntu + XFS: 34000 IOPS
Ubuntu + ext4: 32000 IOPS
Ubuntu + bcachefs: 14000 IOPS
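
For anyone wanting to reproduce numbers like these, a minimal pg_test_fsync run looks roughly like this; the file path is just an example and must sit on the pool/filesystem under test.

# run the fsync benchmark against a file on the filesystem being tested (5 seconds per test)
pg_test_fsync -f /tank/pgtest/testfile -s 5
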
You could try CentOS instead of Ubuntu for ZFS and then retest your iSER target (but I would recommend SCST over LIO).

Btw, could you share some benchmark results for an iSER ramdisk target?
 

dswartz

Active Member
Jul 14, 2011
I've seen some purple screens as well for certain combinations of versions. I've not seen purple screens when sticking with supported versions of packages, though.

What's worse, though, is that if an iSER device attaches but I/O does not complete, the system will hang; then, when the system reboots, it re-attaches the device during boot and hangs early in the boot process. The only solution is to reinstall the whole system.

To avoid this, I'd recommend: 1) test using iSCSI first (not iSER) to make sure there are no configuration issues, then 2) test RDMA bandwidth and make sure flow control is working, and finally 3) test iSER.
Define 'supported versions of packages'? I installed the latest & greatest OFED driver for ESXi 6.7. The backend is CentOS 7.5, same thing.
 

zxv

The more I C, the less I see.
Sep 10, 2017
Define 'supported versions of packages'? I installed the latest & greatest OFED driver for ESXi 6.7. The backend is CentOS 7.5, same thing.
The reason I mention 'supported versions of packages' is that they offer some level of QA.

For others who aren't already familiar, here are some links to the Mellanox releases:
MLNX_OFED: Firmware - Driver Compatibility Matrix (for Linux distros)
VMware InfiniBand Driver: Firmware - Driver Compatibility Matrix

Just curious, which OFED release are you using for ESXi 6.7, and which model (CX3/Pro/4/5)?
I ask because recent iSER releases are separate, and that makes for more combinations to test.
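
A quick way to check which driver and firmware an ESXi host is actually running (vmnic0 is just an example uplink):

# driver name, driver version, and firmware version for one uplink
esxcli network nic get -n vmnic0
# installed Mellanox VIBs
esxcli software vib list | grep nmlx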
 

Rand__

Well-Known Member
Mar 6, 2014
VMware has not officially supported IB since 6.5 or 6.7, so your only chance for that is the old drivers, 1.8.2.5 or so IIRC. There was a thread somewhere here where one person said he got them to work on 6.7, but I haven't heard much more about it.
 

zxv

The more I C, the less I see.
Sep 10, 2017
You could try CentOS instead of Ubuntu for ZFS and then retest your iSER target (but I would recommend SCST over LIO).

Btw, could you share some benchmark results for an iSER ramdisk target?
That's an excellent suggestion. I'll probably try CentOS either way at some point, but I've not spent any time with it, and it'd require some catching up compared to where I am with Ubuntu and/or Debian.

I've looked at both LIO and SCST. SCST supports certain configuration options, such as logical block size, that LIO doesn't seem to allow. I'm trying LIO currently, just because my impression is that it's getting more development attention. Once the dust has settled from the other driver issues, I plan to take a look at SCST.
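
To illustrate the block-size point, here's a rough /etc/scst.conf sketch; the device name, zvol path, and IQN are made-up, the iSER-specific (isert) setup isn't shown, and the exact attribute names should be checked against the SCST docs.

HANDLER vdisk_blockio {
        DEVICE zvol1 {
                filename /dev/zvol/tank/esx1
                blocksize 512
        }
}

TARGET_DRIVER iscsi {
        enabled 1
        TARGET iqn.2019-01.lab.test:zvol1 {
                enabled 1
                LUN 0 zvol1
        }
}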

Which do you use on CentOS? Just curious whether there are differences in how they're supported there.
 

dswartz

Active Member
Jul 14, 2011
VMware has not officially supported IB since 6.5 or 6.7, so your only chance for that is the old drivers, 1.8.2.5 or so IIRC. There was a thread somewhere here where one person said he got them to work on 6.7, but I haven't heard much more about it.
I can't tell who you are responding to here, me or someone else? If me, I'm not using InfiniBand; it's a ConnectX-5 (MT27800).
 

zxv

The more I C, the less I see.
Sep 10, 2017
VMware has not officially supported IB since 6.5 or 6.7, so your only chance for that is the old drivers, 1.8.2.5 or so IIRC. There was a thread somewhere here where one person said he got them to work on 6.7, but I haven't heard much more about it.
Interesting; if you can find it, that's good news. There's certainly unsupported stuff in ESXi 6.x that still works (LLDP, for example).
 

zxv

The more I C, the less I see.
Sep 10, 2017
I can't tell who you are responding to here, me or someone else? If me, I'm not using InfiniBand; it's a ConnectX-5 (MT27800).
Which driver and iSER version?

I've got both CX3Pro and CX4 VPI, on an Arista 40G switch.
 

dswartz

Active Member
Jul 14, 2011
The reason I mention 'supported versions of packages' is that they offer some level of QA.

For others who aren't already familiar, here are some links to the Mellanox releases:
MLNX_OFED: Firmware - Driver Compatibility Matrix (for Linux distros)
VMware InfiniBand Driver: Firmware - Driver Compatibility Matrix

Just curious, which OFED release are you using for ESXi 6.7, and which model (CX3/Pro/4/5)?
I ask because recent iSER releases are separate, and that makes for more combinations to test.
nmlx5-core 4.17.9.12-1vmw.670.0.0.8169922 VMW VMwareCertified 2018-12-19
nmlx5-rdma 4.17.9.12-1vmw.670.0.0.8169922 VMW VMwareCertified 2018-12-19

I note the VMware site now has 4.17.14.2? Might be worth looking at that?
 

zxv

The more I C, the less I see.
Sep 10, 2017
nmlx5-core 4.17.9.12-1vmw.670.0.0.8169922 VMW VMwareCertified 2018-12-19
nmlx5-rdma 4.17.9.12-1vmw.670.0.0.8169922 VMW VMwareCertified 2018-12-19

I note the VMware site now has 4.17.14.2? Might be worth looking at that?
I tried 4.17.14.2 and, in the same configuration as the inbox drivers, the iSER module would not load.
There were no kernel or other log messages that showed why.
This was with no module parameters set for iSER.

So I'm running the 6.7U1 inbox drivers at the moment.
nmlx5-core 4.17.9.12-1vmw.670.1.28.10302608 VMW
nmlx5-rdma 4.17.9.12-1vmw.670.1.28.10302608 VMW
iser 1.0.0.0-1vmw.670.1.28.10302608 VMW


I'd prefer to run the Mellanox drivers, so I'll probably try them again at some point.
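
For anyone debugging a similar situation, a couple of ESXi commands can confirm whether the iser module is actually installed and loaded (module name as shipped in the 6.7U1 inbox packages):

# list installed iSER / Mellanox VIBs
esxcli software vib list | grep -i -E 'iser|nmlx'
# check whether the iser module is loaded, and try loading it by hand
esxcli system module list | grep -i iser
esxcli system module load -m iser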
 

zxv

The more I C, the less I see.
Sep 10, 2017
I tried the 4.17.14.2 drivers with MLNX-NATIVE-ESX-ISER_1.0.0.2-1OEM-650.0.0.4598673.zip on ESXi 6.7U1.
It did load the iSER module, but pink screened upon attaching the target.
 

dswartz

Active Member
Jul 14, 2011
I tried the 4.17.14.2 drivers with MLNX-NATIVE-ESX-ISER_1.0.0.2-1OEM-650.0.0.4598673.zip on ESXi 6.7U1.
It did load the iSER module, but pink screened upon attaching the target.
My apologies, I misspoke. I had a pink screen, not a purple screen. I think you're not supposed to use the separate iSER module with 6.7 - iSER is built into ESXi.
 

dswartz

Active Member
Jul 14, 2011
Okay, 1 hour of my life I will never get back, lol. Memory refreshed. I now recall I had tried to use the 6.5 iSER driver, and that indeed pink screens. According to this blog post (Testing Mellanox iSER driver in ESXi environment.), iSER is baked into 6.7 and it should 'just work'. It never did for me (e.g. it showed up as an iSCSI adapter, not iSER). Of course, the blog post talks about 'eventually', so who knows. I posted on a VMware forum and a Mellanox forum with no help, so I said screw it, I'm sticking with NFS. If I could get RoCE to work, I'd be a happy camper...
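
One thing that may be relevant here (an assumption on my part about this setup): on 6.7 the inbox iSER initiator usually has to be instantiated by hand before anything shows up as an iSER adapter, roughly like this:

# create the iSER logical adapter on top of the RDMA-capable uplink
esxcli rdma iser add
# it should then appear alongside the other storage adapters
esxcli iscsi adapter list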
 

zxv

The more I C, the less I see.
Sep 10, 2017
I got iSER to work on 6.7U1.

Enabling ECN and PFC on both client and server is a must.
Without both ECN and PFC, it'll work until a bandwidth limit is hit or a buffer is exhausted, at which point the iSER connection goes dead and never recovers.
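
As a sketch of what that looks like in practice (priority 3 is just the common convention; the interface name and sysfs paths are examples and should be checked against the Mellanox DCQCN docs for your driver): on the ESXi side, PFC for the native Mellanox driver is set via module parameters, and on the Linux target side via mlnx_qos plus the mlx5 ECN sysfs knobs.

# ESXi: enable PFC on priority 3 for the nmlx5 driver, then reboot the host
esxcli system module parameters set -m nmlx5_core -p "pfctx=0x08 pfcrx=0x08"
# Linux target: enable PFC on priority 3 (interface name is an example)
mlnx_qos -i ens1 --pfc 0,0,0,1,0,0,0,0
# Linux target: enable ECN (DCQCN) for RoCE on priority 3 (verify paths on your kernel/OFED)
echo 1 > /sys/class/net/ens1/ecn/roce_np/enable/3
echo 1 > /sys/class/net/ens1/ecn/roce_rp/enable/3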

Here are the two packages added to 6.7U1 that appear to work for me:

MLNX-NATIVE-NMLXCLI_1.17.14.2-1OEM-670.0.0.8169922.zip
from the "Management Tools" tab on Mellanox's ESXi driver page:
http://www.mellanox.com/page/products_dyn?product_family=29&mtag=vmware

MLNX-NATIVE-ESX-ConnectX-4-5_4.17.14.2-1OEM-670.0.0-10134338.zip
https://my.vmware.com/group/vmware/...oadGroup=DT-ESXI67-MELLANOX-NMLX5-CORE-417142
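
For completeness, offline bundles like these are installed roughly as follows; the datastore path is an example, the zip has to be referenced by absolute path, and the host needs a reboot afterwards.

# install each offline bundle, then reboot
esxcli software vib install -d /vmfs/volumes/datastore1/MLNX-NATIVE-NMLXCLI_1.17.14.2-1OEM-670.0.0.8169922.zip
esxcli software vib install -d /vmfs/volumes/datastore1/MLNX-NATIVE-ESX-ConnectX-4-5_4.17.14.2-1OEM-670.0.0-10134338.zip
reboot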
 

zxv

The more I C, the less I see.
Sep 10, 2017
Now do you achieve the expected performance? Which one are you using for this, CX3/Pro/CX4?
CX4. I haven't found a way to set ECN on a CX3/CX3 Pro on ESXi 6.x.
With the above packages, the "esxcli mellanox ..." command only works with CX4 and CX5.

Without ECN, the connection goes dead as soon as you reach 90% of full bandwidth on any link.

I've not given up on the CX3 though. This still suggests it's possible:
https://community.mellanox.com/s/ar...abled-lossless-network-on-vmware-esxi-6-5-6-7
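
As a quick sanity check on any of these setups, ESXi can list which uplinks expose RDMA at all (a prerequisite for iSER):

# list RDMA-capable devices and the uplinks they are bound to
esxcli rdma device list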