Hi,
I have a couple of servers connected via a Mellanox SX6018; all servers are running ConnectX-3s (non-Pro).
I am trying to get the holy trinity to work: ESXi with iSER and iSCSI.
I have done all the prerequisite steps: created the iSER adapter and updated both my switch and the network modules with flow-control configuration.
But every time I rescan my iSER iSCSI adapter in ESXi, nothing happens.
Looking into vmkernel.log I see that some errors are being logged:
Code:
2020-05-20T14:52:13.953Z cpu14:2098050)WARNING: rdmaDriver: RDMAGetValidGidType:1896: Protocol not supported by device
2020-05-20T14:52:13.953Z cpu14:2098050)WARNING: rdmaDriver: RDMACM_BindLegacy:3290: Underlying device does not support requested gid/RoCE type. Failed with status: Protocol not supported
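For reference, this is how I check what the ESXi RDMA stack reports for the adapter (stock esxcli commands on 6.5+; your output will obviously differ per host):

```shell
# List RDMA-capable devices and the driver bound to them
esxcli rdma device list

# Show which vmkernel NICs are bound to which RDMA device
esxcli rdma device vmknic list
```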
My ESXi module has RoCE v2 disabled, since the card obviously does not support it:
Code:
[root@vms1:~] esxcfg-module -g nmlx4_core
nmlx4_core enabled = 1 options = 'enable_rocev2=0'
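(For anyone following along, that option was set with the standard module-parameter commands; it needs a host reboot to take effect:)

```shell
# Disable RoCE v2 in the nmlx4_core driver (persists across reboots)
esxcli system module parameters set -m nmlx4_core -p "enable_rocev2=0"

# Verify the option took
esxcfg-module -g nmlx4_core
```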
My test machine where I have the iSCSI target:
Code:
[bbs@testnas ~]$ dmesg|grep Mellanox
[ 3.560060] mlx4_core: Mellanox ConnectX core driver v5.0-2.1.8
[ 11.382985] mlx4_en: Mellanox ConnectX HCA Ethernet driver v5.0-2.1.8
[ 11.403233] <mlx4_ib> mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v5.0-2.1.8
[ 28.274089] mlx4_core: Mellanox ConnectX core driver v5.0-2.1.8
[ 34.724035] mlx4_en: Mellanox ConnectX HCA Ethernet driver v5.0-2.1.8
[ 34.762727] <mlx4_ib> mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v5.0-2.1.8
[bbs@testnas ~]$
[bbs@testnas ~]$ dmesg|grep iser
[ 0.000000] ACPI: SSDT 0x00000000EDDE83C0 000573 (v03 HP riser0 00000002 INTL 20030228)
[ 35.432518] iscsi: registered transport (iser)
[bbs@testnas ~]$
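On the Linux side I sanity-check the HCA and which RoCE/GID types the kernel actually exposes via sysfs (the device name mlx4_0 and port 1 below are assumptions; adjust to your setup):

```shell
# Dump HCA state, firmware, and port info (from rdma-core)
ibv_devinfo

# Show the RoCE/GID type the kernel reports for each GID index
# (mlx4_0 and port 1 are assumptions -- substitute your device/port)
for f in /sys/class/infiniband/mlx4_0/ports/1/gid_attrs/types/*; do
    printf '%s: ' "$f"; cat "$f" 2>/dev/null || echo "(unreadable)"
done
```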
Linux module parameters:
Code:
[bbs@testnas ~]$ sudo sh ./shmoduleparam.sh mlx4
mlx4
Module: mlx4_ib
Parameter: dev_assign_str -->
Parameter: en_ecn --> N
Parameter: sm_guid_assign --> 0
Module: ib_core
Parameter: netns_mode --> Y
Parameter: recv_queue_size --> 512
Parameter: roce_v1_noncompat_gid --> Y
Parameter: send_queue_size --> 128
Module: mlx4_en
Parameter: inline_thold --> 104
Parameter: pfcrx --> 3
Parameter: pfctx --> 3
Parameter: udev_dev_port_dev_id --> 0
Parameter: udp_rss --> 1
Module: mlx4_core
Parameter: block_loopback --> 1
Parameter: debug_level --> 1
Parameter: enable_4k_uar --> Y
Parameter: enable_64b_cqe_eqe --> Y
Parameter: enable_qos --> Y
Parameter: enable_sys_tune --> 0
Parameter: enable_vfs_qos --> N
Parameter: fast_drop --> 0
Parameter: high_rate_steer --> 0
Parameter: ingress_parser_mode --> 0
Parameter: internal_err_reset --> 1
Parameter: log_mtts_per_seg --> 0
Parameter: log_num_cq --> 16
Parameter: log_num_mac --> 7
Parameter: log_num_mcg --> 13
Parameter: log_num_mgm_entry_size --> -10
Parameter: log_num_mpt --> 19
Parameter: log_num_mtt --> 21
Parameter: log_num_qp --> 19
Parameter: log_num_srq --> 16
Parameter: log_num_vlan --> 0
Parameter: log_rdmarc_per_qp --> 4
Parameter: mlx4_en_only_mode --> 0
Parameter: msi_x --> 1
Parameter: num_vfs -->
Parameter: port_type_array -->
Parameter: probe_vf -->
Parameter: roce_mode --> 1
Parameter: rr_proto --> 0
Parameter: ud_gid_type -->
Parameter: use_prio --> N
Module: mlx_compat
Parameter: compat_base --> mlnx-ofa_kernel-compat-20200401-1937-5f67178
Parameter: compat_base_tree --> mlnx_ofed/mlnx-ofa_kernel-4.0.git
Parameter: compat_base_tree_version --> 5f67178
Parameter: compat_version --> 5f67178
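(shmoduleparam.sh is just a little helper; it's roughly equivalent to walking /sys/module — a minimal sketch, the actual script may differ:)

```shell
# show_module_params: print the parameters of every loaded kernel module
# whose name starts with the given prefix, read from sysfs.
show_module_params() {
    prefix="$1"
    echo "$prefix"
    for mod in /sys/module/"$prefix"*; do
        # Skip non-matching globs and modules without exported parameters
        [ -d "$mod/parameters" ] || continue
        echo "Module: $(basename "$mod")"
        for p in "$mod"/parameters/*; do
            [ -e "$p" ] || continue
            # Some parameters are write-only or root-only; ignore read errors
            echo "Parameter: $(basename "$p") --> $(cat "$p" 2>/dev/null)"
        done
    done
}

show_module_params mlx4
```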
Target configuration:
Code:
[bbs@testnas ~]$ sudo targetcli
targetcli shell version 2.1.fb49
Copyright 2011-2013 by Datera, Inc and others.
For help on commands, type 'help'.
/iscsi> ls * 10
o- iscsi .............................................................................................................. [Targets: 1]
o- iqn.2020-05.root.dom:esxi ........................................................................................... [TPGs: 1]
o- tpg1 .................................................................................................... [gen-acls, no-auth]
o- acls ............................................................................................................ [ACLs: 0]
o- luns ............................................................................................................ [LUNs: 1]
| o- lun0 ......................................................................... [block/esxi (/dev/zd0) (default_tg_pt_gp)]
o- portals ...................................................................................................... [Portals: 1]
o- 0.0.0.0:3260 ..................................................................................................... [iser]
/iscsi>
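For completeness, the portal was flipped to iSER with the standard targetcli-fb command (run non-interactively here; the IQN is mine, adjust to yours):

```shell
# Enable iSER on the existing portal, then persist the config
targetcli /iscsi/iqn.2020-05.root.dom:esxi/tpg1/portals/0.0.0.0:3260 enable_iser true
targetcli saveconfig
```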
I have tried creating an iSCSI connection from the same machine that is running the target, and that works fine.
I have also tried different OSes to see if it was my CentOS that ESXi did not like, but it's the same issue: ESXi says "Protocol not supported by device".
I have been in contact with Mellanox support, and they say the non-Pro version of the card should support RoCE v1 just fine and should run in ESXi. But I don't know if the driver in ESXi is somehow gimped because VMware doesn't want us running old hardware, or if it's something else.
Any ideas from those of you who have had success with iSER and ESXi on ConnectX-3s? Please speak up!