EOS RDMA configuration?


11notes

New Member
Mar 13, 2022
5
0
1
Does anyone know how to configure RDMA on EOS (Arista)? I've searched everywhere but cannot find a single example, guide, or blog post. I would like to switch my vSAN cluster from plain Ethernet to RDMA; the NICs support it and the switches support it. I have two Arista DCS-7050QX-32S running firmware 4.26.4. RDMA is enabled in vSphere, but of course I'm missing the configuration on the switch side. I'm not a network engineer. Each server has one uplink to each switch. The two switches are connected to each other via a single QSFP+ port. vSphere complains about missing PFC and DCB configuration on the switch.

I would even be willing to pay something for anyone who can help me out.
 

i386

Well-Known Member
Mar 18, 2016
4,245
1,546
113
34
Germany
iWARP or RoCE?

"I've searched everywhere but I do not find a single example or guide or blogpost or anything."
That's because RDMA works without any special configuration on the switch. The additional configuration for PFC, DCB, etc. makes sure that RDMA traffic gets the correct priority and that no RDMA frames are "lost".
"vSphere complains about missing PFC as well as DCB configuration on the switch."
That's what you need to google :D
 

11notes

RoCE v2. I've seen your link, but as a non-network person this is all gibberish to me.

dcbx application tcp-sctp 860 priority 5

This shows how to map TCP port 860 (iSCSI) to a priority, but I would not even know which port vSAN uses.

When I check a server port (used for vSAN) with dcbx I get this:

show dcbx Et21/1
Ethernet21/1:
No IEEE DCBX TLVs were received
 

i386

"this is all gibberish to me"
After reading this I started getting really uncomfortable :D
"I would not even know which source port vSAN uses."
There is a lot of documentation (official and third party) about ESXi/vSphere and its components.
Google your version, find out the ports, and then read the Arista documentation (and google or ask about anything you don't know).
 

11notes

No worries. You feel uncomfortable because I'm no expert in PFC or DCBX, but guess what: I feel uncomfortable because you are no expert in my field of work. Even a non-expert can make it work. I've set up the following on the Arista side:

interface ethernet 1-4,6/1-28/1,29-36
   priority-flow-control on
   priority-flow-control priority 3 no-drop
   dcbx mode ieee
But this was not enough, as the NICs lacked either the proper firmware, the proper PFC/DCBX flags, or both.
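To see where a given host stands before changing anything, the current firmware and flags can be read back. This is only a minimal sketch, assuming the same device name and tool path as in the commands in this post. `check_nic_state` is a made-up helper name, and the `grep` patterns are just a guess at which lines are interesting.

```shell
#!/bin/sh
# Assumed values, copied from the commands in this post.
DEV="mt4115_pciconf0"
MLX_BIN="/opt/mellanox/bin"

# Hypothetical helper: print firmware info, LLDP/DCBX flags and PFC module parameters.
check_nic_state() {
    if [ ! -x "$MLX_BIN/flint" ] || [ ! -x "$MLX_BIN/mlxconfig" ]; then
        echo "Mellanox tools not found in $MLX_BIN; run this on the ESXi host"
        return 0
    fi
    # Firmware version currently burned on the adapter
    "$MLX_BIN/flint" -d "$DEV" query
    # LLDP/DCBX firmware flags for both ports
    "$MLX_BIN/mlxconfig" -d "$DEV" query | grep LLDP_NB
    # PFC and trust settings of the Ethernet driver module
    esxcli system module parameters list -m nmlx5_core | grep -E 'pfctx|pfcrx|trust_state'
}

check_nic_state
```

Outside an ESXi host it only prints the "not found" message, so it is safe to dry-run.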

Update Mellanox firmware:
/opt/mellanox/bin/flint -d mt4115_pciconf0 -i /fw-ConnectX4-rel-12_28_2006-MCX414A-BCA_Ax-UEFI-14.21.17-FlexBoot-3.6.102.bin burn
Then change the flags on the NIC to match PFC priority 3 and DCBX IEEE (CEE is not supported by vSAN):
esxcli system module parameters set -m nmlx5_core -p "pfctx=0x08 pfcrx=0x08 trust_state=2"
esxcli system module parameters set -m nmlx5_rdma -p "dscp_force=26"
esxcli system module parameters set -a -m nmlx5_rdma -p "pcp_force=-1" # only needed if pcp_force is set to anything other than disabled; -a appends instead of overwriting dscp_force
/opt/mellanox/bin/mlxconfig -d mt4115_pciconf0 set LLDP_NB_DCBX_P1=1 LLDP_NB_TX_MODE_P1=2 LLDP_NB_RX_MODE_P1=2 LLDP_NB_DCBX_P2=1 LLDP_NB_TX_MODE_P2=2 LLDP_NB_RX_MODE_P2=2
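For anyone wondering where the numbers come from: pfctx/pfcrx are bitmasks with one bit per priority, so priority 3 is bit 3. DSCP 26 lands in priority 3 if the top three bits of the DSCP value select the priority. That mapping is an assumption here (a common default); check what your switch and NIC actually use. A small sketch of the arithmetic, with made-up function names:

```shell
#!/bin/sh
# Bitmask with only the bit for one priority (0-7) set, as used by pfctx/pfcrx.
pfc_mask() {
    printf '0x%02x\n' $((1 << $1))
}

# Priority implied by a DSCP value, ASSUMING the mapping priority = DSCP >> 3.
dscp_to_priority() {
    echo $(($1 >> 3))
}

pfc_mask 3            # prints 0x08
dscp_to_priority 26   # prints 3
```

So the 0x08 on the NIC, the DSCP 26 forced on RDMA traffic and the "priority 3 no-drop" on the switch all point at the same priority.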
Not to sound rude, but your “solution” to just search the web is pretty useless when there is a forum where you can ask the so-called “experts” for help. So maybe do us all a favour and do not respond to a question if you don’t have anything of value to add. At least I provided an answer for someone else with the same problem, whereas you provided zero value.
 

AlbertD007

New Member
Jan 1, 2024
6
0
1
Just replying here because I am researching the purchase of a 7060; however, I can't find any definitive evidence that it even supports RoCE or RoCEv2.
I can see the switch mentioned in this RoCE Deployment Guide, but it's not crystal clear whether it supports RoCEv1 and v2: https://www.arista.com/assets/data/pdf/Broadcom-RoCE-Deployment-Guide.pdf

Not sure if you @SWTech68 saw this, maybe helpful?

I personally own a Celestica Seastone DX010, which definitely does support RoCEv2. However, I have also had some challenges making it work, especially since there is even less documentation on getting RoCE to work on a DX010.

Reading this page does make me wonder if it is really supposed to be that easy. I have seen lately that there are a number of NIC-level settings I need to change as well. This might be helpful: https://www.reddit.com/r/vmware/comments/ozhq6j Let me know how you get on. I'm also really excited to get a 7060, and I would like to know what power consumption and noise levels you see with yours.