7050QX-32S DCB (Lossless ethernet)


dante4

Member
Jul 8, 2021
Hello,

I have 3 servers (actually a few more, but they are connected over a LAN segment) and one 7050QX-32S switch (as a SAN switch).
The servers are used as follows:
1st - Ubuntu 22.04 (kernel 6.5) + 6x PM9A3 (1.92 TB) + SPDK (24.01) providing an NVMe-oF target.
2nd and 3rd - ESXi 7.0U3.

All hosts use ConnectX-3 Pro NICs with RoCEv2 enabled and are connected through the 7050QX-32S.

Right now I'm trying to set up lossless ethernet (DCB).

For ESXi I found a reference for ConnectX-4, which I modified for ConnectX-3 Pro like this:

Code:
# PFC on priority 3 only (0x08 = bit 3) for both transmit and receive
esxcli system module parameters set -m nmlx4_en -p "pfctx=0x08 pfcrx=0x08"
# Force RoCE traffic to DSCP 26 / PCP 3
esxcli system module parameters set -m nmlx4_rdma -p "dscp_force=26 pcp_force=3"
# Enable QoS and RoCEv2 in the core driver
esxcli system module parameters set -m nmlx4_core -p "enable_qos=1 enable_rocev2=1"
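
To sanity-check that the values actually took effect after a reboot, the parameters can be listed back (standard esxcli, shown here as a quick check rather than part of the original reference):

Code:
# Lists each module parameter and its currently configured value
esxcli system module parameters list -m nmlx4_en
esxcli system module parameters list -m nmlx4_core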
On the switch side I followed the Mellanox guide for Arista, except for the DCBX/LLDP part, since I don't know how to enable that from the ESXi side:

Code:
platform trident mmu queue profile RoCELosslessProfile
ingress threshold 1/16
egress unicast queue 3 threshold 8
platform trident mmu queue profile RoCELosslessProfile apply
Port config:
Code:
   speed forced 40gfull
   qos trust dscp
   priority-flow-control on
   priority-flow-control priority 3 no-drop
   !
   tx-queue 3
      bandwidth guaranteed 20000000
      random-detect ecn minimum-threshold 150 kbytes maximum-threshold 1500 kbytes max-mark-probability 100 weight 0
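
To check whether PFC actually comes up on the ports, the negotiated state can be inspected from EOS (exact command set and output vary by release, so treat this as a pointer rather than gospel):

Code:
show priority-flow-control
show lldp neighbors detail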

Now I don't know how to configure the same PCP, DSCP and QoS from the Ubuntu side. Can someone help me?
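
The only Linux-side knob I've found so far is the rdma_cm configfs interface; a sketch of what I mean (untested, and assuming the card enumerates as mlx4_0 with port 1 in use):

Code:
# rdma_cm exposes per-port defaults through configfs
modprobe rdma_cm
# Creating the device directory makes the kernel populate ports/1/
mkdir /sys/kernel/config/rdma_cm/mlx4_0
# Use RoCE v2 for connections established through rdma_cm
echo "RoCE v2" > /sys/kernel/config/rdma_cm/mlx4_0/ports/1/default_roce_mode
# ToS = DSCP << 2, so DSCP 26 -> 104 (matching dscp_force=26 on ESXi)
echo 104 > /sys/kernel/config/rdma_cm/mlx4_0/ports/1/default_roce_tos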
 

dante4

Member
Jul 8, 2021
So far my progress is as follows:

For Linux -

Code:
# PFC on priority 3 only (0x08 = bit 3), matching the ESXi side
echo "options mlx4_en pfctx=0x08 pfcrx=0x08" > /etc/modprobe.d/mlx4_en.conf
# Default to RoCE v2 and enable QoS in the core driver
echo "options mlx4_core roce_mode=2 enable_qos=1" > /etc/modprobe.d/mlx4_core.conf
# Creating this configfs directory exposes the per-port rdma_cm settings
mkdir /sys/kernel/config/rdma_cm/mlx4_0
# lldptool ships in the lldpad package on Ubuntu
apt install lldpad
# Advertise PFC on priority 3 to the switch via DCBX
lldptool -T -i ens160 -V PFC enabled=3
lldptool -T -i ens160d1 -V PFC enabled=3
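
One gotcha: lldptool is only a client for the lldpad daemon, so the daemon has to be running and LLDP rx/tx enabled on the interfaces before the PFC TLV gets advertised. A sketch, assuming systemd:

Code:
# lldptool talks to lldpad over a socket; start it at boot
systemctl enable --now lldpad
# Enable LLDP transmit/receive on the RoCE-facing interfaces
lldptool -L -i ens160 adminStatus=rxtx
lldptool -L -i ens160d1 adminStatus=rxtx
# Read back the locally configured PFC TLV
lldptool -t -i ens160 -V PFC -c enabled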
It seems to be working on the Ubuntu side:

Code:
IEEE DCBX is enabled and active
Last LLDPDU received on Tue May 7 22:31:09 2024
- PFC configuration: willing
  capable of bypassing MACsec
  supports PFC on up to 8 traffic classes
  PFC enabled on priorities: 3
No application priority configuration TLV received

but on the ESXi side there is nothing, bruh
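
For the ESXi side, the driver's view of the DCB state can at least be queried (standard esxcli; substitute your actual RoCE uplinks for vmnic2):

Code:
# Shows DCB mode, PFC state and priority map as seen by the nmlx4 driver
esxcli network nic dcb status get -n vmnic2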
 

MountainBofh

Beating my users into submission
Mar 9, 2024
ConnectX-4 Lx cards go on eBay for $30-$35 all day. Maybe worth buying a few?
 

dante4

Member
Jul 8, 2021
ConnectX-4 Lx cards go on eBay for $30-$35 all day. Maybe worth buying a few?
And a 25G switch is $600+, yeah.

And 100G ConnectX-4 cards are more like $200.

Right now I have an Arista DCS-7050QX-32S with 40G ports (modded with Noctua fans to quiet it down), which cost me about $160.
 

dante4

Member
Jul 8, 2021
you can get a MCX455A or CX416A for like 90 bucks and they'll do 40GbE
MCX455A - single port, yeah, great idea to make a bottleneck out of thin air. :)
And the CX416A for $90? Where? The minimum I see is $190.

Yeah, $570 (3 × $190) sounds like an awesome idea for something that costs $45 (3 × $15).

Without any respect: I wasn't really asking how to make ConnectX-4 cards work; there are plenty of guides for those. Saying "your hardware is bad, buy new hardware" is a really easy way to solve any problem, yeah. You just need an unlimited budget.
 

i386

Well-Known Member
Mar 18, 2016
Germany
People are trying to help as well as they can with the information provided...
If dual port is a hard requirement, or the hardware is fixed and there is no budget to change anything, write that in the OP or mention it somewhere, and people won't make "rude" posts/suggestions.
 

fohdeesha

Kaini Industries
Nov 20, 2016
fohdeesha.com
MCX455A - single port, yeah, great idea to make a bottleneck out of thin air. :)
Nowhere in this thread did you specify you needed two ports per client

And the CX416A for $90? Where? The minimum I see is $190.

Yeah, $570 (3 × $190) sounds like an awesome idea for something that costs $45 (3 × $15).
Working DCBx costs more than 15 dollars per client. I'm really glad we could help you on this journey of discovery :)
 

dante4

Member
Jul 8, 2021
I have had many issues with connectx3 and rdma that were simply resolved by moving to connectx4. I have picked up plenty of 50G cards for 50 USD several times over the last few months.
Most likely because it was a ConnectX-3 non-Pro, since the non-Pro only supports RoCE v1, which is obsolete. Good for you; I wasn't able to find QSFP 50G cards for 50 USD.

People are trying to help as well as they can with the information provided...
No, they are not. All I have read from fohdeesha amounts to "just buy a newer NIC". Not a single piece of his advice was helpful.

Also, what information exactly is required to set up DCBX on ESXi and Ubuntu, beyond the NIC model, the OS name and the OS version?

mention it somewhere
How should I put it? Let's say you have 2 water filters connected to your water system, and you use both of them - otherwise, why would you have 2? But they got clogged.
You go to a forum and ask "how do I clean them?", and the answer is "just buy one filter, but a newer one, lol". Not really helpful, don't you think?
 

dante4

Member
Jul 8, 2021
Nowhere in this thread did you specify you needed two ports per client
Welp, degrading the speed of the system is always a great idea. I thought it was basic logic that if you propose buying something, it should be at least no worse than the current solution.
 

mach3.2

Active Member
Feb 7, 2022
If telling you "a car won't fly like a plane" is considered "unhelpful", then I guess they should all just move on and leave you to your own devices.

And didn't fohdeesha link you a 40GbE CX4 for 90 bucks in the same post you quoted? Not as good as $15/card, but it's definitely ahead of $200/card...
 

klui

Well-Known Member
Feb 3, 2019
MCX455A - single port, yeah, great idea to make a bottleneck out of thin air. :)
CX4s are PCIe Gen 3, and the maximum bandwidth available is only ~126 Gb/s. Having 2 ports will only give you failover, not the ability to run both at full rate at the same time.
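
(The 126 Gb figure: PCIe 3.0 runs at 8 GT/s per lane with 128b/130b encoding, roughly 985 MB/s usable per lane, so an x16 slot tops out around 15.75 GB/s ≈ 126 Gb/s.)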

This thread reminds me of
  • Good
  • Fast
  • Cheap
Pick any 2.
 

Mithril

Active Member
Sep 13, 2019
CX4s are PCIe Gen 3, and the maximum bandwidth available is only ~126 Gb/s. Having 2 ports will only give you failover, not the ability to run both at full rate at the same time.

This thread reminds me of
  • Good
  • Fast
  • Cheap
Pick any 2.
My friend, the MCX416A-CCAT is a Gen3 x16 card.

I've read the thread and I'm completely missing where the OP managed to even imply, much less spell out, that dual port was needed; the best I could infer was a reasonable guess of "at least as much bandwidth as currently". And it sounds like people are saying "those features simply do not work on the older cards", which would suck for the OP, but if that is the case, that's how things are.

The OP also seems to be trying to A) get ESXi to do what they want, B) get ESXi to play nice with "older" hardware, and C) it is being implied that dual ports are needed for bandwidth, not failover, between switch and clients.
In my experience these are all ways to go through a Costco-sized bottle of headache pills in a weekend...
 

mach3.2

Active Member
Feb 7, 2022
B) get ESXi to play nice with "older" hardware.
lol.

I digress; SR-IOV doesn't even work with the native mlx4 drivers on ESXi 6.7/7, despite working on the old Linux-based drivers. The writing has long been on the wall for the CX3 era of cards, especially if you're working with vSphere.