Mellanox ConnectX-2 NVMeoF


Ryan Knight

New Member
Feb 12, 2018
Hello all,

I've been a bit of a lurker here for the past few years, but I finally decided I had a question worth posing. Forgive me if this is in the wrong section, as it's a bit of a hardware question as well as a software one.

To set the stage: I have two 40Gbps Mellanox ConnectX-2 cards. One is in my main workstation, and the other is in my Dell R820, which houses a pair of Samsung 960 EVOs in RAID 0 (yes, I know this is not redundant; it's only being used as a temporary cache). I've gone through the entire setup, flashed the cards with the appropriate firmware, and have them talking to each other. I've set up IPoIB and have gotten good results testing the speed of the remote connection: well above what I'd get from SATA 3 over a gigabit Ethernet connection, but nowhere near as fast as what I see locally on the R820. To give a quick example, over the IPoIB connection I get 2802.6MB/s reads and 3044.3MB/s writes; locally I hit 5244.5MB/s and 3785.0MB/s respectively.

So the results are pretty much what I would have expected over the 10Gbps IPoIB link, which is a relatively simple setup. My question is this: since I'm using IPoIB, I'm not getting the full benefit of having a 40Gbps card, and I've recently been reading a bit about NVMeoF. Has anyone here had experience with setting it up? I know another method would be iSCSI over RDMA, but I have no experience with that either.
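From the reading I've done so far, on a Linux target the inbox nvmet driver is configured through configfs; a rough sketch of exporting one namespace over RDMA might look something like this (the NQN, IP, and backing device are just placeholders, it assumes the nvmet and nvmet-rdma modules are loaded, and I haven't actually tried it yet):

```python
#!/usr/bin/env python3
"""Sketch of an NVMe-oF target over RDMA via the kernel nvmet configfs interface."""
from pathlib import Path
import os

NVMET = Path("/sys/kernel/config/nvmet")
NQN = "nqn.2018-02.lab:960evo-cache"   # placeholder subsystem name
DEVICE = "/dev/md0"                    # placeholder for the RAID-0 of the two 960 EVOs
TARGET_IP = "10.0.0.1"                 # placeholder IPoIB address on the R820

# 1. Create the subsystem and allow any host to connect (fine on a closed lab network).
subsys = NVMET / "subsystems" / NQN
subsys.mkdir()
(subsys / "attr_allow_any_host").write_text("1")

# 2. Add namespace 1, backed by the local block device, and enable it.
ns = subsys / "namespaces" / "1"
ns.mkdir()
(ns / "device_path").write_text(DEVICE)
(ns / "enable").write_text("1")

# 3. Create an RDMA port on the IPoIB address, standard NVMe-oF service 4420.
port = NVMET / "ports" / "1"
port.mkdir()
(port / "addr_traddr").write_text(TARGET_IP)
(port / "addr_trtype").write_text("rdma")
(port / "addr_trsvcid").write_text("4420")
(port / "addr_adrfam").write_text("ipv4")

# 4. Expose the subsystem on that port.
os.symlink(subsys, port / "subsystems" / NQN)

# On the initiator (with nvme-cli) it would then be roughly:
#   nvme connect -t rdma -a 10.0.0.1 -s 4420 -n nqn.2018-02.lab:960evo-cache
```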

Hope to hear from the community soon! Thanks!

Ryan
 

i386

Well-Known Member
Mar 18, 2016
Germany
QDR = 40Gbit/s signaling = 32Gbit/s throughput (after the ~20% 8b/10b encoding overhead) = 4GByte/s theoretical, less in practice.
You are bottlenecked by the QDR InfiniBand link speed. Overcoming this limit requires different NICs (40GbE, FDR InfiniBand, or faster networking).
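Worked out as a rough back-of-the-envelope (treating 1MB/s as 10^6 bytes/s), which also shows the OP's numbers are already well past 10Gbit:

```python
# QDR signals at 40 Gbit/s, but 8b/10b encoding means only 80% of that carries data.
signal_gbit = 40
data_gbit = signal_gbit * 0.8        # 32 Gbit/s of actual data on the wire
data_gbyte = data_gbit / 8           # 4.0 GByte/s theoretical ceiling

# The OP's remote results, converted the other way (1 MB/s taken as 10**6 bytes/s):
read_gbit = 2802.6 * 8 / 1000        # ~22.4 Gbit/s
write_gbit = 3044.3 * 8 / 1000       # ~24.4 Gbit/s

print(f"link ceiling: {data_gbyte:.1f} GB/s ({data_gbit:.0f} Gbit/s), "
      f"reads {read_gbit:.1f} Gbit/s, writes {write_gbit:.1f} Gbit/s")
```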
 

Ryan Knight

New Member
Feb 12, 2018
QDR = 40Gbit/s signaling = 32Gbit/s throughput (after the ~20% 8b/10b encoding overhead) = 4GByte/s theoretical, less in practice.
You are bottlenecked by the QDR InfiniBand link speed. Overcoming this limit requires different NICs (40GbE, FDR InfiniBand, or faster networking).
Hello i386,

Thanks for the quick response. So if I'm understanding correctly, my connection is in fact running at 40Gbps and not the 10Gbps I thought IPoIB limited me to? Since the cards I have are dual-port, is there a way of bonding the two ports to achieve 80Gbps of overall throughput?

All the best,

Ryan
 

_alex

Active Member
Jan 28, 2016
Bavaria / Germany
You would be limited by PCIe 2.0 with the CX2 in the first place then.
With multiple HCAs you could export one SSD per HCA and then RAID-0 them on the initiator side. IMHO, SRP transport via SCST is always worth considering over anything Ethernet-based.
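A minimal sketch of the initiator side of that idea, assuming both SRP-exported SSDs already show up as local block devices (the /dev names below are placeholders):

```python
#!/usr/bin/env python3
"""Stripe two network-attached SSDs into a single md RAID-0 on the initiator."""
import subprocess

# Placeholders: whatever the two SRP-exported SSDs enumerate as on the initiator.
SRP_DISKS = ["/dev/sdb", "/dev/sdc"]

subprocess.run(
    ["mdadm", "--create", "/dev/md0",
     "--level=0",                            # RAID-0 (striping) across both remote SSDs
     f"--raid-devices={len(SRP_DISKS)}",
     *SRP_DISKS],
    check=True,
)
```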
 

trippehh

New Member
Oct 29, 2015
2.8-3GB/s (~24Gbps) sounds about right for this generation of cards that use PCIe 2.0 with 8 lanes.
The ConnectX-3 cards top out at about 6GB/s (~50Gbps) with both 40G ports pushing data using PCIe 3.0 with 8 lanes.

PCIe 4.0/5.0 can't come soon enough :)
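For reference, a rough sketch of the PCIe arithmetic behind those figures (theoretical per-direction ceilings; real throughput is lower once TLP headers and flow control are counted):

```python
# Rough per-direction PCIe ceilings for an x8 slot (before protocol overhead):
lanes = 8
pcie2_gbyte = 5 * (8 / 10) * lanes / 8     # PCIe 2.0: 5 GT/s/lane, 8b/10b    -> 4.0 GB/s
pcie3_gbyte = 8 * (128 / 130) * lanes / 8  # PCIe 3.0: 8 GT/s/lane, 128b/130b -> ~7.9 GB/s

# TLP headers and flow control eat into that, so roughly 3.2 GB/s (2.0 x8) and
# 6 GB/s (3.0 x8) are realistic upper bounds in practice.
print(f"PCIe 2.0 x8: {pcie2_gbyte:.1f} GB/s, PCIe 3.0 x8: {pcie3_gbyte:.1f} GB/s theoretical")
```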
 

Ryan Knight

New Member
Feb 12, 2018
Interesting. I was under the impression that this card was capable of 40Gbps speeds, of course a bit lower with overhead. So why market them at that speed if they can't hit it due to the host interface?
 

trippehh

New Member
Oct 29, 2015
Technically the cards tend to handle their rated speeds fine (mostly!); it just depends on your use case -- not all use cases require you to copy all the data over PCIe. I have not looked that closely at the marketing materials, but they are probably in the clear if they only advertise link speeds or attach some disclaimers.

You can, for example, handle DDoS traffic in excess of the PCIe bandwidth by enabling ASIC-level packet filtering. Or monitor video feeds by reading only the parts of the streams that are required to tell whether a feed is fine; I know someone doing well in excess of 100Gbps this way. Or do random sampling for whatever other monitoring purpose.
 

i386

Well-Known Member
Mar 18, 2016
Germany
Interesting. I was under the impression that this card was capable of 40Gbps speeds, of course a bit lower with overhead. So why market them at that speed if they can't hit it due to the host interface?
You mean the dual-port config? It's used for high availability/redundancy in case a switch dies, not for maximizing throughput.
 

trippehh

New Member
Oct 29, 2015
You mean the dual-port config? It's used for high availability/redundancy in case a switch dies, not for maximizing throughput.
The PCIe 2.0 ConnectX-2 cards don't even have enough PCIe bandwidth to saturate a single port. Still beats using 10G ports. ;)
 

i386

Well-Known Member
Mar 18, 2016
Germany
oO
For one port it should be enough, and the values you've posted in the OP look like what I would expect.
 

chief

New Member
Mar 9, 2023
Hi,
Sure, you can set up round-robin across the two ports with ib_srp.
Please describe your target and initiator setup and I'll try to help you.