[SOLVED] Slow speeds between two ConnectX-2 machines

Discussion in 'Networking' started by rubylaser, Mar 1, 2016.

  1. rubylaser

    rubylaser Active Member

    Joined:
    Jan 4, 2013
    Messages:
    839
    Likes Received:
    224
    Hello,

    I have two machines connected to each other with Mellanox ConnectX-2 Ethernet cards, via a 10m passive twinax cable (I need this length to reach from my office on the ground floor to my server in the basement). The basement server runs Ubuntu 14.04.4 Server on an i5-4590 with 32GB of RAM, and the office workstation runs Windows 10 Professional 64-bit on an e5-2670 v1 with 64GB of RAM. Here are a few tests with iperf3 between the two machines (I tried turning the Windows firewall off temporarily to see if that helped, but it made no difference).

    [Image: iperf3 result screenshot]
    [Image: iperf3 result screenshot]

    As you can see, it is SLOW (the bottom picture is the best it has performed; sometimes it's in the single-digit MB/s). If I don't use iperf and instead pull a big ISO from my server over Samba, the gigabit connection is flat and completely saturated, while the 10GbE link only manages about 35MB/s and is very sporadic.
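    For anyone who wants to reproduce the test, the runs were basically this (10.0.0.1 is a placeholder for the basement server's IP):

    ```shell
    # On the Ubuntu box (server side):
    iperf3 -s

    # On the Windows box (client side):
    iperf3 -c 10.0.0.1
    iperf3 -c 10.0.0.1 -R    # -R reverses the test: server sends, client receives
    ```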

    I'm new to this, so I'm leaning toward this being the fault of the LONG passive SFP+ cable. Please let me know how I should troubleshoot this, or what I should replace the cable with (is an active cable enough, or do I need to try fiber? If so, what do I buy?).

    Thanks!
     
    #1
  2. Stereodude

    Stereodude Active Member

    Joined:
    Feb 21, 2016
    Messages:
    372
    Likes Received:
    54
    Do you have them in PCIe slots with an x8 electrical connection? I've read they don't work well with less.
     
    #2
    rubylaser likes this.
  3. rubylaser

    rubylaser Active Member

    Joined:
    Jan 4, 2013
    Messages:
    839
    Likes Received:
    224
    Thanks for the idea, but yes: one is in a PCIe x16 slot on the i5-4590 and one is in a PCIe x8 slot on the e5-2670 system, so that shouldn't be the issue.

    *Edit: I just looked, and my i5-4590 system is on an ASRock B85M Pro4, whose second PCIe x16 slot only runs at x4. Luckily, I have a 2P e5-2670 build on the way to replace this system, so maybe that will solve the problem.

    I assumed this was caused by the long passive cable, as I read after the fact that for runs longer than 6 meters it's better to go with fiber or an active cable. Can anyone else weigh in on the cable issue while I wait until Thursday for my new system to show up?
     
    #3
    Last edited: Mar 1, 2016
  4. Rain

    Rain Active Member

    Joined:
    May 13, 2013
    Messages:
    206
    Likes Received:
    67
    Did you install the Mellanox drivers, or are you using the drivers built into Win10? Install WinOF from this page if you haven't: http://www.mellanox.com/page/products_dyn?product_family=32 (WinOF > 5.10 > Windows Client > 10)

    When I was testing ConnectX-2 cards with Win10 I saw very similar performance to yours prior to installing drivers.
     
    #4
    rubylaser likes this.
  5. rubylaser

    rubylaser Active Member

    Joined:
    Jan 4, 2013
    Messages:
    839
    Likes Received:
    224
    I'm using the 4.91 drivers that are built into Windows 10. Thanks, I'll try those tonight :)
     
    #5
    Last edited: Mar 1, 2016
  6. rubylaser

    rubylaser Active Member

    Joined:
    Jan 4, 2013
    Messages:
    839
    Likes Received:
    224
    Update: I installed the new driver and my iperf speeds are now more like 3+ Gb/s. I just tried a transfer via Samba from my server and saw 265 MB/s, so that improved things a lot. I hope it will improve further once I can get the card into an x8 electrical PCIe slot, because I'm still at only about 1/3 of what I should see via iperf. If anyone has more ideas, or if the SFP+ cable may also be a contributor, please let me know (also, what's the best fix?).

    The weird thing is that with iperf3, if I make the Linux box (the i5-4590) the server, I see the 3+ Gb/s speeds. If the Windows machine (the e5-2670 v1) is the server, I only get 692 Mb/s. The Windows box as the server is consistently MUCH slower than the other direction?!

    Thanks!
     
    #6
    Last edited: Mar 1, 2016
    Rain likes this.
  7. Rain

    Rain Active Member

    Joined:
    May 13, 2013
    Messages:
    206
    Likes Received:
    67
    Tweak the Receive & Transmit Buffers on the Windows machine (Linux should be fine): right-click the adapter in Network and Sharing Center > Properties > Configure... > ... and set them to the maximum. Jumbo Frames shouldn't be necessary to max out a 10Gbps iperf, so don't worry about that yet.
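    If you'd rather script it than click through the GUI, something like this should work in an elevated PowerShell; the adapter name and the exact display names vary by driver version, so list them first:

    ```powershell
    # See which tunables the driver exposes (names vary by driver version)
    Get-NetAdapterAdvancedProperty -Name "Ethernet 2"

    # Then push the buffer sizes up (adapter name and values are examples)
    Set-NetAdapterAdvancedProperty -Name "Ethernet 2" -DisplayName "Receive Buffers" -DisplayValue 4096
    Set-NetAdapterAdvancedProperty -Name "Ethernet 2" -DisplayName "Transmit Buffers" -DisplayValue 4096
    ```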

    There is also a way to tell the Mellanox drivers your expected workload (or something like that, I forgot what it's called and I don't have these cards plugged into Windows machines anymore -- someone else can chime in) in the network card configuration as well. Toy around with that too. If I'm recalling correctly, I simply set mine to "single port" or something similar to that.
     
    #7
    rubylaser likes this.
  8. PigLover

    PigLover Moderator

    Joined:
    Jan 26, 2011
    Messages:
    2,659
    Likes Received:
    1,041
    I've done more than just "hear" this: first-hand testing. The Mellanox cards (CX2/CX3) get REALLY unhappy in slots with less than x8, even though a PCIe 2.0+ x4 slot should support a single 10GbE port with no issues. On x4 electrical my experience is similar to what @rubylaser is seeing: iperf at 2-3 Gbps max.

    When you get the cards on x8 slots my money says you will see iperf at 9.5+ Gbps without much tuning required.
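    As a back-of-the-envelope check (assuming 8b/10b encoding and ignoring protocol overhead, so each PCIe 2.0 lane carries roughly 4 Gbit/s of payload):

    ```shell
    # PCIe 2.0 runs at 5 GT/s per lane; 8b/10b encoding leaves ~4 Gbit/s usable
    per_lane_gbit=4
    for lanes in 4 8; do
      echo "x${lanes}: $((per_lane_gbit * lanes)) Gbit/s usable"
    done
    # -> x4: 16 Gbit/s usable
    # -> x8: 32 Gbit/s usable
    ```

    So x4 gives 16 Gbit/s on paper, comfortably above 10GbE; the poor behavior at x4 looks like a card/driver quirk rather than a raw bandwidth limit.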
     
    #8
    rubylaser likes this.
  9. Stereodude

    Stereodude Active Member

    Joined:
    Feb 21, 2016
    Messages:
    372
    Likes Received:
    54
    According to the Mellanox documentation the CX3 on PCIe 3.0 only "needs" (uses) 4 lanes. On 2.0 it "needs" (uses) 8. That's one of the reasons why I bought one. I haven't used it or tested it yet though.
     
    #9
    Last edited: Mar 1, 2016
    rubylaser likes this.
  10. PigLover

    PigLover Moderator

    Joined:
    Jan 26, 2011
    Messages:
    2,659
    Likes Received:
    1,041
    You may be right; I never tested a CX3 in a PCIe 3.0 x4 slot, only PCIe 2.0. In any case, the OP's motherboard also appears to be PCIe 2.0 x4 (in an x16 slot).
     
    #10
  11. rubylaser

    rubylaser Active Member

    Joined:
    Jan 4, 2013
    Messages:
    839
    Likes Received:
    224
    Thanks for the input. I'll hold off on tweaking anything further until the new parts show up on Thursday. Then, I'll have two 2011 systems with multiple PCIe x8 slots on both sides :) thanks for all the help everyone!
     
    #11
  12. BackupProphet

    BackupProphet Active Member

    Joined:
    Jul 2, 2014
    Messages:
    688
    Likes Received:
    235
    I've tried these cards on Windows and get similar performance, somewhere between 1 Gbps and 4 Gbps. It varies a lot; file transfer seems to be faster than iperf.
    With Linux or FreeBSD the story is different: max performance, no tuning needed at all.
     
    #12
  13. izx

    izx Active Member

    Joined:
    Jan 17, 2016
    Messages:
    113
    Likes Received:
    36
    IIRC, the "official" limit for passive DAC is 7m @ 24 AWG. 10m @ 24 AWG should be OK, but I don't see any specs for the Belkin cable. Perhaps consider this original Molex cable for not much more? An active DAC will be OK at 10m; fiber isn't necessary.
     
    #13
  14. rubylaser

    rubylaser Active Member

    Joined:
    Jan 4, 2013
    Messages:
    839
    Likes Received:
    224
    Thanks for the advice, but I don't see a link. I'm interested, so please supply it.
     
    #14
  15. izx

    izx Active Member

    Joined:
    Jan 17, 2016
    Messages:
    113
    Likes Received:
    36
    #15
  16. ehfortin

    ehfortin Member

    Joined:
    Nov 1, 2015
    Messages:
    56
    Likes Received:
    5
    I've done similar testing with the same ConnectX-2 in an HP ML310e in an appropriate PCIe slot, and I'm also seeing about 3.25 Gbps from an OmniOS VM to an OmniOS VM on another identical server. Everything is under ESXi 6 with the VMXNET3 NIC. I tried Ubuntu VM to Ubuntu VM and got the same performance. I haven't tried physical-to-physical yet, as I'd have to shut down my VMware cluster and boot a live Linux that has iperf to do the testing. That's my next test.

    I have a 1m DAC and I'm going through a 10 Gbps switch. I tried removing the switch and plugging back-to-back and got the same results, so for now the switch doesn't seem to be the bottleneck.

    I'll report once the physical testing is done. However, if there is any tweaking known to help with VMware, I'm all ears.
     
    #16
  17. groove

    groove Member

    Joined:
    Sep 21, 2011
    Messages:
    46
    Likes Received:
    3
    I'm getting similar performance numbers on ConnectX-2, even on 40 Gbps InfiniBand. This is between VMware and Solaris 11.3. I even tried between Solaris 11.3 and Windows 2012 R2 and saw similar results.
     
    #17
  18. ehfortin

    ehfortin Member

    Joined:
    Nov 1, 2015
    Messages:
    56
    Likes Received:
    5
    I suspected I might have an issue with PCIe speed, as it was reported that 3.25 Gbps usually means the PCIe slot you are using is below the minimum speed required for 10 Gbps. I downloaded the quickspecs and read them again. At the beginning of the document they tell you there are PCIe 2.0 x4, PCIe 2.0 x8, PCIe 3.0 x8, and PCIe 3.0 x16 slots. Well, that refers to the connector width. Looking for the bus width, I discovered that one of the PCIe 2.0 x8 slots is actually x1 (same for the x4). So I moved the NIC to the PCIe 3.0 x16 slot (the x8 was already used by a SAS HBA) and finally got 10 Gbps between the two hosts, each running an OmniOS VM for the test.

    So make sure you check the bus width (or bus speed) in the documentation, as the connector width may fool you into using the wrong PCIe slot.
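    On a Linux host you can also skip the documentation entirely and ask the kernel what was actually negotiated (the PCI address below is an example; find yours with the first command):

    ```shell
    # List Mellanox devices (15b3 is Mellanox's PCI vendor ID) to find the card's address
    lspci -d 15b3:

    # Ask the kernel for the negotiated link speed/width (example address)
    cat /sys/bus/pci/devices/0000:03:00.0/current_link_speed
    cat /sys/bus/pci/devices/0000:03:00.0/current_link_width
    # If an x8-capable card reports "4" (or less) here, the slot is wired
    # narrower than its connector suggests.
    ```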

    Thanks to @PigLover and @Stereodude for pointing in the right direction.
     
    #18
  19. Rain

    Rain Active Member

    Joined:
    May 13, 2013
    Messages:
    206
    Likes Received:
    67
    I too experienced poor performance with InfiniBand cards. I believe IPoIB performs worse when your single-core performance isn't very good, especially in VMware. When doing vMotions and other network-intensive tasks (copies) over IPoIB, one CPU core/thread on the ESXi box would get pegged and performance wasn't that good.

    I assume this is because IPoIB isn't hardware-offloaded in the same way (the CPU has to handle the entire IP stack). I'm sure it performs great with sufficiently fast CPUs, but for lower-power parts, 10GbE/40GbE cards that support hardware offloading are a must.
     
    #19
  20. rubylaser

    rubylaser Active Member

    Joined:
    Jan 4, 2013
    Messages:
    839
    Likes Received:
    224
    Well, something went horribly wrong. I just set up my second 2670 system and put the other card in an x8 slot, and the iperf speeds between my hosts are now about 3.5 MB/s (terrible). This has really been a failed experiment so far :(
     
    #20