Mellanox ConnectX-3 with fiber requires physical transceiver replug to reconnect


unphased

Active Member
Jun 9, 2022
148
26
28
I have the following equipment:
My goal is to enable high-speed network access to the storage and compute resources on my workstation from different locations in the house: in particular, one fiber terminating at a desk for use with the MacBook, and one terminating at the 4K OLED TV for use with the Windows box.

So far I have had great success with connectivity and speed using this very cheap set of fiber transceivers and fiber. Last year it worked with Windows (I could pack the MCX353A, an ultra-low-profile card, with its bracket removed, into the SFF enclosure. When I'm not using it, the x4 PCIe slot, itself adapted from an M.2 slot, adapts back into an M.2 slot), and more recently with macOS on my Apple Silicon MacBook, achieving 26 and 20 Gbps respectively. That will be good enough to satisfy me for the better part of a decade, and I don't expect other technologies to threaten that level of performance until Wi-Fi 8 and likely beyond.

However! One very significant hurdle remains. When the connection from Windows or macOS goes down for long enough (I don't have a handle on exactly how long, but it feels like somewhere between 30 minutes and 5 hours), the fiber link cannot come back up until I do one very specific thing: replug the transceiver on the Linux side of the connection. Yes. Replug the transceiver. The following do NOT restore the connection:

- rebooting the other computer
- replugging transceiver in the other computer
- replugging the fiber into the transceiver in the other computer
- replugging the fiber into the transceiver in the linux computer
- rebooting the linux computer
- disabling the PCIe device and bringing it back with a rescan via sysfs (echo 1 > /sys/bus/pci/devices/0000:0e:00.0/remove; sleep 1; echo 1 > /sys/bus/pci/rescan as root)
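
For anyone wanting to reproduce the last item, the remove-and-rescan attempt can be wrapped in a small function. This is only a sketch of the step that did NOT help here. The `pci_reset` name and the `SYSFS_ROOT` override are illustrative assumptions (the override just allows a dry run against a scratch directory), and the PCI address is the one from the list above.

```shell
#!/bin/sh
# Sketch: remove a PCI device and rescan the bus (needs root on a real system).
# SYSFS_ROOT is an illustrative override for dry runs, not a kernel convention.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

pci_reset() {
    bdf="$1"                                   # e.g. 0000:0e:00.0
    dev="$SYSFS_ROOT/bus/pci/devices/$bdf"
    if [ ! -e "$dev/remove" ]; then
        echo "no such PCI device: $bdf" >&2
        return 1
    fi
    echo 1 > "$dev/remove"                     # detach the device from the kernel
    sleep 1
    echo 1 > "$SYSFS_ROOT/bus/pci/rescan"      # ask the kernel to rediscover devices
}

# Usage (as root): pci_reset 0000:0e:00.0
```

Note that the sysfs directory is `devices` (plural); a write to a misspelled path fails with "No such file or directory" rather than resetting anything.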

As you might imagine, I cannot justify all the effort of setting this up if I have to go back downstairs every day and reach under the desk to unplug and replug the transceiver. Nor does it make sense to build a solenoid or servo contraption to automate that. It would work, it would be doable (there is that nice pull mechanism; doing it with RJ-45 would be impractically difficult by comparison...) and it would be really frickin' cool, but it would certainly wear out the mechanism after some time. With the transceiver and the CX3 card both in the neighborhood of $30 I might convince myself that I could afford such wear and tear, so I will reserve this as a last resort. But I WILL GO TO THOSE LENGTHS IF NECESSARY.

Does anyone know how to make progress on this issue?

- Are the mlx4 drivers closed source?
- Do we think the transceiver might be to blame for this weird "power saving" behavior? How might I confirm this?
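
One hedged way to start on the second question: `ethtool -m` dumps a module's EEPROM and, where the module and driver expose them, its diagnostic readings. Taking one snapshot while the link works and another while it is stuck, then diffing the two, might show whether the transceiver's reported state changes. Whether the mlx4 driver and these particular transceivers report anything useful is an assumption; the `snapshot_link` name, the `ETHTOOL` override and the interface name in the usage line are illustrative.

```shell
#!/bin/sh
# Sketch: save link settings and transceiver module data for later comparison.
# ETHTOOL is an illustrative override; it defaults to the real ethtool binary.
ETHTOOL="${ETHTOOL:-ethtool}"

snapshot_link() {
    ifname="$1"      # interface name (placeholder: enp40s0f0)
    out="$2"         # file to write the snapshot to
    {
        echo "== link settings =="
        "$ETHTOOL" "$ifname" 2>&1 || echo "(ethtool failed)"
        echo "== module EEPROM / diagnostics =="
        "$ETHTOOL" -m "$ifname" 2>&1 || echo "(ethtool -m failed or unsupported)"
    } > "$out"
}

# Usage (as root):
#   snapshot_link enp40s0f0 good.txt     # while the link works
#   snapshot_link enp40s0f0 stuck.txt    # while the link refuses to come up
#   diff good.txt stuck.txt
```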

I will do some further testing to compare the fiber connection, on which I reliably encounter this problem, against a DAC. In such a test the transceiver and optical fiber are the only parts that change. If I do not see this behavior with a DAC (e.g. the connection comes right back up after being left offline overnight), that would give me more confidence to explore other fiber transceiver combinations.

- If I place the CX4 card in the Linux box and use that instead, behavior would likely be different, since the mlx5 driver would be used instead of mlx4 (for CX3). I'm not sure that would test anything other than possibly confirming that I could get more CX4 cards for the Linux side. I'd rather not if CX3 cards suffice, as I have so many already. It won't hurt, but I don't need to spend $150 on dual-port CX4 cards yet. I will consider it if it is the only way to resolve this problem, though, and it seems it would still be cheaper than switching fiber tech from MPO to LC.

So far it would appear that the most appealing avenues to explore are:

- other MPO transceivers???
- switch to CX4 on the Linux side. I can get by with a single dual-port CX4 card to wire up my Mac and Windows connections. For other Linux-to-Linux server connections I can get by fine on CX3s if they run 24/7.
- possibly give up: run the Windows box 24/7 and, as a workaround, plug the CX4 USB-C connection into the iMac at the same desk when it is not in use with the MacBook. I am testing this now too and it does work, but it causes the iMac to disable all USB devices for "drawing too much power". Nothing is failing or shut down yet, but it does indicate something is not great about power delivery when connected this way. I could also use my iPad to keep the fiber connection alive, but again that won't work if I want to bring both the iPad and the MacBook around the house... Oh nice, that brought me to one more idea: I can also bring my Steam Deck into this rotation and use it to keep the connection alive. Haha. (Edit: Tested and yeah, no. The Steam Deck has no Thunderbolt support; it just charges from that cable.)
 

klui

Well-Known Member
Feb 3, 2019
844
463
63
Looks like this is one of the things one pays for not using a switch.

Check out the logs on the Ubuntu machine when you physically unplug and replug the transceiver, and compare them to the logs when you do it using sysfs. If they are the same then there is a bug in the mlx4 driver, since it happens on Windows or macOS clients. I would try it on a CX4 to see if that fixes it. You don't need to buy 100G; one of those 40G PCIe x8 cards will do since they're coming down in price, $40 for single-port.
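
A minimal sketch of that log comparison, assuming the driver's kernel messages contain the string "mlx4" and that `dmesg` is readable (on many systems this needs root). The function names and file names are illustrative.

```shell
#!/bin/sh
# Sketch: capture mlx4 kernel messages for two events and compare them.

# Drop the leading "[ 1234.567890] " dmesg timestamp so that captures taken
# at different times can be diffed line by line.
strip_ts() {
    sed -e 's/^\[[ 0-9.]*] *//'
}

# Save the current mlx4-related kernel messages, timestamps removed.
capture_mlx4() {
    dmesg | grep -i 'mlx4' | strip_ts > "$1"
}

# Usage (as root):
#   dmesg -C                      # clear the ring buffer
#   ...physically replug the transceiver...
#   capture_mlx4 replug.log
#   dmesg -C
#   ...do the sysfs remove/rescan...
#   capture_mlx4 rescan.log
#   diff replug.log rescan.log
```

Lines that appear only in the physical-replug capture would be the place to look for whatever the sysfs path skips.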

You can also try connecting the Windows and macOS machines to each other and see if either trips up the other. Did you try powering off the Linux box instead of rebooting? If that works, and if the box supports AMT/IPMI, you can power cycle the host remotely.

You should get some other transceivers, say Mellanox, and another short MPO patch cord and isolate potentially bad/marginal parts.
 

unphased

Yeah a switch would keep the connection alive and it stands to reason that a device like this was designed from the get-go expecting to interface to a switch.

I did a bit of research last year into 40Gb switches, but it seemed like I would have to do some time-consuming, careful shopping for suitable, possibly rare, units that are amenable to modification to reduce their noise profile. I was reading some topics in which folks were able to hack 140mm fans and such into their enclosures and keep them at sane noise levels. Given the need for that kind of cooling, I expect such devices to draw a significant amount of power.

So what I had been thinking for implementing a switch is: why not just use a PC enclosure and stack several dual-port CX3 cards in it? I should be able to use a consumer platform with 4 PCIe lanes to each card to host 6 cards for 12 QSFP+ ports, while keeping a great deal of flexibility on pretty much all parameters. That is a few hundred bucks for a "switch" that can handle a lot more than switching. I could put a bunch of CX3 and some CX4 100Gb cards inside (though those would use up more lanes), and it might even perform better than a real switch, especially if I load it up with RAM.

In fact I have a TR 1950X system with 96GB of ECC UDIMMs in it collecting dust that may be ideal for this sort of thing, and I could fill the rest of the lanes with NVMe. I just don't like its >100W idle power consumption. And even though ~$70 per 100Gb port is a bit pricey at this time, I sure as hell won't be able to get any real 100Gb switch for cheaper, and I don't need to actually implement that until I have a need for it.

I guess the issue is just that I'll only ever have either 3 or 4 devices to connect at high speed in the house, they can be connected effectively with direct connections, and a hundred watts consumed by a switch just seems wasteful.
 

unphased

Power cycling the Linux box is out of the question, I think, because I'm trying to reduce unnecessary power cycles on the 12 rust spinners inside. If a soft reboot doesn't address the issue then that is a bit of a dead end. I wouldn't be satisfied with having to reboot daily (or more often) for this reason either.
 

i386

Well-Known Member
Mar 18, 2016
4,250
1,548
113
34
Germany
Are the card's ports in VPI/auto mode?
If yes, what happens when you set them to Ethernet only?
 

unphased

These CX3 cards seem to reset to VPI mode. At any rate, I tried this and it did not improve behavior.
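
For reference, a hedged sketch of inspecting and setting the port type on the Linux side. The mlx4 driver exposes an `mlx4_portN` attribute in the device's sysfs directory that accepts `eth`, `ib` or `auto`; a value written there is a runtime setting only, which might be one reason the ports appear to fall back to VPI. The function names, the `SYSFS_ROOT` override and the PCI address are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: inspect or set the runtime port type of an mlx4 (ConnectX-3) port.
# SYSFS_ROOT is an illustrative override for dry runs.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the current type of port N ($2) on the device at PCI address $1.
get_port_type() {
    cat "$SYSFS_ROOT/bus/pci/devices/$1/mlx4_port$2"
}

# Set port N ($2) on device $1 to "eth", "ib" or "auto" (needs root).
set_port_type() {
    case "$3" in
        eth|ib|auto) ;;
        *) echo "type must be eth, ib or auto" >&2; return 1 ;;
    esac
    echo "$3" > "$SYSFS_ROOT/bus/pci/devices/$1/mlx4_port$2"
}

# Usage (as root):
#   get_port_type 0000:0e:00.0 1
#   set_port_type 0000:0e:00.0 1 eth
#
# Persistent alternative, assuming the Mellanox Firmware Tools are installed
# and support the card (2 = Ethernet):
#   mlxconfig -d 0e:00.0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2
```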

I did, however, have to use the Linux box to set my MCX455A-ECAT card to EN (or was it VPI? hmm) mode before it would work in the Thunderbolt enclosure.
 

DavidWJohnston

Active Member
Sep 30, 2020
242
191
43
I had a problem similar to this. On my HTPC I have an old QLogic 10G card connected through a QSFP28 adapter into a Celestica DX010.

After being up for a while with no traffic, the link would drop. Every time I went into the living room to watch a show it would be down. I never did figure out exactly why. I had to do a reboot to fix it.

My solution was to create a scheduled task in Windows on the HTPC that keeps pinging stuff constantly. I'm not super happy about it, but now it always stays up.
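
For the macOS or Linux end of a link, a rough shell analogue of that keep-alive idea might look like the sketch below. It is not the Windows scheduled task described above, and there is no evidence in this thread that steady traffic prevents the transceiver problem; the `keepalive` name, the `PING` override and the address in the usage line are illustrative.

```shell
#!/bin/sh
# Sketch: keep a link busy with periodic single pings.
# PING is an illustrative override; it defaults to the system ping.
PING="${PING:-ping}"

keepalive() {
    host="$1"            # peer to ping
    interval="$2"        # seconds to wait between pings
    count="${3:-0}"      # number of pings; 0 means run until interrupted
    sent=0
    while [ "$count" -eq 0 ] || [ "$sent" -lt "$count" ]; do
        "$PING" -c 1 "$host" > /dev/null 2>&1 || echo "no reply from $host" >&2
        sent=$((sent + 1))
        sleep "$interval"
    done
}

# Usage: keepalive 192.0.2.10 30 &
```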
 

unphased

Update. I started testing with a Windows machine with the MCX353 card and a DAC, so the Linux box's MCX354 now has both ports populated (one with this DAC and one with the MPO transceiver going to the Mac setup). I left both disconnected (MacBook unplugged, and the Windows box in hibernate) for about 8 hours. I came back today and plugged the USB-C into the MacBook in the afternoon, and it actually started working...

So I am not sure what to make of this.

I also note that with en-us_windows_11_consumer_editions_version_22h2_updated_sep_2022_x64_dvd_f408dad5.iso (I installed Windows 11 Pro for Workstations, which may have something to do with it) the CX3 card worked on Windows, connecting at 40Gbit out of the box without my installing any driver. I found this quite cool, though I haven't gone digging to see if it'll even let me set a 9000 MTU.

I suppose I will need to do more testing and will proceed to try to use these things. Really hope I can get some mileage out of them for once.

Other random notes include I'm really liking these PiKVMs. I have just the one in a box and it's been really handy manipulating machines down in my basement from upstairs including on the phone. I was able to order some Pi CM4s as well the other day, and I have a (kind of expensive at $130 but oh well) Geekworm x652 piKVM coming that takes a CM4 and NVMe, and which fits neatly in a single PCI slot. Tempted to get another pikvm but 2 of them should be enough to hold me over for a bit. It's tempting to get a separate KVM switch to enable one of these to control multiple machines, but I can't help but feel like it would be more practical to just have multiple independent pikvm's...
 

unphased

Update....

I have set up a Mellanox SX6036 switch and was able to figure out the licensing to obtain Ethernet functionality. However, today, after leaving my MacBook unplugged from it, it is doing the same thing: not renegotiating the connection until I replug the transceiver on the switch side.

Of course there is probably way more stuff I could do with this switch to get it to re-detect, but it's definitely somewhat disappointing. That said, the behavior from before with the direct connection did change a bit over time for some reason, so I will see how it goes and play it by ear.

I also have other cheap transceivers inbound that I ordered, but I'm definitely not going to run a different fiber cable to this desk for them if I can help it.
 

unphased

Another update.

I have tried two approaches and neither works:

- shutting down (disabling) and re-enabling the port on the switch console (via 'enable', then I think 'configure terminal', then 'interface ethernet 1/6' to select the port, then 'shutdown', then 'no shutdown' to bring the port back up)
- rebooting the switch!

The behavior is just like before: it requires a transceiver replug on the other side to bring the line back up. The one difference is that the earlier direct connection to the CX3 card running Linux somehow improved to where it successfully reconnected after 8 hours or a day, but still failed after 3 days. Now it's failing after one day here (again).

I can try one more thing without new equipment, which is to swap these transceivers around, haha. I can also get more MPO SR4 transceivers of this type to try before I switch from MPO fiber to something else.
 

prdtabim

Active Member
Jan 29, 2022
173
67
28
As a workaround, try using ethtool to force a self-test on the port. This will disable and re-enable the link during the test.
Code:
ethtool -t enp40s0f0
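
If the self-test does bring the link back, it could be triggered only when the interface reports no link. The following is a sketch under that assumption; the function names and the `SYSFS_ROOT` and `ETHTOOL` overrides are illustrative, and the interface name is the one from the command above. Note that ethtool takes the interface name, not a /dev path.

```shell
#!/bin/sh
# Sketch: run the ethtool self-test only when the interface reports no link.
# SYSFS_ROOT and ETHTOOL are illustrative overrides for dry runs.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"
ETHTOOL="${ETHTOOL:-ethtool}"

# Succeeds only if the kernel reports the interface's operstate as "up".
link_is_up() {
    state=$(cat "$SYSFS_ROOT/class/net/$1/operstate" 2>/dev/null)
    [ "$state" = "up" ]
}

# Run the self-test when the link is not up; otherwise do nothing.
kick_if_down() {
    ifname="$1"
    if link_is_up "$ifname"; then
        echo "$ifname: link up, nothing to do"
    else
        echo "$ifname: link down, running self-test"
        "$ETHTOOL" -t "$ifname"
    fi
}

# Usage (as root, e.g. from cron every few minutes): kick_if_down enp40s0f0
```

The check is there because the self-test interrupts traffic, so running it unconditionally on a working link would be disruptive.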
 

unphased

Thank you for the suggestion. I will try it but I just noticed that one capacitor has clearly been knocked off of my ConnectX-4 card...

I tried measuring the other SMD capacitor in the bank next to it with my multimeter but it reads 230uF which sounds a bit high but maybe this is not unreasonable for a low voltage part... Not sure how to repair it at the moment but it should not be too troublesome.

Meanwhile I did plug it back in and it is still functioning, so the next time it stops functioning I will try that. However, I do not have ethtool on macOS. I found sigma-1/ethtool on GitHub (Ethtool for Darwin/OS X) but I dunno if it makes sense... Were you suggesting to run it on the switch? I still need to set up a license to get to a root shell on the switch, I think, but it should be straightforward. I can SSH in, but it just drops me into the console interface.

Anyway I would not be surprised if some of this problematic behavior that I'm describing could be related to missing this capacitor...
 