RoCE v1 implementation (SX6036 heatsink/silence mod running log!)


unphased

Active Member
Jun 9, 2022
148
26
28
More progress. I stumbled on how to use the -o flags for genlicense: you have to specify all the Ethernet things with those flags. Then running ltrace on dumplicense with that output lets you see a valid license.

This process is incredibly convoluted so it's enough of a gatekeeper as it is. I reckon that was enough hints!

Then it took a while to find the page to switch the switch's profile over to VPI. Once I did that, it rebooted, and the next boot took much longer. The serial console says stuff like:

Code:
System is initializing!
This may take a few minutes

Modules are being configured
1 module remains to be configured
I think I see the switch name also reverted back.
 

unphased

Active Member
Jun 9, 2022
148
26
28
I have my macbook's CX4 thunderbolt connection online with the switch with 40Gbit Ethernet over my LR4 MPO connection!

I have a 1Gbit RJ45 to SFP+ transceiver and a SFP+ to QSFP+ adapter (QSA?), but with this plugged into the switch it is not making a 1Gbit connection to my home LAN. If I can get this going somehow then I will have DHCP served by my router.

Which incidentally needs overhauling because the webUI of my Netgear router is down, even though its wifi and ethernet are continuing to function.

Update: OK manual assignment of my LAN uplink port to 1Gbit makes it show up green, this is encouraging...

Yes. macOS now gets DHCP... alright now I have full speed internet via thunderbolt from here, perfect. I might be nearly done with this then at this point since other 40Gbit links should just work.

Now I did have to manually switch some ports over to ethernet from IB. Probably I could change the profile of the switch to ethernet and not bother with this, or further figure out how to make VPI work more conveniently, but I don't think I care about stuff like that at this point. Very exciting.
 

unphased

Active Member
Jun 9, 2022
148
26
28
OK... so this is pretty great, I have 3 machines all on my home DHCP with this switch. I just have one problem, my port 2 (macbook on cx4 in thunderbolt enclosure) and port 3 (5950x rig with cx3) can communicate, however the data path from port 3 to port 2 appears to be blocked. ssh connection just hangs when the traffic flows that way. And iperf3 shows 0.00 bits/second going in that one specific direction.

It's very bizarre. Must be some configuration somewhere. Unclear how to troubleshoot.

Update!: OK great, this was fixed once I went in and cleared out the manually set 2034 MTU in the macOS network settings for this adapter. The switch apparently didn't like that. More likely, the MTU needs to be configured specifically on the switch as well, which I'll probably want to do to get better speed.
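A one-directional stall like this is a common symptom of an MTU mismatch, and one way to check it is to ping with the don't-fragment bit set at the largest payload that should fit. The sketch below only builds the commands, it does not run them. The host address is a made-up placeholder, and the header sizes assume plain IPv4 without options.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def ping_commands(host, mtu):
    """Build don't-fragment ping commands for macOS and Linux."""
    size = icmp_payload_for_mtu(mtu)
    return {
        "macos": f"ping -D -s {size} {host}",      # -D sets the DF bit on macOS
        "linux": f"ping -M do -s {size} {host}",   # -M do forbids fragmentation
    }


if __name__ == "__main__":
    # 192.168.1.50 is a hypothetical peer address, not one from this thread.
    for mtu in (1500, 2034):
        print(mtu, ping_commands("192.168.1.50", mtu))
```

If the larger size gets replies in one direction but not the other, some hop (here, possibly the switch port) is dropping frames above its configured MTU.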

Everything is working. This is really awesome.

I realized my license I generated doesn't include the Fabric Inspector. I do not know what that offers but I reckon I will switch it on next. (Edit: Hm this might be an IB specific thing. Also the genlicense help doesn't say anything about it. So I'm gonna ignore it.)

OK next thing is to get the fiber cleaner ordered I guess. I have one more 40Gbit node I could bring up lol, and I could do speed tests at full 40Gbit between the 5600G and 1950X systems.
 

erock

Member
Jul 19, 2023
84
17
8
unphased said:
Yes. This is why I am posting, to share and empower others like us! @erock what is your method for connecting your cluster without a switch? Do you run a daisy chain network? I planned to do that for the longest time as you can see in my first post here at least with my computers, but, I realized making a switch is the only practical way to approach it... proper routing/switching/bridging, and especially trying to do that at huge speeds even ignoring CPU overheads, is likely to be difficult/impractical/silly in ways that I never got close to being able to estimate. The thought of maxing out CPU on every computer in the chain of communication is frankly preposterous, haha.

I have found the megathread about conversion of EMC switches into running MLNX_OS which certainly looks like a journey. But I am not starting behind like that, I have a bona fide mellanox switch here. I am being super careful right now in terms of editing things willy nilly in the CLI right now, but it does feel very nice to have the serial console up and running.

Still poking around atm to work out how to enable ipv4 DHCP on management interfaces. I did find the initial wizard workflow and ran it and told it to turn on DHCP and gave it an ip but it does not seem to be working. I even found a CLI command to reboot the switch and that worked too. Still no ip address getting assigned
The no-switch approach for a small cluster was an idea that I got from @gb00s in this post. For the 3 node setup each machine had dual port ConnectX-3 cards connected via DACs. The ports on each card got the same IP but I used routes with a different subnet mask to guide traffic from each port to other nodes. So for example, the IPs for nodes 1, 2 and 3: 10.12.12.1/24, 10.12.12.2/24, and 10.12.12.3/24. To set up connections for node 1 to 2, I used a route with IP=10.12.12.2/32 with gateway 10.12.12.1. To set up connections for node 1 to 3, I used a route with IP=10.12.12.3/32 with gateway 10.12.12.1. This worked in Ethernet mode for 3 nodes composed of dual socket H11DSi with two 16 core 7F52 CPUs. I am now giving RoCE a go over 4 nodes, each with 2 dual port CX3 cards. The cards are so cheap now (and I had some extra DACs) that I wanted to give this a try for my own education. Once I get the 4 node setup complete and MPI installed with UCX I will share performance. I plan on moving to a switch in the future but decided to cut my teeth on this simpler approach while I learn more about which switch to buy for my CX3 cards and RoCE mode (all the talk about needing a special license for Ethernet mode made me hesitate). I am looking forward to learning from your experience!
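The routing scheme described above can be written out for every node, as a minimal sketch. The node names are made up for illustration, and the output is a plain description of each route rather than a command for any particular OS, since the post does not name the interfaces or the tool used to add the routes.

```python
import ipaddress


def mesh_routes(nodes):
    """Return {node: [route, ...]} with one /32 host route per peer.

    nodes maps a node name to its address in CIDR form, e.g. "10.12.12.1/24".
    Each route is written as "<peer>/32 via <own address>", matching the
    scheme described in the post.
    """
    routes = {}
    for name, cidr in nodes.items():
        own = ipaddress.ip_interface(cidr)
        routes[name] = [
            f"{ipaddress.ip_interface(peer_cidr).ip}/32 via {own.ip}"
            for peer, peer_cidr in nodes.items()
            if peer != name
        ]
    return routes


if __name__ == "__main__":
    # Node names are hypothetical; the addresses are the ones from the post.
    cluster = {
        "node1": "10.12.12.1/24",
        "node2": "10.12.12.2/24",
        "node3": "10.12.12.3/24",
    }
    for node, node_routes in mesh_routes(cluster).items():
        for route in node_routes:
            print(node, route)
```

In a real setup each route would also have to be bound to the specific port whose DAC leads to that peer, which is the part this sketch leaves out.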
 

unphased

Active Member
Jun 9, 2022
148
26
28
Awesome, yes, a daisy chain can be closed into a loop with 3 nodes: a clique of size 3 with direct connections between every pair, using two ports on each node, for 3 total connections. With 4 nodes, 3 connections per node are needed to maintain direct connections, with 6 total connections, and for 5 nodes that increases to 4 connections per node, filling two cards on each, and starting to have way too many direct connections with a total of 10 there. So I definitely see why you said up to 5; for practical purposes, going up to 10 total connections with 4 ports populated in each node is certainly the reasonable cutoff for "it doesn't make sense to continue this any further" ha!!
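The counts above follow from the full mesh formulas: n - 1 ports per node and n(n - 1)/2 links in total. A small sketch, assuming dual port cards as used in this thread:

```python
import math


def full_mesh(nodes, ports_per_card=2):
    """Return (ports per node, cards per node, total links) for a full mesh."""
    ports_per_node = nodes - 1              # one direct link to every other node
    cards_per_node = math.ceil(ports_per_node / ports_per_card)
    total_links = nodes * (nodes - 1) // 2  # each link is shared by two nodes
    return ports_per_node, cards_per_node, total_links


if __name__ == "__main__":
    for n in (3, 4, 5):
        ports, cards, links = full_mesh(n)
        print(f"{n} nodes: {ports} ports/node, {cards} card(s)/node, {links} links")
```

The link count grows quadratically, which is why the switchless approach stops being practical so quickly.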
 

erock

Member
Jul 19, 2023
84
17
8
Yes, I bet any integrated benefit peaks at 3-4 nodes after which a switch makes sense in terms of performance and cost. FYI, your post inspired me to get an SX6036. We’ll see if it is worth the effort for my use case. But the cost is so low that it should be worth it for the learning experience.
 

unphased

Active Member
Jun 9, 2022
148
26
28
Yeah it definitely works. I think I just have some jank in my other compatible equipment, which was all also rock bottom dirt cheap... right now I'm still trying to get CWDM4 transceivers working on the 6036, but I think I recall David mentioning doing something to enable them at 40Gbit. I had them working NIC to NIC earlier though!
 

unphased

Active Member
Jun 9, 2022
148
26
28
FWIW the KAIAM 40G QSFP+ LRL4 transceivers (these are 8 dollars each and work with very cheap single mode OS2 fiber) work great. Getting decent speeds, zero retries. I did clean with the one click fiber cleaner.

I can also confirm that these do not work in ports other than 1, 3, 33, 35 which are the high power ports on this switch. I did have to do some LR4 enablement command to enable these high power ports. Doing that still did not enable the CWDM4's in my testing. These LRL4 are clearly high power ones but actually work.

Looks like the 100Gb CWDM4's may not work with the SX6036 which is not surprising. I don't really have much interest in doing further experimentation to coerce the CWDM4's into 40Gbit mode or whatever. But I can confirm that they do appear to at least kind of work at 40Gbit in a NIC to NIC configuration, even though that didn't seem to go so well (it was before I got the 1 click cleaner and I got a link as slow as a few hundred megabits)
 

unphased

Active Member
Jun 9, 2022
148
26
28
I'll bring my updates back to my own thread here. The process of silencing down the SX6036 is underway now.

  • I ordered some Delta FFB0412VHN fans from eBay and received them. They turned out to be even louder and more powerful than the FFB0412SHN fans that came with the switch. I am not sure what to chalk this up to; either the original fans have degraded (they indeed pull only 0.2A when they are rated .45 or .6 depending how you read the label), or my new fans are counterfeit higher power fans (supposed to be 9kRPM fans! But run at 15k!), or both. Either way these VHNs seem like good value as they are proper blowymatrons. They are just not suitable for this project as they run at 15kRPM and I'll want them nearer to 7kRPM. They respond well to PWM control as tested in a regular PC.
  • I ordered some Sunon PMD1204PQB1-A fans from AliExpress. These are also 4 pin, and they actually look like they have the right header as used on my switch, although the pinout seems questionable, but I should be able to swap the wires around. No clue when they will arrive. I also did not order enough of these haha. But at a 2.8W rating I wonder if they will also end up way too loud, as I can see the other fan I got consumes around 3W and simply rips at 15kRPM. Will see once they get here.
  • I ordered some Arctic 6kRPM fans, they come in a 5 pack shipped in one day from Amazon which was nice. These are indeed silent and I worry a bit that they might be too weak.
  • From monitoring temperatures, I see that the CPU reaches over 60C, even when I have the second PSU inserted. I opened up the switch, which was easier than I thought (6 external screws on the top cover), removed 4 screws on the PPC module card thing (the controller computer of the switch, it is purportedly a 5W CPU), and the module popped out easily. I found a positively anemic 10mm tall aluminum cooler, maybe 32x32mm or so, with 25 small cylindrical posts on it for surface area. The thermal paste I found looked to be of decent quality, but there was probably a good .4mm of it in there, which I wiped off. 60C definitely sounds about right... The mounting hole spacing on this setup is 40x40mm. Therefore the center to center spacing of the two heatsink mounting holes is 40*sqrt(2) = 56.57mm. Nothing I have in my parts box matches this spacing.
  • I think about it a bit, and decide to order a 40x100x10mm copper skiving fin heatsink from amazon for $20. It's clearly overkill, but it's not a bad price for a good chunk of copper that will probably cool something like nearly 10 times better than the original heatsink on here. I was originally thinking about keeping it attached to this card by using some silicone rubber bands or similar jank, since the geometry would work out for that, but then I realized that drilling the holes in the right spots would be a much better way to go about it, and will be really easy to do, so this is my plan.
  • I could come up with a way (there is plenty of space in there) to mount a 7th 40mm fan placed directly in front of this CPU to add more airflow there. But I think the heatsink upgrade alone will let me run 4kRPM on all fans (in this case I hope using these Arctic 6kRPM ones will not lead to RPM cycling behavior) and stay cool enough.
So far I am pretty confident that I will be able to get this thing basically silent without compromising on anything or necessarily having to work out fan control configuration, beyond using a command, or possibly disconnecting the PWM wires, to force full duty cycle so that the PWM command can't drop any fans below 4kRPM and cause RPM cycling. Yeah I might go ahead and just wire up 3 wires (RPM (yellow), red and black) to set up my Arctic fans, as I already know they are quiet enough at full speed. So I'll probably yolo these fans in without PWM, which means I'll only need to solder 18 more wires instead of 24 while I install the heatsink. The main wrinkle right now is that I only have 5 of these Arctic fans and need either 6 or 7. I will most likely rig up one of these other "VHN" fans that I have with a high wattage resistor network to lower its speed.
 


unphased

Active Member
Jun 9, 2022
148
26
28
This may be one of the jankiest things I've done in recent memory.

Once I got the heatsink, I realized that the 40mm dimension isn't wide enough for me to make the holes if I mount it straight, so I had to mount it crooked. I found a way to orient it so that it barely clears all the other tall components, and it does not overhang more than a few mm, so it still fits.
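The geometry behind mounting it crooked can be sketched as follows. This assumes the two mounting holes sit on opposite corners of the 40x40mm pattern and that the heatsink is centered on the chip, neither of which was measured precisely here, so treat the numbers as a rough guide only.

```python
import math

HOLE_PITCH_MM = 40.0       # square mounting pattern from the measurements above
HEATSINK_WIDTH_MM = 40.0   # short dimension of the 40x100x10mm heatsink


def hole_diagonal_mm(pitch=HOLE_PITCH_MM):
    """Center to center distance between two diagonally opposite holes."""
    return pitch * math.sqrt(2)


def edge_margin_mm(rotation_deg, pitch=HOLE_PITCH_MM, width=HEATSINK_WIDTH_MM):
    """Distance from a hole center to the nearest long edge of the heatsink.

    rotation_deg is how far the heatsink is turned away from square to the
    hole pattern; 45 degrees puts both holes on the heatsink centerline.
    Assumes the heatsink is centered over the chip.
    """
    radius = hole_diagonal_mm(pitch) / 2
    lateral_offset = abs(radius * math.sin(math.radians(45.0 - rotation_deg)))
    return width / 2 - lateral_offset


if __name__ == "__main__":
    print(f"diagonal: {hole_diagonal_mm():.2f} mm")
    for angle in (0, 10, 20, 30, 45):
        print(f"rotated {angle:2d} deg: {edge_margin_mm(angle):5.2f} mm to edge")
```

Mounted straight, the hole centers land right on the edge of a 40mm wide heatsink, so there is no material to drill into; any rotation moves them inward.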

Then I got the hole positions and drilled 3.2mm holes up from the bottom and drilled down with like a 3/8 in bit from the top to make room for the retention pins. It looks like an atrocity, poor heatsink, anyhow I'm pretty sure I got all the copper swarf out. Wiped down the middle section with a bit of alcohol...

I put a dollop of MX4 in the middle of the chip and put it on, the pins went in easy.

Fired up the switch. 49C! Goddamn! I was hoping for a much better improvement than this. According to the chart, it seems like within 10 minutes or however long it took to boot up that it already reached this as a steady state temperature.

This is bugging me a lot because this has got to be an overkill heatsink... Maybe I need to really make an effort to make the thermal paste layer thinner... I might pop it open and push on it a bit. I'll play around with different stuff with the enclosure open to see how it responds but at this point a 7th fan in close proximity to this thing is looking much more likely now.

Anyway I didn't break anything yet, so it is time to start quieting down the fans...
 


unphased

Active Member
Jun 9, 2022
148
26
28
Popped open the cover while running and this copper heatsink isn't even warm to the touch so the temp reading is ... maybe not the CPU chip? Maybe the CPU package has poor thermal properties.

Something seems slightly fishy but I might try something silly like propping some cardboard into there to redirect airflow AWAY from components and see how hot they can ever get. I suspected that if I removed the fan module the whole thing would shut down in self preservation, but I read the documentation and no, it's meant to be hot swappable, so yes I could yank the fan module, and that will be a good way to test how it responds to reduced airflow. When I do that, though, it cranks the PSU's fans to max.

Also tell ya what, though, this thing is really loud. It's definitely getting to me now. I wasn't feeling it earlier, apparently I was deaf from the novelty of it. So I do not know how I am supposed to interpret this CPU_BOARD_MONITOR2 temp reading of 50C but the heatsink I just put on there sure as heck isn't getting that warm. It feels like maybe 38C. I guess it's plausible anyhow. Just gonna push forward. Fairly confident that I can have it basically silent. I'm gonna plug the second PSU in but not power it on since I only have 5 fans I can replace at the moment. I will be able to do the fan swaps without shutting this thing down and it will be a good thermal stress test.
 

unphased

Active Member
Jun 9, 2022
148
26
28
Oh for sure. With:

- top cover removed
- 4 fans in module removed
- Arctic 6kRPM fan installed in power supply feeding the side with the CPU & MGMT board, running 6kRPM
- other power supply removed

basically what amounts to no airflow.

I get 62C on the CPU_BOARD_MONITOR2 reading with my new heatsink setup. So this thing will stay under 60C with these quiet fans. I could probably even go noctua in this. But I won't, because that would be a bit spendy. I did buy a few too many fans for this but it is ok.

Haha I am going to solder all 4 wires instead of leaving the PWM wire out like i planned before, because I might even want to try to push these lower than 6kRPM.

Update: Including the blue PWM wire causes RPM cycling due to the fan module fans dropping below 4000 or so RPM. I believe the limit is supposedly 4096. So what I did (this was insane too) was just pull the blue wires out of the little headers. And cover them up with little pieces of kapton tape. This way instead of snipping the wires I could reinsert these wires later if I learn how to have fine control over the fans.

Meanwhile, the power supplies run these 6kRPM fans at 4530RPM and I left the 4 in the fan module running at full RPM (it reads 5700 or so) and it is decently quiet. It's not as quiet as I'd hoped it would be but it will definitely do. It is still a bit louder and of a less pleasing register than the noise coming out of my workstation, but anyhow it is no longer a screamer. With this setup and my copper heatsink mod I have a 58C steady state temp, so I have achieved my goal satisfactorily. This also indicates I could very likely have gotten away without any heatsink mod, as the temp shouldn't be unacceptable (probably high 60s) if I had just reduced the airflow. But I was able to reduce the airflow and still get a temp improvement... I may come back and try to get it even more quiet later though...
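As a rough guide to how much room PWM control would leave, here is a sketch that estimates the lowest duty cycle that keeps a fan above the alarm threshold. It assumes fan speed scales linearly with duty cycle, which real fans only approximate, and it takes the 4096 RPM limit mentioned above at face value.

```python
def min_duty_percent(min_rpm, full_speed_rpm):
    """Lowest PWM duty (percent) that keeps a fan at or above min_rpm.

    Assumes RPM is proportional to duty cycle; this is only an approximation.
    """
    if full_speed_rpm <= 0:
        raise ValueError("full_speed_rpm must be positive")
    return min(100.0, 100.0 * min_rpm / full_speed_rpm)


if __name__ == "__main__":
    alarm_rpm = 4096  # supposed limit mentioned above, not verified
    for label, rpm in (("fan module", 5700), ("PSU", 4530)):
        duty = min_duty_percent(alarm_rpm, rpm)
        print(f"{label}: keep duty above roughly {duty:.0f}%")
```

Under those assumptions there is not much headroom below full speed with these fans, which fits with the RPM cycling seen once the PWM wire was connected.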
 

erock

Member
Jul 19, 2023
84
17
8
unphased said:
Wow thank you all for the knowledge and links!

I received my console cable and after some faffing about to learn that the 9600 baud rate is the correct one, I am in to the console! As a recap, I'm working toward bringing up the SX6036 I just got!

So far:

Code:
switch-8a36c2 [standalone: master] > show system profile

Profile         : ib
Number of SWIDs : 1
Adaptive Routing: yes

switch-8a36c2 [standalone: master] > show uboot
UBOOT version : U-Boot 2009.01 SX_PPC_M460EX SX_3.2.0330-82 ppc (Dec 20 2012 - 17:53:54)

switch-8a36c2 [standalone: master] > show images

Installed images:
  Partition 1:
    version: PPC_M460EX 3.6.8010 2018-08-20 18:04:16 ppc

  Partition 2:
    version: PPC_M460EX 3.6.5009 2018-01-02 07:42:18 ppc

Last boot partition: 1
Next boot partition: 1

Images available to be installed:
  No image files are available to be installed.

Serve image files via HTTP/HTTPS: no

No image install currently in progress.
Boot manager password is set.

Image signing              : trusted signature always required
Admin require signed images: yes

Settings for next boot only:
  Fallback reboot on configuration failure: yes (default)

switch-8a36c2 [standalone: master] > show asic-version
---------------------------------------------------
Module             Device              Version
---------------------------------------------------
MGMT               SX                  9.4.5070

switch-8a36c2 [standalone: master] > show inventory
-----------------------------------------------------------------------------
Module           Part Number        Serial Number        Asic Rev.    HW Rev.
-----------------------------------------------------------------------------
CHASSIS          MSX6036T-1SFS      MT1645X09712         N/A          AB
MGMT             MSX6036T-1SFS      MT1645X09712         2            AB
FAN              MSX60-FF           MT1642X02818         N/A          A1
PS1              MSX60-PF           MT1643X02475         N/A          A1

switch-8a36c2 [standalone: master] > show protocols

Infiniband:               enabled
sm:                     disabled
router:                 disabled

switch-8a36c2 [standalone: master] > show voltage
------------------------------------------------------------------------------------------------
Module   Power Meter              Reg                    Expected  Actual   Status  High   Low
                                                         Voltage   Voltage          Range  Range
------------------------------------------------------------------------------------------------
MGMT     BOARD_MONITOR            USB 5V                 5.00      5.08     OK      5.75   4.25
MGMT     BOARD_MONITOR            Asic I/0               2.27      2.17     OK      2.61   1.93
MGMT     BOARD_MONITOR            1.8V                   1.80      1.81     OK      2.07   1.53
MGMT     BOARD_MONITOR            SYS 3.3V               3.30      3.31     OK      3.79   2.80
MGMT     BOARD_MONITOR            CPU 0.9V               0.90      0.89     OK      1.10   0.81
MGMT     BOARD_MONITOR            1.2V                   1.20      1.19     OK      1.38   1.02
MGMT     CURR_MONITOR             12V                    12.00     11.70    OK      13.80  10.20
MGMT     CPU_BOARD_MONITOR        2.5V                   2.50      2.48     OK      2.88   2.12
MGMT     CPU_BOARD_MONITOR        SYS 3.3V               3.30      3.34     OK      3.79   2.80
MGMT     CPU_BOARD_MONITOR        SYS 3.3V-SEC           3.30      3.30     OK      3.79   2.80
MGMT     CPU_BOARD_MONITOR        1.8V                   1.80      1.81     OK      2.07   1.53
MGMT     CPU_BOARD_MONITOR        1.2V                   1.20      1.24     OK      1.38   1.02
I guess it's running mlnx_os? And is not an EMC switch? But I already knew that because it's black and blue? Anyway I quite like this switch, it is a well-built piece of hardware. certainly a steal for what I paid. Already got some quieter fans on order. Seems like swapping fans on the PSUs might be slightly scary but shouldn't be a big deal. I loosened one of the fans to confirm the headers are consistent across all 6 fans...

Now scratching my head as to what the next steps will be. I probably need to update the software. But it's also not clear what to do first on the road to enabling Ethernet.
I just got my SX6036 and am having some trouble accessing the console through the serial port on my mobo. I am using a RJ45-DB9 cable I bought from eBay and am wondering if this cable is incompatible. Can you share the specs on the RJ45-USB cable you referenced above so I can give that a try?
 

erock

Member
Jul 19, 2023
84
17
8
I got in. Here are the details.
 

JonnyB

New Member
Mar 23, 2024
1
0
1
@unphased have you continued testing with reduced airflow on your SX6036? I've also ordered some Arctic 6k fans and am tempted to swap them out
 

unphased

Active Member
Jun 9, 2022
148
26
28
I've been out of town for months but my workstation and network are still running 24/7 at home. I have not been monitoring the switch's temps or anything, have no reason to expect it will change from what I had observed earlier. I would expect that running it with greatly reduced airflow without a heatsink mod would work if the switch does not experience heavy load. I do not know if the powerpc CPU would experience heavy load if the switch experiences heavy load with network traffic, but that is certainly possible.

I have no idea if the particular Delta fans I got from the same lineup as the stock fans are knockoffs or what, but they run a lot faster/stronger and louder than the stock fans. So, maybe the stock fans have become weaker over time. Anyway, the Arctic 6k fans are quiet but by no means silent if you're used to quiet PCs/workstations. I would suggest planning to install this switch in a closet somewhere, that would be best. You will also very easily be able to mod huge fans into it if you don't care about preserving its 1U form factor, however I think going that route you need to fake at least one >4kRPM tach signal to keep it from beeping.

It was fun to get in there and mod stuff, but at a certain point as with any hobby you have got to ask yourself is it really worth the effort.
 

i386

Well-Known Member
Mar 18, 2016
4,245
1,546
113
34
Germany
the msx 60xx switches support the fae command which can be used to set the fan speed :D
 

unphased

Active Member
Jun 9, 2022
148
26
28
Yes. And I have in particular not gone down that rabbit hole yet myself. My understanding was that this approach can allow tuning down to just above the 4000rpm mark, something like that. I will remind myself to play around with that next time I get a hankering for reducing noise, before buying more fans to experiment with.