Yeah, it took me a few hours (bordering on days) of research spread across a few months, but I came to realize Mellanox (now NVIDIA) 40G switching is the sweet spot, just like ConnectX-3 (and now/soon ConnectX-4) is the sweet spot for NICs. Windows and Linux support seems solid, and it helped a lot when I discovered that recent macOS releases ship a built-in mlx5 kernel driver. Since mlx5 covers ConnectX-4 and up, you do need a CX4 card for macOS, but it Just Works out of a Thunderbolt eGPU-style enclosure. I have some quirkiness where the Mac-to-switch link sometimes doesn't come up until I reseat the transceiver at the switch, but there are a lot of variables to fiddle with before I get to the bottom of that.
Jumping from 1GbE to 40GbE on the cheap (closer to ~25Gbit effective, since these cards typically land in x4 PCIe gen 3 slots via M.2 adapters and such) has been pretty spectacular. In terms of capabilities, it single-handedly extends nearly full NVMe storage performance across the whole network (although it can't quite touch gen 4 or gen 5 NVMe speeds). I also think that if your network isn't physically large, the latency is good enough that distributed apps could leverage the DRAM on networked machines, for workloads where you don't need the raw performance so much but can benefit from reducing wear on SSD endurance.
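To make that DRAM idea concrete, here's a minimal hypothetical sketch (my own illustration, not anything I've built): a tiny in-RAM key-value cache served over TCP, so scratch data can sit in a networked box's spare memory instead of getting re-read from or spilled to the local SSD. Stdlib Python only, no RDMA, so it leans entirely on the link's low latency.

```python
# Hypothetical sketch: park scratch data in a remote machine's DRAM over the
# network instead of hammering the local SSD. Names and port are made up.
import socket
import socketserver

_store = {}  # lives in the remote machine's RAM

class CacheHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # One request per connection: "SET <key> <nbytes>\n<payload>" or "GET <key>\n"
        parts = self.rfile.readline().decode().split()
        if parts and parts[0] == "SET":
            key, nbytes = parts[1], int(parts[2])
            _store[key] = self.rfile.read(nbytes)
            self.wfile.write(b"OK\n")
        elif parts and parts[0] == "GET":
            value = _store.get(parts[1], b"")
            self.wfile.write(f"{len(value)}\n".encode() + value)

def cache_set(host, key, value, port=5000):
    with socket.create_connection((host, port)) as sock:
        sock.sendall(f"SET {key} {len(value)}\n".encode() + value)
        sock.recv(16)  # wait for the "OK\n" ack

def cache_get(host, key, port=5000):
    with socket.create_connection((host, port)) as sock:
        sock.sendall(f"GET {key}\n".encode())
        rfile = sock.makefile("rb")
        nbytes = int(rfile.readline())
        return rfile.read(nbytes)

if __name__ == "__main__":
    # Run this on the machine donating its DRAM.
    with socketserver.ThreadingTCPServer(("0.0.0.0", 5000), CacheHandler) as server:
        server.serve_forever()
```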
Anyway, the next hop would be to 100GbE as datacenters move off of it, which, currently and for the foreseeable future, is still limited to that same ~25Gbit for me, because the cards get stuck on gen 3 signaling over four PCIe lanes. I've got to wait for ConnectX-5 prices to drop, since that's what gets you PCIe gen 4 and lets you reach 50 to 60Gbit or so out of four lanes.
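The back-of-envelope math behind those ceilings (my numbers, using the published per-lane rates and 128b/130b encoding; real-world throughput lands a bit lower once protocol overhead is counted):

```python
# Rough ceilings for a NIC stuck in a 4-lane slot.
# PCIe 3.0 runs 8 GT/s per lane, PCIe 4.0 runs 16 GT/s, both with 128b/130b
# encoding; actual usable throughput is lower still after TLP/link overhead,
# which is why gen 3 x4 ends up around ~25 Gbit/s in practice.
GT_PER_LANE = {"gen3": 8.0, "gen4": 16.0}  # GT/s per lane

def x4_ceiling_gbit(gen, lanes=4, encoding=128 / 130):
    return GT_PER_LANE[gen] * lanes * encoding

for gen in ("gen3", "gen4"):
    print(f"{gen} x4: ~{x4_ceiling_gbit(gen):.1f} Gbit/s raw ceiling")
# gen3 x4: ~31.5 Gbit/s raw  -> roughly the 25 Gbit/s I see usable
# gen4 x4: ~63.0 Gbit/s raw  -> the 50-60 Gbit/s a CX5 could actually push
```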
I think an SX6036 40Gbit switch *should* potentially be backward compatible with newer gear, since QSFP28 is backward compatible with QSFP+, so long term I may be able to keep using it to max out 40Gbit of bandwidth even when running 100G hardware, but no idea yet. A 100Gbit switch is likely on the table anyway if I'm going through the trouble of reaching 100G on the NIC side.