Which NIC or switch for Windows Server and SMB Direct/RDMA


besterino

Member
Apr 22, 2017
Happy to chip in here as well.

In a nutshell: I fail to get SMB Direct working between a Linux server and a Windows client.

The hardware is capable; that's proven with the same hardware between Windows Server 2025 and the same client on Windows 11 Pro for Workstations. I verify RDMA via the NIC utilisation shown in Task Manager on the Windows client: it should remain low while copying with Explorer (and it does between Windows machines).

What I particularly fail to understand is how to properly check/log/debug RDMA operability.
Windows seems simple: Get-SmbClientNetworkInterface shows RDMA capable "True".

Basic functionality seems to be there between Linux and Windows: rping works.

But pretty much nothing else…
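
For reference, a minimal set of checks on the Windows client side using only in-box PowerShell cmdlets; availability of the "SMB Direct Connection" counter set and the connectivity log varies by Windows build, so treat this as a sketch:

# Does Windows see the NIC as RDMA-enabled at all?
Get-NetAdapterRdma

# Does the SMB client consider the interface RDMA-capable?
Get-SmbClientNetworkInterface

# While a copy is running: are the multichannel connections actually using RDMA?
Get-SmbMultichannelConnection

# Live SMB Direct traffic counters (list them first, then sample during a copy)
(Get-Counter -ListSet 'SMB Direct Connection').Counter
Get-Counter -Counter (Get-Counter -ListSet 'SMB Direct Connection').Counter -SampleInterval 2 -MaxSamples 5

# The SMB client connectivity log usually records why an RDMA connection fell back to TCP
Get-WinEvent -LogName 'Microsoft-Windows-SMBClient/Connectivity' -MaxEvents 50 |
    Format-Table TimeCreated, Id, Message -AutoSize -Wrap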
 

Gilbert

New Member
Aug 11, 2016
The quest continues; I am in the same boat. I'll do some more research in my homelab with Windows Server 2022 and RHEL 9.4 with my Chelsio T580-CR cards, and will post any results I observe, good or bad, when I get a chance.
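
In the meantime, a few generic Linux-side checks (from rdma-core / libibverbs-utils / librdmacm-utils) that should work for both iWARP cards like the T580-CR and RoCE cards; device names are placeholders and not every driver exposes every counter:

# List RDMA devices and their link state
rdma link show

# Verbs-level view of the adapter (ports, state, MTU)
ibv_devinfo

# Basic connectivity test at the RDMA layer
#   server: rping -s -a 0.0.0.0 -v
#   client: rping -c -a <server-ip> -v -C 10

# Hardware port counters only increase when traffic really goes over RDMA;
# watch them during a copy to spot a silent fallback to TCP
watch -n1 'grep . /sys/class/infiniband/*/ports/*/counters/port_*_data'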
 

besterino

Member
Apr 22, 2017
Quick update: I managed to get RDMA running with/as "NVMe over Fabrics" with this guide (the "idiot" in the title immediately spoke to me):

https://www.reddit.com/r/truenas/comments/1fh3rfl
It works 1:1 with Proxmox as well.

Performance is still far from the hardware limits (in fact, from the x4 PCIe 3.0 connection of the NIC in the Windows client), but at least the Windows RDMA performance counters show that RDMA is active. I assume something between ZFS/zvol/whatever is bottlenecking somewhere, but it demonstrates that Windows and Linux can talk RDMA with each other. ;)

So more reason to believe that something's not working properly with ksmbd for me.
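
For anyone following along, what guides like that one boil down to on the Linux target side is the kernel's nvmet configfs interface; a rough sketch (the NQN, zvol path and IP address are placeholders, not values from the guide):

# Load the NVMe target core and the RDMA transport
modprobe nvmet nvmet-rdma
cd /sys/kernel/config/nvmet

# Create a subsystem and (for a homelab test) allow any host to connect
mkdir -p subsystems/testnqn
echo 1 > subsystems/testnqn/attr_allow_any_host

# Back namespace 1 with a block device, e.g. a zvol (placeholder path)
mkdir -p subsystems/testnqn/namespaces/1
echo /dev/zvol/tank/nvmeof > subsystems/testnqn/namespaces/1/device_path
echo 1 > subsystems/testnqn/namespaces/1/enable

# Create an RDMA port on the NIC's IP (placeholder), standard NVMe-oF port 4420
mkdir -p ports/1
echo rdma      > ports/1/addr_trtype
echo ipv4      > ports/1/addr_adrfam
echo 10.0.0.10 > ports/1/addr_traddr
echo 4420      > ports/1/addr_trsvcid

# Expose the subsystem on that port
ln -s /sys/kernel/config/nvmet/subsystems/testnqn ports/1/subsystems/testnqn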
 

Gilbert

New Member
Aug 11, 2016
My question would be: what storage drives do you have on both ends? E.g. Gen 3 or Gen 4 NVMe? NVMe on both ends? How are they connected to the client and the server? You can check whether they are actually linked at the speed and width they are supposed to run at (e.g. Gen 3 x4, Gen 4 x4, etc.), as in the example below.
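
On the Linux end, for example, the negotiated link of each NVMe drive can be read straight from sysfs (assuming directly PCIe-attached drives; device names are placeholders):

# Negotiated vs. maximum PCIe link for every NVMe controller
for d in /sys/class/nvme/nvme?; do
    echo "== $d =="
    cat "$d/device/current_link_speed" "$d/device/current_link_width"
    cat "$d/device/max_link_speed" "$d/device/max_link_width"
done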
 

besterino

Member
Apr 22, 2017
Win client: PCIe 5.0 NVMe. Delivers around 12,000 MB/s locally, depending on the benchmark.

Server: 4x PCIe 3.0 NVMe (via a bifurcated x16 HyperX adapter card). Individually and locally they deliver 3,000-3,400 MB/s, depending on the benchmark.

Just as a comparison: if I use Windows on the server, I get almost the maximum local speed over the network with RDMA.
 

Gilbert

New Member
Aug 11, 2016
I would check in Linux to make sure that your RDMA NIC is connected at the expected PCIe speed and width. Here is an example of what to look at: find the PCI address of the card (a Mellanox in my case) and read the actual negotiated link speed and width on the PCIe bus.
[Screenshot: lspci output with the Mellanox card's PCI address and its negotiated PCIe link speed and width highlighted]
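
In command-line form, the same check looks roughly like this (the PCI address 01:00.0 is just an example; substitute your card's):

# Find the NIC's PCI address
lspci | grep -i -e mellanox -e chelsio

# Compare what the card supports (LnkCap) with what it actually negotiated (LnkSta)
sudo lspci -vv -s 01:00.0 | grep -E 'LnkCap:|LnkSta:'
# e.g. "LnkSta: Speed 8GT/s, Width x8" means PCIe 3.0 x8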
 

besterino

Member
Apr 22, 2017
I've buried my Linux endeavours for the time being. I'm apparently not experienced enough to get things up and running the way I want within a time frame that makes sense for me.
I'll give it another try once I see success stories that fit my ideas; unfortunately, I'm not able to spearhead them myself. ;)
 

Gilbert

New Member
Aug 11, 2016
Understood, I am following your guide and will post results as soon as I can. The idiot in the title spoke to me too, lol.
 

Gilbert

New Member
Aug 11, 2016
As far as I understand, after experimenting myself and reading this thread: How Can I Help with the new TRUENAS / 100G testing?

Windows to Linux SMB Direct with ksmbd doesn't work

Linux to Linux is fine
Sorry I did not respond earlier, it's been a few months, but I did get it to talk RDMA (RoCE) (Windows Server 2025 client, Linux Debian), though the throughput results were poor. So I dropped it and made do with my 10GbE network. I also played around with NVMe-oF (like iSCSI but with RDMA instead); that was better, but still not production ready.
 

bugacha

Active Member
Sep 21, 2024
What speed do you get over 10GbE between the Windows Server client and Linux Debian?
 

Gilbert

New Member
Aug 11, 2016
Around 600-700 MB/s transferring a 250 GB file (10 chapters of a 4K Blu-ray) with spinning rust (12 drives in RAID 6 to 12 drives in RAID 10), and about 900-1,100 MB/s NVMe to NVMe.
 

Gilbert

New Member
Aug 11, 2016
I have 3 servers. One is all flash: 12 mixed Samsung 883 DCT and Toshiba SATA SSDs. In another server I have 7 NVMe drives on the PCIe bus as well as 12x 6 TB drives behind a hardware RAID card (Adaptec Series 8).

What I have found is that if you combine, say, all the SSDs or even all the NVMe drives in a RAID 0, you are limited to about 1-2 drives' worth of performance. That holds even disk to disk on the same computer, with 86 PCIe 4.0 lanes. I am convinced the manufacturers have crippled the firmware in newer equipment to prevent aggregating cheap comparable drives, in order to protect the profit margins of those $100,000 storage arrays.

I say this based on my experience. In 2012, with a PCIe 2.0 machine, 16x 1 TB SATA 2 hard drives and a ConnectX (20 Gb card) in a RAID 0 array, I could transfer 14 TB of data in 4 hours to or from the server, and as I added disks my throughput increased. And that was with consumer hardware. Fast forward from about 2016 to now: with much better equipment and know-how, I have not been able to duplicate that level of performance, whether with SSDs, 24 hard drives, or NVMe, because NVMe will not scale in a RAID 0 configuration. It takes me about 9-10 hours to transfer 19 TB of data server to server with mechanical drives. Only in benchmarks can I see the throughput. So I've been chasing a ghost for years, is my assessment.
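
One way to take SMB, the network and the filesystem cache out of the picture and see what an array itself can sustain is a raw sequential read with fio; the device name, job count and runtime here are placeholders (use whatever block device your md array or the Adaptec controller presents):

# Sequential read from the raw array device with direct I/O
sudo fio --name=array-seqread --filename=/dev/md0 --rw=read --bs=1M \
         --ioengine=libaio --direct=1 --iodepth=32 --numjobs=4 \
         --runtime=60 --time_based --group_reporting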