To VLAN or not, Frustration!


Myth

Member
Feb 27, 2018
148
7
18
Los Angeles
Hey guys,

So great to have a forum where I can ask technical questions in front of a lot of eyes. I'm a SAN administrator, and we get the best performance when we manually load balance each of our ports on different networks.

For example, I have a server with 4 ports of 10GigE going into a switch. We used to use LACP, but it doesn't increase the bandwidth of any single flow and only provides redundancy, so it sucks. Plus it seems to hang up our server sometimes, and other times it crashes the NIC teaming and brings down the network. I think it does this because we are an ultra-fast SSD media server streaming content to lots of users simultaneously.

So the best option, I think, is to manually load balance, but our SAN software requires each port to be on its own subnet. I think that's because if all 4 ports are on the same network, SMB 3.0 and 3.1 Direct responds out of all of the ports into the switch, which screws up our bandwidth.

Is it possible to connect the 4 ports into a switch, each port on its own network, and then manually assign the 40 or so client workstations an IP address, ten per network?

For example:
I would hook up the switch to the four ports on the back of the server. I'd assign each port an IP address, something like:
192.168.10.100/24
192.168.20.100/24
192.168.30.100/24
192.168.40.100/24

Then for the first ten workstations I would give them a 192.168.10.xx IP address, then the next ten I would give them a 192.168.20.xx IP address, etc.
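On the server side I'm picturing something like this in PowerShell, just as a rough sketch (the "SAN1" through "SAN4" interface aliases are placeholders for whatever the four 10GigE ports are actually named, per Get-NetAdapter):

# Assign one /24 per server port; no default gateway is set on the storage ports
$ports = @{ 'SAN1' = '192.168.10.100'; 'SAN2' = '192.168.20.100';
            'SAN3' = '192.168.30.100'; 'SAN4' = '192.168.40.100' }
foreach ($alias in $ports.Keys) {
    New-NetIPAddress -InterfaceAlias $alias -IPAddress $ports[$alias] -PrefixLength 24
}

The clients would then just get a static address in whichever of the four /24s they're assigned to, e.g. 192.168.10.11/24 for the first group.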

After all the clients have been manually configured, I know they will be able to see and access the server port on their specific network without a VLAN. So all four networks will be on one switch, each client communicating with its own server port successfully, so why do I have to VLAN them onto physical or logical ports on the switch?

Is it for troubleshooting? And network segmentation? But wouldn't the logical (IP) networks segment the broadcast traffic on their own, or would it all be jumbled up? Do I really have to create four VLANs if I want to manually load balance my server on my production network?

Best,
Myth
 

cesmith9999

Well-Known Member
Mar 26, 2013
1,417
468
83
Are you running a Windows-based server? And which SAN software?
Are the clients Windows (10?) or Linux?

If you are running current Windows, I would not manually load balance the clients to ports on the server. That means you lose any failover functionality.

Do you have a router in your setup? If so, I would have a separate VLAN for all the clients and 4 for the server. Then SMB Multichannel could allow you to use all of the bandwidth on all of the ports from all of the clients automatically.
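One way to sanity-check that Multichannel is really spreading the load is to look at it from a Windows client while a big copy from the server is running; this is just a quick sketch, nothing specific to your setup assumed:

# Run on a Windows client during a transfer
Get-SmbConnection                 # confirms the SMB dialect (needs 3.x for Multichannel)
Get-SmbClientNetworkInterface     # interfaces SMB considers usable (speed, RSS, RDMA)
Get-SmbMultichannelConnection     # one entry per active network path to the server

If Multichannel is working, the last one should show multiple entries, one per NIC/path.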

Chris
 

Myth

Member
Feb 27, 2018
148
7
18
Los Angeles
We don't have a router. Windows Server 2016, with Windows and Mac clients.

Also, we need incredible speed, so we might sacrifice redundancy, since each workstation needs to play back 4K media without dropping a single frame.
 

Myth

Member
Feb 27, 2018
148
7
18
Los Angeles
Are you running a Windows-based server? And which SAN software?
Are the clients Windows (10?) or Linux?

If you are running current Windows, I would not manually load balance the clients to ports on the server. That means you lose any failover functionality.

Do you have a router in your setup? If so, I would have a separate VLAN for all the clients and 4 for the server. Then SMB Multichannel could allow you to use all of the bandwidth on all of the ports from all of the clients automatically.

Chris
Also, are you sure that SMB Direct will work over different subnets? I know the connectivity will work, because if I set the default gateway to the switch it will see the other three subnets, but I read before that SMB only looks for multiple paths on the same subnet/network. Do you know for sure if this configuration will work?

I think it's pretty exciting to have four different networks on the server, one off each port, then plug it into a Cisco switch and VLAN three of those ports into their own VLANs. The fourth port will be on the same network as the other 40 clients. So basically everyone will go through the port on the same subnet unless they're using SMB Direct... is that correct? Or if that port on the server goes down, will one of the other ports handle the traffic? How would that work?
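If I did need to steer particular clients to a particular server port, I'm guessing something like an SMB Multichannel constraint on each client could do it; the server name and interface alias below are only placeholders:

# Run on a client: only use this local interface when talking to this server
New-SmbMultichannelConstraint -ServerName "MediaSAN" -InterfaceAlias "10GbE-A"

# Review or undo the pin later
Get-SmbMultichannelConstraint
Remove-SmbMultichannelConstraint -ServerName "MediaSAN" -InterfaceAlias "10GbE-A"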

Thanks for helping, it's very interesting!
 

cesmith9999

Well-Known Member
Mar 26, 2013
1,417
468
83
SMB Multichannel should work on the same subnet. However, I have never tried that configuration.

Chris
 

kapone

Well-Known Member
May 23, 2015
1,095
642
113
I'm completely confused. Let's start with the basics.

1. What network speed does each client need?
2. How many clients are there in total?
3. How many will be "in use" simultaneously?

The server connectivity should be sized to the client load.
 

Myth

Member
Feb 27, 2018
148
7
18
Los Angeles
I'm completely confused. Let's start with the basics.

1. What network speed does each client need?
2. How many clients are there in total?
3. How many will be "in use" simultaneously?

The server connectivity should be sized to the client load.
26 clients; each client needs 400MBps, and I would say at least 10-20 will be in use at any given time. The server has 8,000MBps to dish out. I guess I could do 2x 40GigE LACP via the IP address algorithm in Windows Server 2016. We have to configure the NIC team via PowerShell, as the GUI only has the IP hash algorithm, which seems to hang up the server when streaming a lot of media.

I just don't trust LACP after it caused this "host unmanageable" error. It's really odd: under Server Manager > Local Server > NIC Teaming it says host unmanageable, but it's just a visual glitch. If I go to All Servers in Server Manager and right-click on my server, I can get to NIC Teaming and access the team again. But the first time it "broke", everyone went down.

I was just trying to get more bandwidth for the clients while providing a stable connection. While LACP is redundant, it seems to create network bottlenecks at some of our bigger clients. I'm hoping this is because the previous SAN administrator was using the wrong algorithm with this NIC teaming config. He was using the IP address hash method with LACP, whereas I will try just the IP address algorithm with LACP. I'm thinking the "hash" part was too much for WMIC on the Windows server and it crashed.
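The PowerShell side of that is roughly this, as far as I understand it (the team and NIC names are just examples):

# LACP team using the IPAddresses algorithm instead of the GUI's address-hash mode
New-NetLbfoTeam -Name "SANTeam" -TeamMembers "NIC1","NIC2","NIC3","NIC4" `
                -TeamingMode Lacp -LoadBalancingAlgorithm IPAddresses

# Check team and member state afterwards
Get-NetLbfoTeam
Get-NetLbfoTeamMember

Dynamic is the other algorithm people usually suggest, so that's the fallback if IPAddresses doesn't behave.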

LACP teaming also seems to act very strangely when we have another server on the same network. With two of our SAN servers on the same network, they can see each other through the NIC team, but they should not be communicating with each other, only responding to clients. We have our own proprietary TCP protocol, so I'm guessing it doesn't do well with two servers on the same network via NIC teaming. However, if I remove the team, everything works fine and speeds are even a bit improved; it's just not redundant.

:(
 

kapone

Well-Known Member
May 23, 2015
1,095
642
113
26 clients; each client needs 400MBps, and I would say at least 10-20 will be in use at any given time. The server has 8,000MBps to dish out. <snip>
400MBps is ~4gbps. Is that correct and that's what each client needs?

26 clients in total with max 20 online at any given time = 20x4gbps = 80gbps required on the server?

Sound about right?

My next question is why does each client need 4gbps? I understand your need for 4K video, but 4k requires a fraction of even gigabit, so why 4gbps on the client?
 

Myth

Member
Feb 27, 2018
148
7
18
Los Angeles
400MBps is ~4gbps. Is that correct and that's what each client needs?

26 clients in total with max 20 online at any given time = 20x4gbps = 80gbps required on the server?

Sound about right?

My next question is why does each client need 4gbps? I understand your need for 4K video, but 4k requires a fraction of even gigabit, so why 4gbps on the client?
The clients on our production SANs edit the color data on RAW high-def media; they call it online editing. It pulls around 230MBps, depending on the codec used. I doubt they will pull that much. It looks like they are only pulling around 700MBps total right now, with 20 of them connected via 1GigE. But they will start working on 4K projects.

So we are going to sell them an SSD server and a 10GigE switch. I want to give them all a lot of bandwidth.
 

kapone

Well-Known Member
May 23, 2015
1,095
642
113
The clients on our production SANs edit the color data on RAW high-def media; they call it online editing. It pulls around 230MBps, depending on the codec used. I doubt they will pull that much. It looks like they are only pulling around 700MBps total right now, with 20 of them connected via 1GigE. But they will start working on 4K projects.

So we are going to sell them an SSD server and a 10GigE switch. I want to give them all a lot of bandwidth.
If you're working with uncompressed 4K...:)

I'd play it safe and upgrade your networking like you said. 10gb to the clients, but on the server...I'd do away with LACP altogether and move to a 100gb backbone. It's not THAT expensive.
 

Myth

Member
Feb 27, 2018
148
7
18
Los Angeles
If you're working with uncompressed 4K...:)

I'd play it safe and upgrade your networking like you said. 10gb to the clients, but on the server...I'd do away with LACP altogether and move to a 100gb backbone. It's not THAT expensive.
Yeah, it kind of is... the clients won't go for that, I already know. But yeah, that would be the best option. Another really great option would be if our developers used SMB Multichannel instead of LACP.

Then if a 10GigE client was stuttering during playback, we could just hook up another 10GigE port to the client workstation on the same subnet, and it would automatically use RDMA and pull the data across two different ports. But alas, our developers use a proprietary TCP protocol; I think SMB might be better because of multichannelling. I know it's faster to back up to a direct-connected server via 2x 40GigE because it gets 61,000MBps, but single 40GigE or LACP only gets 33-40MBps.
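I figure the way to confirm that on a client would be something like this, just as a rough check (nothing here is specific to our setup):

# Is the client NIC actually RDMA-capable and enabled?
Get-NetAdapterRdma

# Is Multichannel turned on, and are both paths to the server in use?
Get-SmbClientConfiguration | Select-Object EnableMultiChannel
Get-SmbMultichannelConnection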

It's also possible to manually load balance on the switch, one 40GigE for 13 clients and the other 40GigE for the other 13 clients. But we would have to have two VLANs on the switch and it'd be a headache.

I also think it's an interesting idea to VLAN four ports on the switch, one for each 40GigE going to the server, then one big VLAN for all 26+ clients. Add a default gateway on the layer 3 switch and run most of the traffic that way, with manual load balancing across the switch going through the layer 3 gateway. It would put a huge load on the switch, though, so I'm not sure it could handle it.

It sucks that our software has to have each port on a different subnet, or use LACP, which I don't like at all. I wonder if I can contact Microsoft to help me with this LACP bullshit.
 

arglebargle

H̸̖̅ȩ̸̐l̷̦͋l̴̰̈ỏ̶̱ ̸̢͋W̵͖̌ò̴͚r̴͇̀l̵̼͗d̷͕̈
Jul 15, 2018
657
244
43
Out of curiosity, how are the existing server NICs cooled? You might check their temps during operation sometime; I found that adding a small fan to one of my Mellanox 40GbE cards fixed problems I was having with instability under load. Oddly, I have another identical card sitting in a Linux machine that's stable up to ridiculous temperatures; I only had temperature-related instability problems on Windows.
 

arglebargle

H̸̖̅ȩ̸̐l̷̦͋l̴̰̈ỏ̶̱ ̸̢͋W̵͖̌ò̴͚r̴͇̀l̵̼͗d̷͕̈
Jul 15, 2018
657
244
43
In a server chassis + high static pressure fans or a "normal" case?
In a "normal" mid-tower case. The other machine is built in a small Fractal Design Node 304 case and the identical network card gets less airflow but is stable about 15C out of operating range.