Arista 7050QX - Cut-through - Store and Forward - iSCSI


Stril

Member
Sep 26, 2017
Hi!

I need some new switches, exclusively for iSCSI traffic.

Arista seems to support cut-through, but I am not sure how important this is:

- Do you think I would see better iSCSI performance in cut-through mode because of its lower latency?

- Cut-through seems to be available only when switching from 10GbE to 10GbE or from 40GbE to 40GbE, but what happens when the path is 10GbE - LAG of 4x 10GbE - 10GbE?

Thank you for your help!!
Stril
 

aero

Active Member
Apr 27, 2016
Traffic over a LAG is still cut-through.

Traffic from 10G to 40G can never be cut-through on any switching platform... The entire frame must be received before sending, or an underrun would occur.

40G to 10G obviously has to buffer, but it does cut through.

As others have recommended, InfiniBand will be lower latency, but with obvious operational tradeoffs.

Cut-through helps a lot with latency. I'd personally go with those Arista switches and not InfiniBand.
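
To put rough numbers on the underrun point (using a 9216-byte jumbo frame purely as an illustration):

\[
t_{\text{rx},\,10\text{G}} = \frac{9216 \times 8\ \text{bit}}{10 \times 10^{9}\ \text{bit/s}} \approx 7.4\ \mu\text{s},
\qquad
t_{\text{tx},\,40\text{G}} = \frac{9216 \times 8\ \text{bit}}{40 \times 10^{9}\ \text{bit/s}} \approx 1.8\ \mu\text{s}
\]

A 40G egress would finish serializing the frame long before a 10G ingress had delivered the rest of it, so the switch has no choice but to wait for the whole frame.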
 

Tiberizzle

New Member
Mar 23, 2017
The big deal with cut-through in storage applications is that you should be using jumbo frames in pretty much any storage application. Even if your hardware is capable of receiving at line rate, larger frames significantly reduce the CPU load of processing that flow. That said, it is virtually assured that no matter how nice your hardware is, it is going to struggle to cope with single-queue flows approaching 40Gbps in the traditional "kernel network driver, user-mode application" style, due to the enormous context-switching overhead involved, and anything you can do to reduce that (such as using larger frames) will both improve realized performance and reduce CPU load.

If you are using jumbo frames with a store-and-forward switch, that means waiting for a full 9000+ byte frame, or about 7.36 microseconds best case at 10Gbps.

Cut-through switches of this class begin to forward at worst in under a microsecond, usually on the order of 200 nanoseconds, and in some (very expensive) cases even faster than that.

In a local network at these speeds, forwarding and serialization delays are rather significant; anything above 2.5 microseconds for a 40Gbps link would be considered a "long, fat network" in bandwidth-delay-product terms. So even the storage traffic will see some gains in efficiency due to the reduced significance of buffering, but in "converged" networks with workloads that are particularly latency sensitive it can matter even more.
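
For what it's worth, the 2.5 microsecond figure is just the classic ~10^5-bit bandwidth-delay-product threshold (the old RFC 1072 "long, fat pipe" rule of thumb) applied to 40Gbps:

\[
40 \times 10^{9}\ \tfrac{\text{bit}}{\text{s}} \times 2.5\ \mu\text{s} = 10^{5}\ \text{bit} = 12{,}500\ \text{byte}
\]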

Cut-through will only be utilized between ports of the same speed, as aero notes, but as you suspect you can use 4x10 breakout mode and LACP to get an aggregate multi-stream bandwidth of 40Gbps with cut-through. You will need to use a storage protocol that makes use of multiple flows, or have a diverse multi-user access pattern, to properly utilize a 4x 10G breakout in this way.
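
As a very rough sketch of the switch side (interface and VLAN numbers here are placeholders; your breakout numbering will differ), the LACP bundle is just an ordinary port-channel across the four 10G members:

Code:
interface Ethernet5/1-4
   description 4x10G breakout members toward iSCSI host
   channel-group 5 mode active
!
interface Port-Channel5
   description LACP bundle, 40G aggregate / 10G per flow
   switchport access vlan 100

Each individual flow still hashes onto a single 10G member, which is why the multi-flow requirement above matters.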

I would disagree very strongly with aero's suggestion to consider InfiniBand. I used InfiniBand for almost 10 years (DDR, QDR, then FDR), and over that period it was actually supported by a recent vanilla kernel for perhaps 12-16 months in total. The rest of the time I was either forced to remain downlevel and sacrifice other features, or forced to fall back to IPoIB, because Mellanox can't write a functional driver or bother to support any of the three or so applications that actually use RDMA to save their lives. I even briefly fell back to gigabit Ethernet because of a kernel bug which degraded IPoIB performance to <100KB/sec, which ran concurrent with the entire-3.x-series-kernel NFS/RDMA bugs.

I use 2x 40GbE to get FDR InfiniBand performance these days (on throughput; admittedly nothing really tries to compete seriously with RDMA over InfiniBand on latency) without overly complicating my stack, and I'm going to be slowly replacing my remaining Mellanox Ethernet NICs with literally anything else, because the DPDK and RoCE drivers have the same problems.
 

Stril

Member
Sep 26, 2017
Hi!

Thank you for your answer!!

In the past, I just did not see any better performance with cut-through than with store-and-forward, but your points are clear.

I will stop using the 40GbE-only ports of my Arista 7050QX and only use breakouts, so that I have cut-through on all connections.

What about MLAG? How do you use the ISL in these cases - 8x 10GbE instead of 2x 40GbE?
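
To make it concrete, what I have in mind for the peer-link is something like this (interface numbers and addresses are only placeholders, not my real config):

Code:
interface Ethernet1/1-4
   channel-group 100 mode active
interface Ethernet2/1-4
   channel-group 100 mode active
!
interface Port-Channel100
   description MLAG peer-link / ISL, 8x 10GbE
   switchport mode trunk
   switchport trunk group MLAGPEER
!
vlan 4094
   trunk group MLAGPEER
!
interface Vlan4094
   ip address 10.255.255.1/30
!
mlag configuration
   domain-id ISCSI
   local-interface Vlan4094
   peer-address 10.255.255.2
   peer-link Port-Channel100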
 

i386

Well-Known Member
Mar 18, 2016
Germany
In the past, I just did not see any better performance with cut-through than with store-and-forward, but your points are clear.
How did you benchmark the performance?
What hardware/software (SSDs, NICs, CPUs, OS, FS) did you use on the hosts?
 

Stril

Member
Sep 26, 2017
Hi!

I ran diskspd benchmarks on all-flash StarWind iSCSI systems, both behind store-and-forward switches and over direct links, and did not really see different values.
 

Stril

Member
Sep 26, 2017
I could not find a config parameter to configure "latency mode" (in software 4.18.5M).

The only thing I found was:
Code:
arista(config)#show switch forwarding-mode
Current switching mode:    cut through
Available switching modes: cut through, store and forward
--> But what does this mean? If this is "performance mode", cut-through is only active for 40GbE - right?

The link above shows:
  • Latency mode is equivalent to the command "switch forwarding-mode cut-through" and it is the default configuration of the switch
But in my case, the "right 8" 40GbE ports are active even though cut-through mode is enabled...
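
If I read the quote correctly, the only related knob is the global forwarding mode itself, which could be set explicitly like this (command name taken from the quote above; I have not verified whether it changes anything for the 40GbE ports):

Code:
arista(config)#switch forwarding-mode store-and-forward
arista(config)#switch forwarding-mode cut-through
arista(config)#show switch forwarding-mode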