Has anyone worked with NICs in switchdev mode, Open vSwitch offload, SR-IOV and vDPA?

NablaSquaredG

Layer 1 Magician
Aug 17, 2020
Hey folks,

I recently fell into the rabbit hole of NICs in switchdev mode, Open vSwitch offload, SR-IOV and vDPA, and I'd like to discuss whether my conclusions are correct, as the existing documentation is slim at best.

The TL;DR is:
Conventional VirtIO is slow, SR-IOV is fast
But simple SR-IOV (like in ConnectX-3) breaks stuff like virtual switches. This is bad.

To solve this, we get programmable NICs. This seems to be supported on many modern NICs (Mellanox ConnectX-4 and newer, Intel X710 / E810, Broadcom P2100, etc.), with varying feature sets.
As a baseline, we get an embedded switch inside the NIC that loops back VM-VM traffic on the same host, a so-called eswitch.
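To make the eswitch idea concrete, here is a toy model of how I picture the baseline behaviour: the NIC keeps a table of MAC addresses belonging to local VFs and loops frames back when the destination is local. This is only a sketch of my understanding, not how any particular NIC implements it; the class name, the port names and the MAC addresses are all made up for illustration.

```python
from dataclasses import dataclass, field

UPLINK = "uplink"  # hypothetical name for the physical port


@dataclass
class ToyESwitch:
    """Toy model of MAC-based forwarding in a NIC's embedded switch."""

    # Maps a local VF's MAC address to a port name, e.g. "vf0"
    mac_table: dict[str, str] = field(default_factory=dict)

    def attach_vf(self, mac: str, port: str) -> None:
        """Register a VF's MAC address on a local port."""
        self.mac_table[mac.lower()] = port

    def forward(self, dst_mac: str) -> str:
        """Return the egress port for a frame with this destination MAC."""
        # Known local MAC: loop the frame back to that VF inside the NIC.
        # Unknown MAC: send the frame out of the physical uplink.
        return self.mac_table.get(dst_mac.lower(), UPLINK)


esw = ToyESwitch()
esw.attach_vf("52:54:00:aa:00:01", "vf0")
esw.attach_vf("52:54:00:aa:00:02", "vf1")
print(esw.forward("52:54:00:AA:00:02"))  # vf1 (VM-to-VM, stays on the host)
print(esw.forward("52:54:00:bb:00:09"))  # uplink (leaves the host)
```

The point is just that VM-to-VM traffic on the same host never has to reach the external switch.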

More modern NICs go further: their eswitch becomes programmable.

Particularly advanced Open vSwitch offload seems to be supported on ConnectX-6 Dx, Intel X710 / E810, Broadcom P2100, etc.
The key to this appears to be that the NIC is switched from the default legacy mode into switchdev mode.
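For reference, this is roughly the command sequence I have seen for the mode switch, wrapped in a small Python helper that only builds the command strings and does not run anything. The PCI address, interface name and VF count are placeholders, and vendor-specific steps (for example unbinding VF drivers before changing the mode on some NICs) are left out, so treat it as a sketch rather than a recipe.

```python
def switchdev_commands(pci_addr: str, pf_netdev: str, num_vfs: int) -> list[str]:
    """Build (but do not execute) the shell commands for enabling switchdev.

    pci_addr, pf_netdev and num_vfs are placeholders for your own system.
    """
    if num_vfs < 1:
        raise ValueError("num_vfs must be at least 1")
    return [
        # Create the virtual functions through sysfs
        f"echo {num_vfs} > /sys/class/net/{pf_netdev}/device/sriov_numvfs",
        # Move the embedded switch from legacy to switchdev mode
        f"devlink dev eswitch set pci/{pci_addr} mode switchdev",
        # Ask Open vSwitch to offload flows to hardware
        "ovs-vsctl set Open_vSwitch . other_config:hw-offload=true",
    ]


# Example values only; substitute the address and name of your own PF.
for cmd in switchdev_commands("0000:03:00.0", "enp3s0f0", 4):
    print(cmd)
```

Open vSwitch typically has to be restarted after changing `hw-offload` before the setting takes effect.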

On Mellanox, there seem to be two variants: ASAP² Flex and ASAP² Direct.
ASAP² Flex is VirtIO with some hardware offload to the NIC eswitch (ConnectX-4 and newer), whereas ASAP² Direct is SR-IOV with full offload (ConnectX-5 and newer).

However, for most vendors this only seems to work with SR-IOV. That is bad, because it breaks live-migration (the target host might have a different NIC).

So until recently we had two options:
- VirtIO which is slow, but supports live migration
- SR-IOV which is fast, but does not support live migration (and its scalability is limited, which might be an issue if we want to make not only VM networking fast, but also container networking)


Mellanox thought: Hey, we can do better. What if we make VirtIO fast?

And then vDPA was born. vDPA = virtio Data Path Acceleration

Essentially, hardware-accelerated VirtIO.
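Put as a lookup table, my current mental model of the three options looks like the sketch below. The "fast" and "live migration" flags simply encode the claims from this post (with the vDPA row being the promise as I understand it, not something I have verified), and the names `Option` and `candidates` are made up for this example.

```python
from typing import NamedTuple


class Option(NamedTuple):
    name: str
    fast: bool            # near line-rate data path, per this post
    live_migration: bool  # VM can move to a host with a different NIC


# Flags reflect the summary in this post, not measurements.
OPTIONS = [
    Option("VirtIO", fast=False, live_migration=True),
    Option("SR-IOV", fast=True, live_migration=False),
    Option("vDPA", fast=True, live_migration=True),
]


def candidates(need_fast: bool, need_live_migration: bool) -> list[str]:
    """Return the option names that satisfy the stated requirements."""
    return [
        o.name
        for o in OPTIONS
        if (o.fast or not need_fast)
        and (o.live_migration or not need_live_migration)
    ]


print(candidates(need_fast=True, need_live_migration=True))   # ['vDPA']
print(candidates(need_fast=True, need_live_migration=False))  # ['SR-IOV', 'vDPA']
```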



And then there's still DPDK somewhere; I haven't quite understood where and when exactly it comes into play.



So, question for the experts: Is my summary / conclusion correct? Anything important I missed?
 