Switchless 10GbE Point-to-Point Connection between ESXi Servers [how?]


svtkobra7

Active Member
Jan 2, 2017
362
87
28
I bought 2 servers with X9DRH-7TF motherboards and each server has 2 x 10GbE RJ45 NICs. Currently those servers and all other clients are connected to a 1 Gbps switch (Cisco SG-300).

I'm not sure what the term for a point-to-point, switchless connection between the two servers over 10GbE is, but I'd like to see if it is possible and, if so, how to configure it. Unfortunately, my Google-fu has let me down.

One caveat that may make this more difficult: (a) pfSense sits on Server A and uses GLAN1 as WAN (direct to ISP) and GLAN2 as LAN (to switch), (b) so in order to allow that direct 10GbE link between the two, maybe I have to use the router-on-a-stick model, i.e. GLAN1 for both WAN and LAN? Also, I should mention both ESXi servers run the free version, but if this needs a paid VMUG subscription (vCenter Server) to accomplish, that won't be an issue.

Also, as an added benefit, while it will require physically moving the servers, it will allow me to test the Cat5e drop between the server closet and my workstation in my office to see if anything better than 1 Gbps is achievable prior to buying a NIC for that workstation, a 10GbE switch, etc. Unfortunately, living in a condo with concrete overhead, a rewire with 10GbE-compliant cabling wouldn't be impossible, but darn close, and way more effort than I care to put in (and I already dismantled a closet to create the server closet).

And now, I introduce the world's worst network diagram ...

Thanks in advance.

 

Rand__

Well-Known Member
Mar 6, 2014
6,626
1,767
113
What's your goal for the 10GbE network?
Quicker vMotion (i.e. ESXi interconnectivity) or VM-to-VM communication (or both)?

It should be doable by having a separate network (vSwitch, own subnet and/or VLAN) for the 10G connection, then adding appropriate vmk's with the right traffic type to it, plus secondary interfaces on the VMs (if they need direct outward traffic). Otherwise route everything via pfSense (which becomes the default gateway for the 10GbE subnets and of course needs one or more vNICs too, one per new subnet).
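As a rough sketch of that from the ESXi shell (the vSwitch10G / Storage10G names, the vmnic2 uplink, vmk1 and the 10.2.0.0/16 addressing below are just placeholder assumptions, not anything specific to your boxes), it could look something like this on each host:

Code:
# new standard vSwitch for the point-to-point link, with the 10GbE port as its only uplink
esxcli network vswitch standard add --vswitch-name=vSwitch10G
esxcli network vswitch standard uplink add --vswitch-name=vSwitch10G --uplink-name=vmnic2

# port group for the VMkernel traffic on that link
esxcli network vswitch standard portgroup add --vswitch-name=vSwitch10G --portgroup-name=Storage10G

# VMkernel interface with a static IP in the new subnet (use 10.2.0.2 on the second host)
esxcli network ip interface add --interface-name=vmk1 --portgroup-name=Storage10G
esxcli network ip interface ipv4 set --interface-name=vmk1 --ipv4=10.2.0.1 --netmask=255.255.0.0 --type=static

VM-to-VM traffic over the same link is then just a second port group on that vSwitch plus an extra vmxnet3 NIC per VM.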

I think I referenced Example VMware vNetworking Design w/ 2 x 10GB NICs (IP based or FC/FCoE Storage) (and others on that page) when I built my 10G setup a few years ago. It references dvSwitches (so VMUG/vCenter), but maybe you can adapt it.
 

svtkobra7

Active Member
Jan 2, 2017
362
87
28
What's your goal for the 10GbE network?
  • Maybe to take a step back: the goal in moving from one server to two was to adhere to a 3-2-1 backup strategy and shoot snapshots via nightly ZFS send/receive (a sketch of that follows this list). I wanted a second on-site backup so I could blow up a server fooling around with it but not lose data, etc.
  • But then I realized I had built a heck of a secondary server, so why not pursue something with more value and learn something new in the process?
  • Still learning ... and feel free to poke at me (of course), but I'd like to pursue building a cluster with HA and DRS (which definitely requires more than VMware's free offering).
  • So ... that being said ... can the goal be to support the above evolving / dynamic objective?
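As a minimal sketch of that nightly replication over the direct link (pool/dataset names, snapshot labels and the 10.2.0.2 address are made-up placeholders, not my actual config):

Code:
# take tonight's snapshot on the primary box
zfs snapshot tank/data@nightly-20180102
# send only the delta since last night's snapshot to the backup box over the 10GbE link
zfs send -i tank/data@nightly-20180101 tank/data@nightly-20180102 | \
    ssh root@10.2.0.2 zfs receive -F backup/data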
Realizing that pfSense has built-in HA is what started me thinking down this path, that and the fiancée complaining when Plex is down (I swear there are some people out there who are really, really addicted to Plex). If I hear the word uptime referenced again, I'm taking the ring back so I can put in a rack of servers (kidding, but that does sound enticing). I digress ...

While a switch would be a "nice to have" (or it may be requisite to accomplish that objective), I feel like it is a bit of a waste as there are only two 10GbE-capable machines at the moment. Potentially, and maximally, three, if the Cat5e run to my office is short enough to support something better than 1 Gbps. Does my point there sorta make sense?

While I'm pretty bad with IT generally, I happen to be worse with networking, and every time I tried to set up VLANs previously on that simple Cisco SG-300, I ended up locking myself out of the switch. I'm laughing at myself for my ineptitude, so please feel free to join. ;)
 

Rand__

Well-Known Member
Mar 6, 2014
6,626
1,767
113
For HA you will need shared storage, which either means ZFS/NFS shared storage (a single point of failure) or vSAN (2+1 hosts). Of course you can also do StarWind (2 hosts) if you prefer Windows and/or only have 2 hosts.

Funny how everyone starts with the same small goals and (potentially) ends up with much too much hardware in the end ;)

You could replace that 300 with a 350X, which would give you some 10G ports - how many boxes/ports do you have altogether?
 

svtkobra7

Active Member
Jan 2, 2017
362
87
28
For HA you will need shared storage, which either means ZFS/NFS shared storage (a single point of failure) or vSAN (2+1 hosts). Of course you can also do StarWind (2 hosts) if you prefer Windows and/or only have 2 hosts.
  • So it sounds like I need another server now to eliminate ZFS shared storage being a SPOF?
Funny how everyone starts with the same small goals and (potentially) ends up with much too much hardware in the end ;)
  • LOL indeed :):):)
You could replace that 300 with a 350X, which would give you some 10G ports - how many boxes/ports do you have altogether?
I had a 350Z a long time ago, I shouldn't have sold it ... wait, a 350X must be a switch!
  • Server A - SC826 (2 x 10GbE + IPMI)
  • Server B - SC826 (2 x 10GbE + IPMI)
  • Workstation - if it will do 10GbE (currently 2 ports, only 1 needed)
  • Other - not all used - 10 ports
Total: 4 x 10GbE + 12 x 1GbE *

* IPMI is 100 Mbps, but counts as 1 GbE for purposes here.
 

Rand__

Well-Known Member
Mar 6, 2014
6,626
1,767
113
Well, I wanted to wait a bit until I mentioned that 4 would be safer than 3, but now that you brought it up ... ;)

Basically it depends on how much you fiddle with your primary box and how often you need the backup. Personally I fiddle way too much, so I need at least 3 hosts and am looking to go to 4 (to have a spare for when I break one).
If you keep your fingers off the running system (and use the secondary instead), then 2 are usually fine. That might not mean 99.99% uptime, but the question is whether you really need that.

So 4 x 10GbE is usually the limit you can get on a combo switch (24 x 1GbE + 2-4 x 10GbE); beyond that you need a dedicated 10G switch (e.g. Netgear XS708E).

But back to the original question - without spending money you can of course connect the two servers directly, put a new subnet on that link, add a vmxnet3 adapter to the ZFS boxes within that subnet, and you should be able to send at 10GbE (maybe add one to pfSense too so you can have a default gateway).
You can do shared storage over that as well, but then it's a SPOF. You can use it if you are disciplined (e.g. use the server that only consumes storage as the primary box, so fiddling does not impact storage).

Everything beyond that (vMotion etc.) might need additional components (vCenter, true shared storage).
 

svtkobra7

Active Member
Jan 2, 2017
362
87
28
Well, I wanted to wait a bit until I mentioned that 4 would be safer than 3, but now that you brought it up ... ;)
  • LMAO.
  • Remember I'm mounting vertically ... so I can only "stack" towards the closet door ... and I'm at 4U (2 x 826) plus a separate 2U vertical rack for the switch, mounted long side on the Y axis. And I could only move to 6U vertical racks instead of the 4U, as I've never seen anything larger.
  • Maybe I'm thinking about it the wrong way, and I should sell the condo and move to the datacenter, as in me, not the servers?
Basically it depends on how much you fiddle with your primary box and how often you need the backup. Personally I fiddle way too much, so I need at least 3 hosts and am looking to go to 4 (to have a spare for when I break one).
  • Yes, I fiddle entirely too much too.
  • Maybe I stick to the original plan for now ... storage backup, but layer in pfSense HA and add manual Plex HA, i.e. Plex-A and Plex-B, with my eye on eventually moving to a real HA environment?
If you keep your fingers off the running system (and use the secondary instead), then 2 are usually fine. That might not mean 99.99% uptime, but the question is whether you really need that.
  • I'm good with 50% uptime, as long as the other 50% is me fiddling.
  • The Plex addict needs five 9s availability ... I keep asking for the SLA I signed, but she can't produce it.
So 4 x 10GbE is usually the limit you can get on a combo switch (24 x 1GbE + 2-4 x 10GbE); beyond that you need a dedicated 10G switch (e.g. Netgear XS708E).
  • Good to know info on limit.
  • Problem is, if I buy for today, guess what tomorrow's post will be: "can anyone recommend a non-combo switch?"
But back to the original question - without spending money you can of course connect the two servers directly, put a new subnet on that link, add a vmxnet3 adapter to the ZFS boxes within that subnet, and you should be able to send at 10GbE (maybe add one to pfSense too so you can have a default gateway).
You can do shared storage over that as well, but then it's a SPOF. You can use it if you are disciplined (e.g. use the server that only consumes storage as the primary box, so fiddling does not impact storage).
  • So I just need a storage VMkernel port (static IP in, say, 10.2.0.0/16), a vSwitch, and port groups on both boxes;
  • I physically connect GLAN2 <=> GLAN2 with a normal Cat6 cable;
  • And they can talk to each other?
Everything beyond that (vMotion etc.) might need additional components (vCenter, true shared storage).
  • I'm fine with vCenter Server and I'm just looking for a reason to justify.
  • I'm fine with shared storage being SPOF, but if I'm not fiddling is it still a SPOF?
  • I think the real constraint is just space ...
 

Rand__

Well-Known Member
Mar 6, 2014
6,626
1,767
113
  • So I just need a storage VMkernel port (static IP in, say, 10.2.0.0/16), a vSwitch, and port groups on both boxes;
  • I physically connect GLAN2 <=> GLAN2 with a normal Cat6 cable;
  • And they can talk to each other?
I think NFS does not need a storage VMkernel/traffic type. As long as you only add vNICs/vSwitches etc. and don't change default gateways, you should be free to play around. But yes, build the physical connection, build the logical connection, and have fun.
That aside, I think I have vmk's on all secondary VLANs since it makes things easier for setting up a default gateway and testing (vmkping) [and of course I used different subnets for different traffic types, although they all are on the same HCA].
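For example, once the vmk's exist on both hosts, a quick sanity check from the ESXi shell (still assuming the placeholder vmk1 / 10.2.0.x values) would be:

Code:
# force the ping out of the 10GbE vmk specifically
vmkping -I vmk1 10.2.0.2
# and, if you set MTU 9000 on both ends, verify jumbo frames with non-fragmenting packets
vmkping -I vmk1 -d -s 8972 10.2.0.2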

  • I'm fine with vCenter Server and I'm just looking for a reason to justify.
  • I'm fine with shared storage being SPOF, but if I'm not fiddling is it still a SPOF?
  • I think the real constraint is just space ...
How about this then? ;) FatTwin Solutions | Twin solutions | Products - Super Micro Computer, Inc.
And of course it's still a SPOF if you are not fiddling, just the likelihood of an issue is reduced. But unless you run a dual setup on basically everything (power, network etc.), you always have some kind of SPOF; you only need to cover the ones with a high failure probability (or fiddling rate).

Netgear 8-port 10GbE switch (older tech but works OK)
Netgear Prosafe XS708E-100NES 8-port 10Gbit Ethernet Switch | eBay
€300 + P&P used, VAT deductible, 30-day DOA warranty ... ;)
 

ipkpjersi

New Member
Sep 14, 2018
9
4
3
It should be possible to do a 10GbE point-to-point connection. I connected one of my desktops to an ESXi server via a 10GbE point-to-point connection back when I used ESXi (I use Proxmox now).
 

dswartz

Active Member
Jul 14, 2011
610
79
28
Works fine. I've done this to connect a vSphere host with a ZFS storage appliance.
 

svtkobra7

Active Member
Jan 2, 2017
362
87
28
I think NFS does not need a storage VMkernel/traffic type. As long as you only add vNICs/vSwitches etc. and don't change default gateways, you should be free to play around. But yes, build the physical connection, build the logical connection, and have fun.
That aside, I think I have vmk's on all secondary VLANs since it makes things easier for setting up a default gateway and testing (vmkping) [and of course I used different subnets for different traffic types, although they all are on the same HCA].
  • Thanks for the info. I hit a bump in the road which slowed me down, but I couldn't help myself over the weekend ... I finally signed up for VMUG ... only to find out I don't get immediate access ...
  • ... But guess what, I just did ...
  • It looks like 6.7 is not available yet? And what does this mean "You will be able to place an order for this product again in 12 months once you renew your VMUG subscription."?
How about this then? ;) FatTwin Solutions | Twin solutions | Products - Super Micro Computer, Inc.
And of course it's still a SPOF if you are not fiddling, just the likelihood of an issue is reduced. But unless you run a dual setup on basically everything (power, network etc.), you always have some kind of SPOF; you only need to cover the ones with a high failure probability (or fiddling rate).
  • I like the FatTwins - I gave them some consideration, but didn't think they would work well with ZFS for me, i.e. HDD backplane per node, if the node goes down so does the disk access.
  • I got triple WAN + solar & a diesel generator on the balcony = good there! ;) [joking obv]
Netgear 8-port 10GbE switch (older tech but works OK)
Netgear Prosafe XS708E-100NES 8-port 10Gbit Ethernet Switch | eBay
€300 + P&P used, VAT deductible, 30-day DOA warranty ... ;)
  • Thanks!
 

Rand__

Well-Known Member
Mar 6, 2014
6,626
1,767
113
I tend to get new versions via trial from VMware directly if they are not available on VMUG. Usually the VMUG keys work nevertheless.
But I think it should be available? Need to check.

The 'available again in 12 months' refers to the key: you can only get a key every 12 months (and of course it expires after 12 months).

Fat Twin and external JBOD?;)
 

svtkobra7

Active Member
Jan 2, 2017
362
87
28
@Rand__ : I should have thought about a FatTwin with extra junk in the trunk. It's funny: when you are trying to decide on a path forward for yourself it can be tough, yet I actually recommended a similar solution to someone else over the weekend (and god help whomever I'm giving advice to) without thinking of it for myself.

I don't see it ...
 

svtkobra7

Active Member
Jan 2, 2017
362
87
28
Unfortunately I've been following that thread for a bit ...

That passthrough map workaround doesn't seem to work for me. As to why? Absolutely no clue, but perhaps because I have another NVMe drive connected and/or passed through to FreeNAS. But I also have no clue why that workaround "works" anyway.

Are you referring to this post => https://forums.servethehome.com/ind...penindiana-2017-10-no-luck.17560/#post-169119

I'm actually booting ESXi off NVMe at the moment ... chuckles ...
 

Rand__

Well-Known Member
Mar 6, 2014
6,626
1,767
113
That thread, or the original one where he provided his initial performance test results with proof that using a chunk of Optane works as well as a whole disk, as opposed to previous NVMe drives (e.g. the P3700), which would lose a significant part of their performance with this split setup.

And have you tried to find out your actual device ID and use that? Maybe you don't have an identical drive type (PCIe vs. U.2 maybe).
How to get Device Identifiers for I/O Devices - VirtualKenneth's Blog - hqVirtual | hire quality
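For what it's worth, the passthrough-map workaround mentioned earlier is normally just a line in /etc/vmware/passthru.map keyed on that vendor:device ID. A hedged sketch using the 8086:2700 ID from the vmkchdev output below (the d3d0 reset method and the fptShareable column are the commonly suggested values, not something verified in this thread):

Code:
# /etc/vmware/passthru.map
# vendor-id  device-id  reset-method  fptShareable
8086  2700  d3d0  default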
 

svtkobra7

Active Member
Jan 2, 2017
362
87
28
That thread, or the original one where he provided his initial performance test results with proof that using a chunk of Optane works as well as a whole disk, as opposed to previous NVMe drives (e.g. the P3700), which would lose a significant part of their performance with this split setup.
  • Gotcha ... I misunderstood where you were going with the "chunking" (as in memory management, but that didn't really make sense anyway)
  • P-what ... bah, old tech, I've forgotten about it already (who am I impersonating? ;)).
  • 10-4 on your point.
And have you tried to find out your actual device ID and use that? Maybe you don't have an identical drive type (PCIe vs. U.2 maybe).
How to get Device Identifiers for I/O Devices - VirtualKenneth's Blog - hqVirtual | hire quality
  • That was one of the earliest thoughts I had ... I checked it (see code tags below), but oddly the command at that link didn't work.
  • I will tell you what I did get to work ...
Code:
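# vmkfstools -z maps the raw NVMe device as a physical-compatibility-mode RDM pointer
# (the disk name and datastore path below are placeholders, as posted)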
vmkfstools -z /vmfs/devices/disks/t10.NVMe____INTEL_LONG DISK NAME "/vmfs/volumes/NUMBERS & LETTERS/FreeNAS/INTL-900p.vmdk"
  • ... but I know RDM is typically frowned upon, eh?
  • Anyway, I assigned an ESXi NVMe disk controller to it, and it appears to be stable in FreeNAS (as in it has been executing dd for a while without issue).
  • Hey, it seems to work ... but if I lose a pool, no big deal ... another box right there ...
Code:
[root@ESXi:~] vmkchdev -l | grep vmhba2
0000:83:00.0 8086:2700 8086:3900 vmkernel vmhba2
[root@ESXi:~] vmkchdev -l | grep vmhba4
0000:81:00.0 8086:2700 8086:3900 vmkernel vmhba4
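# columns: PCI address, vendor:device ID, subsystem vendor:device ID, owner, device alias
# so 8086:2700 is the vendor:device pair that a passthru.map entry would key on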