vSAN vs DRS for Micro PC cluster in homelab?

AveryFreeman

consummate homelabber
Mar 17, 2017
Near Seattle
averyfreeman.com
Hey, I'm not really sure what to do in this situation. I've got a handful of Dell 7050 Micro PCs that are very limited as far as connectivity. I'm running vCenter and would like to have Auto Deploy, vRealize/vROps and DPM going, plus a Tanzu TCE, RKE or OKD cluster.

I have 3 proper E5 servers with 8 to 16 3.5" 4TB SAS drives each, plus a swath of NVMe, mostly 2280 M.2 consumer drives, but also a U.2 3.2TB Micron Pro Max 9300, 3-4 800GB Intel S3500s (SATA) and a handful more SATA SSDs of various types (all Samsung or Intel). Basically all of it is pulls from old laptops and desktops, except the 3.2TB Pro Max (obviously) and a 480GB Samsung PM953 (963?) 22110 M.2 NVMe with PLP that I've been using for cache+swap in one host.

Wondering what people think would be best for storage on the Micro PCs:

Providing iSCSI via iSER, or NVMe-oF over RDMA, to the Micros with a 10GbE or 40GbE adapter, using the M.2 slot that's usually used for local NVMe with an x4 PCIe riser (so 10/40Gb NICs would be at half lanes, as they are usually x8, but I don't think it's that big of a deal)
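As a rough sanity check on the "half lanes" point, here is a back-of-the-envelope sketch. It assumes the M.2 slot is PCIe 3.0 x4 (not confirmed anywhere in this thread) and only accounts for 128b/130b line encoding, not protocol overhead. The function name is made up for illustration.

```python
# Rough PCIe link bandwidth estimate.
# Assumptions: PCIe 3.0 (8 GT/s per lane, 128b/130b encoding);
# TLP/protocol overhead is ignored, so real throughput will be lower.
GT_PER_LANE = 8.0
ENCODING = 128 / 130


def pcie3_gbps(lanes: int) -> float:
    """Approximate line rate in Gbit/s for a PCIe 3.0 link with `lanes` lanes."""
    return GT_PER_LANE * ENCODING * lanes


if __name__ == "__main__":
    link = pcie3_gbps(4)
    for nic_gbps in (10, 40):
        verdict = "fits" if link >= nic_gbps else "is capped by the slot"
        print(f"{nic_gbps}GbE on x4: link ~{link:.1f} Gbit/s, NIC {verdict}")
```

Under those assumptions a single 10GbE port fits comfortably in x4, while a 40GbE port would top out somewhere around 31.5 Gbit/s before any overhead.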

OR

Running local M.2 or SATA SSDs for storage on the Micros, with DRS for central management of storage resources + allocation

Pros for doing centralized storage: would work better for clustering technologies (potentially), migration of VMs or containers would be a lot faster, could leverage vSAN and container storage frameworks more easily

Cons for centralized storage: slower throughput for the Micro PC than a local x4 NVMe drive, harder to set up (iSCSI setup sucks), would require NICs and another switch, could be temperamental and difficult to recover in case of failure, more consequential single point(s) of failure

Pros for local storage: much faster local drive throughput once the VM or container gets migrated, and if something takes a shit, it's less likely that I won't be able to recover the data or that it'll take the whole system down

Cons for local storage: migration would be super slow with DRS, and vSAN is basically out of the question with only 1Gbps (I believe?). Any container cluster storage is basically out of the question too, I imagine (please correct me if I'm wrong...)
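To put a rough number on "super slow", here is a minimal sketch of how long it takes to push a VM's disk across the wire when it lives on local storage. The 100 GB disk size and the 90% link efficiency are made-up assumptions for illustration, not measurements.

```python
def transfer_seconds(size_gb: float, link_gbps: float, efficiency: float = 0.9) -> float:
    """Seconds to move `size_gb` gigabytes over a `link_gbps` Gbit/s link.

    `efficiency` is an assumed fraction of line rate actually achieved.
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be > 0 and efficiency in (0, 1]")
    # 8 bits per byte; divide by the effective link rate.
    return size_gb * 8 / (link_gbps * efficiency)


if __name__ == "__main__":
    disk_gb = 100  # hypothetical VM disk size
    for link in (1, 10, 40):
        minutes = transfer_seconds(disk_gb, link) / 60
        print(f"{disk_gb} GB over {link}GbE: ~{minutes:.1f} min")
```

With those assumptions, a 100 GB disk takes roughly 15 minutes at 1GbE versus about a minute and a half at 10GbE. With shared storage only the VM's memory has to move, so the gap matters mostly for the local-storage option.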

What would you do if you were me?
 

Sean Ho

seanho.com
Nov 19, 2019
Vancouver, BC
What workload do you plan to run on the 7050s? If it's something bandwidth-intensive, e.g., Ceph with 1 flash OSD per node, then you'd want the 10GbE anyway for cluster traffic. If the goal is something more like traditional k8s web apps with a well-sharded design minimizing cluster traffic, or if it's just for learning, you may be fine with GbE between the nodes. The choice of NIC in turn drives the choice of root drive -- if the nodes only have GbE, then iSCSI root definitely won't be a good time.

I tried diskless compute nodes (PXE boot, NFS root) even with 10GbE, but in the end small S3500s or DOMs were cheap enough that I settled on local root.