Harvester HCI, anyone working with it?


Greg_E

Active Member
Oct 10, 2024
389
136
43
After the difficulty of trying to get a Hyper-V cluster set up, I gave up, bought some gold for my hosts (a new SSD for the OS; the previous one wasn't big enough), and started down the road with Harvester. I have a couple of books that I'm going through, and I found some really good online guides as well. I think I got the first host up a few minutes ago; I'll work on the other two hosts tomorrow.

I was going to try Nutanix next, but then looked into Harvester a bit deeper. Free forever from SUSE, the ability to natively handle both containers and VMs, a truly open source system... Seems like time better spent on this than on Nutanix, and Hyper-V with clusters is just complex (or I'm not grasping something simple).

One thing I like right from the start: building a cluster is simple, done right in the installer at step 1: create a cluster or join a cluster.

Downside: it's best to have several network connections per host, and at least one fast connection for the HCI storage. It's also best to have 3 hosts, though apparently you can set it up on a single host to get experience with some aspects.

It does look like it will function on my little HP T740 (4c/8t), filled out with 64GB of RAM and a 1TB NVMe for storage (256GB for boot), a dual-port 10Gbps card, and an A+E-keyed 2.5Gbps card. It seems like it might allow using the built-in 1Gbps Realtek; I may try this to see how well it does or doesn't work. vSphere 8 was working fine on this same cluster until my license expired; I didn't gain enough experience to pass the exam, so no license renewal (thanks Broadcom :mad: ).
 

Greg_E

In about an hour you could have a small cluster built and ready to configure; that's not bad. I think this is less time than XCP-ng takes, though XCP would be quicker to finish depending on the configuration. So far I'm liking the interface, and now I'm ready to continue learning.

Harvester1.png

Harvester2.png

Two books I'm reading for those that might want to play along:

Harvester In Production: Enterprise HCI Operations And Management by Zinnia Harris (kindle version was fairly cheap)
Mastering Suse Harvester: The VMware Alternative by Cassian Smith (again kindle version)

Also to be read: Mastering Suse Rancher by Caleb Tanaka (again kindle version)

Mastering Suse Harvester has a good chapter on whether Harvester is the product you really should be running, comparing it to VMware and Nutanix; a refreshing point of view, because not everyone needs Harvester or wants the overhead of Kubernetes. All of the above books are fairly recent, which is important since Harvester is growing rapidly.

A website with some good tutorials is:
I'm slowly going through this series, but the install helped me get a few things "right" to get my cluster up quickly.
 

Greg_E

Looks like I need to set up a Rancher node to fully manage this. The Harvester control point seems kind of akin to the ESXi management page, but it extends across the entire cluster. Not all features are there, though; for some you may need to write a YAML file. I need to learn a lot more, as this is getting more complex. I might look into how Rancher behaves; it might be quicker to build a Kubernetes/Rancher cluster and deploy Harvester on top of it. Need to do more reading, much more reading.

Talking to a friend who is working on this for his job: eventually they want to move off of VMware, but right now he's just trying to get their containers up on Kubernetes with Rancher.
 

Greg_E

There is a not-suggested-for-enterprise way of installing Rancher in Docker; it can then maybe be migrated onto the Harvester cluster. Going to try it in Docker and see what the impact might be on my little hosts, but for my 3-host cluster I would hope it has very little impact.
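For anyone playing along, the single-node Docker install is documented by Rancher and boils down to one command (the image tag and published ports here are the documented defaults; adjust the host ports if 80/443 are already taken):

```shell
# Single-node Rancher in Docker. Rancher's docs flag this as
# suitable for testing/demos, not production.
docker run -d --name rancher \
  --restart=unless-stopped \
  -p 80:80 -p 443:443 \
  --privileged \
  rancher/rancher:latest

# Grab the bootstrap password for the first web login:
docker logs rancher 2>&1 | grep "Bootstrap Password:"
```

The `--privileged` flag is required because Rancher runs its own embedded K3s inside the container.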
 

Greg_E

I haven't had much time lately; the only thing I got done was to set up a single VM network on my 2.5G connections (at gigabit). Getting this and a higher-speed VM network set up is an important step before creating any VMs.

Looks like the computer I'm going to try to use for Rancher on Docker is going to be delayed; not sure how long it will take me to get it integrated now.
 

Greg_E

Still plodding along as time permits. Still also can't quite wrap my head around this:

Rancher requires a Kubernetes "cluster" to be installed, but Harvester is already running on a Kubernetes cluster. So why can't you just install Rancher directly on your Harvester cluster? Also, why did they kill RancherOS as an easy way to get Rancher Manager up to control your other stuff?

The new-to-me HP T740 is here; I vacuumed the dust out and installed Debian 13. That's all I had time to do, and it seems like a lot of work just to get access control working on my Harvester cluster. Wish I could find a cheap source for a few HP T755s; the extra 2 cores would be useful here.
 

Greg_E

Spent more time reading (finally); you should definitely start with the Mastering Suse Harvester book by Cassian Smith. It's based on Harvester 1.6, but so far everything has translated to 1.7. I'm only up to chapter 5, but I feel that for people like me, this is the book to start with (see below).

And really, mine is a very backwards approach! The order in which you really should learn this stuff is Kubernetes --> Rancher --> Harvester. Since Harvester runs on Kubernetes with KubeVirt (et al.), knowing namespaces and how Kubernetes functions would make this easier. Rancher is a fuller-function GUI manager that adds user management and other important aspects to a blended VM and container system. That said, the Harvester GUI hides this to the point that you can definitely set up a system for VMs and get real work in motion without knowing the underlying concepts. Not as simple to get running as XCP-ng, but not far from it at all, and it's hyper-converged open source, which can be important. Many VMware shops had vSAN running; tweak things to primarily SSD or NVMe and your old hosts are ready for the different method.

vSAN (traditional) used SSD for a short-term cache and allowed spinning storage for long-term, bulk storage; Longhorn (the storage layer in Harvester) does not seem to have this. Longhorn seems to want to run more like vSAN Express, which is the newer NVMe-oriented method from VMware. They do say that arrays of spinning disks might be fast enough, but the system generally likes JBOD; do some testing before deployment to see what your speed tolerance might be.
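For anyone curious, replica count and data locality in Longhorn are per-StorageClass settings; a minimal sketch, assuming kubectl access to the cluster (the class name is made up; the parameter names are from Longhorn's documentation):

```shell
# Define a hypothetical Longhorn StorageClass with 2 replicas and
# best-effort data locality (try to keep one replica on the node
# that runs the workload).
kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-2rep-local
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "2"
  dataLocality: "best-effort"
EOF
```

Fewer replicas and best-effort locality trade redundancy for write speed, which matters on small 3-node clusters like mine.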

And if you made it this far, throw me a thumbs up; that way I know a few people are at least enjoying watching me stumble around in a backwards fashion. Wish I had more time to study this topic, but it will have to happen when it can happen. If I were getting paid to figure this out, things would be very different.
 
  • Like
Reactions: Marsh and wifiholic

wifiholic

Member
Mar 27, 2018
59
59
18
39
I started using Harvester at home back around version 1.3(ish). I did take a detour into the world of XCP-ng, but have been contemplating switching back.

It's great to see other people sharing their appreciation for this unique hypervisor. Thanks for sharing your journey!
 

Greg_E

I'm going to be installing the preview XCP-ng version 9 as soon as I can get some time. I try to keep XCP running for those times I just need to get something done, I run it in production too.

[edit] March 2, 2026: this install failed with no network cards detected (Supermicro X520-based cards). XCP-ng 8.3 works fine as usual when I loaded it back on.
 

wifiholic

Greg_E said:
I'm going to be installing the preview XCP-ng version 9 as soon as I can get some time. I try to keep XCP running for those times I just need to get something done, I run it in production too.
My employer switched from VMware to XCP-ng, hence my current dogfooding of the latter at home. Mind you, they use it with a traditional SAN, while at home, lacking a SAN, I decided to be adventurous and try it with XOSTOR. Let's just say that that's not a choice I'd repeat. On the other hand, Harvester's HCI feels less fragile to me, but I digress.
 

Greg_E

I have not tried DRBD/Linbit as it's not available in XO from source yet. But on an NFS share, things are decent (hopefully better in version 9 with the new kernel). I will say that using an HP T740 with SATA for the OS and NVMe for storage, with an X520-DA2 and TrueNAS Scale, I got good NFS performance from that single drive in my lab. vSphere 8 worked fine over NFS; I never did get vSAN running on it before the license was gone. Hoping to get XCP v9 installed tomorrow with the NFS option nconnect=8 (this is what Microsoft suggests; ESXi defaulted to nconnect=4, and XCP v8.3 seems to use only a single connection no matter what you type in the option).

So far Harvester seems pretty happy with the single NVMe in each host. I need to get a VM loaded and run the disk speed checks that I've been using to see what happens, especially what happens on the other hosts as the test data gets written to all three. Been so busy there isn't much time to test. I'm kind of expecting the write portion to fall down hard since it is writing to all three disks at once. Read speeds should be near max since it is essentially a local disk; the T740 has a PCIe 3.0 x4 NVMe slot, and I have Gen4 drives installed that have shown to be really fast in local testing.

Time: I need more of it, and it seems more expensive than RAM right now.
 

Greg_E

Wrecking my domain controller to put in a new n100 powered mini-pc. That's as far as I've gotten. Been on the phone for over an hour sorting out a very expensive prescription, so I was getting some things done while on hold. Just talked to an AI assistant... Why are they so condescending?
 

Greg_E

Got a little work done on this: a Server 2022 VM set up. Having a problem with Rancher on Docker; I'll work on that later.

Did a speed test on the storage and have some comparisons. Looks like I need to check the installed driver and maybe find some tweaks; it's a lot slower than I expected.

Here is the same host running ESXi, saving to the local NVMe:
local-esxi.png

Now here is Harvester, again on the same hardware:
harvester_nvme.png

That's a pretty big difference between the two, both writing to the local NVMe. Only a few times did the management network (also the storage network) get close to 10Gbps.

For contrast, here is ESXi writing out to my NFS NAS:
NAS2-nfs-esxi.png

And here is XCP-ng on similar hardware writing out to NFS on the same NAS:
NAS2-nfs-xcp.png

Both NFS tests are truncated because it was clear that no increases would come from larger tests; the NAS was set to handle 2M as the largest. ESXi uses nconnect=4 by default; XCP-ng uses a single connection (for now).

And finally, here is a screen shot of the dashboard during these tests (thumbnail this time, so click on it):
Screenshot 2026-03-02 204201.png
 

Greg_E

I set Longhorn to use best-effort data locality on the local disk to see if performance would increase; reads went down overall.

I think I may wipe all this and try enabling the Longhorn V2 data engine with the NVMe-specific settings to see what happens. It takes more RAM and another core, and using more cores on these little hosts can be an issue. Need to read up on it more and see what the implications are going to be. There's a big warning on the V2 pages about it being experimental and not for production use.

[edit] I see version 1.7.1 is out, and since I've messed things up, I'm going to start from scratch with Longhorn V2 this time. It will probably be a few days before I get back to testing the disk speeds.
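If you'd rather flip the engine on after install than reinstall, upstream Longhorn exposes this as a settings.longhorn.io object; a sketch, assuming kubectl access and that Harvester hasn't renamed the setting (check the docs for your version before trying this on a cluster you care about):

```shell
# Enable the experimental V2 (SPDK-based) data engine in upstream
# Longhorn. Setting name per Longhorn docs; not for production.
kubectl -n longhorn-system patch settings.longhorn.io v2-data-engine \
  --type merge -p '{"value":"true"}'

# Confirm the setting stuck:
kubectl -n longhorn-system get settings.longhorn.io v2-data-engine
```

Note that this only enables the engine; existing V1 volumes don't convert, which matches what I ran into below.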
 

Greg_E

I should probably explain why I even care about "drive" speeds. The majority of my production VMs are Windows, and Windows Updates kind of suck. When you have really slow throughput at block sizes under 16K, updates really suck. When I had a vSphere license, updates were definitely less painful than on XCP-ng to the same NAS (different shares but the same pool of drives) with the same model and configuration of hardware. Why? The only reason I can offer is a newer kernel with newer NFS options like the above-mentioned nconnect=XX: ESXi uses a value of 4 simultaneous connections, while XCP-ng 8.x has an old kernel and old NFS, so it only gets a single connection.
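For reference, nconnect is just a client-side mount option on any reasonably modern Linux kernel (5.3+); a sketch with a hypothetical server and export path:

```shell
# Open 8 parallel TCP connections to the NFS server instead of 1.
# Server name and paths are made up for the example.
mount -t nfs -o vers=4.1,nconnect=8 nas.lab.local:/mnt/tank/vms /mnt/vms

# Verify the option actually applied:
grep nconnect /proc/mounts
```

This is why the kernel version matters: on an old kernel the option is silently useless, which would explain XCP-ng 8.x behaving like a single connection no matter what you type.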

Looking at the speeds from Longhorn V1, I can see possible Windows Update frustration that I want to try to optimize; those sub-16K speeds were rather horrible.

You can check this thread to see the hardware I'm working with https://forums.servethehome.com/index.php?threads/my-lab-an-ongoing-project.54462/

All hypervisor hosts are now 64GB, with dual X520 cards, a single i226-V through the A+E slot, and a 256GB SATA M.2 drive for the OS. HV1, HV2, and HV3 have the additional 1TB NVMe M.2 drive installed. I really should have bought more and bigger drives back when I was building these!!! 2TB x6 looked like a really large amount of money back then; now...
 

Greg_E

Ran into an obstacle... there is nowhere to specify which version of Longhorn you want when you are installing from the text GUI. Time to wipe it and start again (maybe). I think I need to set either none for data disks (if allowed) or point it at the OS disk and let Longhorn V1 set up there. After that you can get in through the GUI, turn V2 on, then add a disk to V2 and set it as the default storage.

It would be nice if there were a GUI function to either migrate a disk to V2, or delete V1 and recreate with V2.

Going to play with it a bit more before wiping the first node out (again) and see what I can see.
 
  • Like
Reactions: wifiholic

Greg_E

No dice. You can't choose Longhorn V2 in the installer, and "removing" the only data drive to "replace" it with V2 isn't exactly a simple task. The OS drive is only 256GB, not big enough to share with data drive functions. Looking at M.2 SATA drive prices makes me scream, so I guess it is going to be V1 for a while.

But the cluster is back together. I need to configure the networks again and read up on adding users to the admin UI; I think I found it.
 
  • Like
Reactions: itronin

wifiholic

Yep, after seeing you mention Longhorn V2, I did some reading, decided to give it a try, and ran into the same issue. Well, I didn't try reinstalling, but realized that I wouldn't be able to convert my existing data disk, and decided that I could be content with V1 (it was fine before, but I also run mostly Linux VMs, so Windows Updates aren't the concern for me that they are for you).
 

Greg_E

Think I'm going to try SUSE Linux Micro for my Rancher machine. I couldn't get Rancher to run on Debian 13; the container was up but constantly restarting. Hopefully not a lack-of-RAM issue; no more kidneys to sell.
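In case anyone hits the same restart loop, these are the first things I'd check (the container name is whatever you gave it at docker run; "rancher" here is just an example):

```shell
# See how many times it has restarted and the last exit code:
docker inspect rancher --format \
  '{{.RestartCount}} restarts, last exit code {{.State.ExitCode}}'

# Read the last lines before each crash; missing kernel features
# and config errors usually show up here:
docker logs --tail 100 rancher

# Check whether the kernel OOM killer is the culprit:
docker inspect rancher --format '{{.State.OOMKilled}}'
```

An exit code of 137 with OOMKilled=true would point straight at a RAM shortage.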

Priced out 500GB SATA drives: $95 and up, with 1TB just under $150 for the same brand and model. I'd only need two of the 1TB since I have one from a few years ago, but $300 doesn't make a lot of sense for storage that isn't fully functional and not recommended for production.

And when I was working on things, I forgot to plug the VM network cables back in; you can't assign them as uplinks unless the links are up.
 

wifiholic

Having just finished converting the handful of VMs that would have been too much of a pain to set up again from scratch, my current take is that Harvester's VM import controller plugin is too much of a pain for my three- or four-off needs.

Instead, I simply powered off the VMs in Xen Orchestra, downloaded the disks as qcow2 images, uploaded the images to Harvester, created new VMs matching the specs of the old ones (making sure to enable EFI boot if it was enabled in XO!), and assigned the corresponding newly uploaded disk image.

Other than having to edit the netplan file due to the network interface name changing, it was pretty much plug and play. While this probably wouldn't work in a professional environment, or for a homelab with dozens of VMs to move, it was an effective path for my modest needs.
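For anyone repeating this, it's worth sanity-checking the exported images before the upload; a sketch with made-up file names:

```shell
# Confirm what Xen Orchestra actually handed you (format, virtual size):
qemu-img info web01-disk0.qcow2

# Catch corruption from an interrupted download:
qemu-img check web01-disk0.qcow2

# If the export came out as VHD instead of qcow2, convert it first:
qemu-img convert -p -O qcow2 web01-disk0.vhd web01-disk0.qcow2
```

A failed qemu-img check before upload is a lot cheaper to deal with than a VM that won't boot in Harvester.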