Recommendations for a k8s host to rival £1k/mo AWS?

stanbsky

New Member
Feb 3, 2024
3
1
3
Hey folks, thought I might get a better orientation on what's on offer on the market by directly asking vs trying to comprehensively go over all recent STH reviews :)

TLDR: looking for something self-hostable (from a power/noise angle) that I could pitch to my boss to replace my £1k/month AWS bill

Basically, in my role I help clients integrate and troubleshoot the self-managed version of our service, which is effectively a bunch of stuff in a k8s cluster. Having an approximately like-for-like repro cluster really helps with that, so I keep a set of clusters running continuously, averaging somewhere in the 20c/72Gi resource range and bursting above that periodically when I need to launch a beefy DB or do some load testing.

I've been running a homelab for years already, made up of 3 humble 4c/16Gi microPCs... And the thought hit me: maybe I could just sell the higher-ups on the idea that they should comp me some hardware equivalent to 1-2 months' spend, and then I'd only reach for AWS when I need something very specific or resource-heavy (I probably won't manage to self-host my current high-water mark of 224c/704Gi!)
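To put rough numbers on that, here is a minimal Python sketch of how many 4c/16Gi boxes it would take to cover the average and peak footprints quoted above. The `nodes_needed` helper is hypothetical, and it ignores per-node overhead (kubelet, system pods, Rancher agents), so treat the results as a lower bound rather than a sizing recommendation.

```python
import math

def nodes_needed(cores, mem_gi, node_cores=4, node_mem_gi=16):
    """Identical nodes required to cover a cores/memory target.

    Whichever of CPU or memory runs out first decides the count.
    Assumption: no capacity is reserved for the OS or cluster agents.
    """
    by_cpu = math.ceil(cores / node_cores)
    by_mem = math.ceil(mem_gi / node_mem_gi)
    return max(by_cpu, by_mem)

# Existing homelab from the post: 3 microPCs at 4c/16Gi each.
homelab_cores, homelab_mem_gi = 3 * 4, 3 * 16

average_nodes = nodes_needed(20, 72)    # typical running footprint
peak_nodes = nodes_needed(224, 704)     # AWS high-water mark

print(f"homelab today: {homelab_cores}c/{homelab_mem_gi}Gi")
print(f"nodes for the 20c/72Gi average: {average_nodes}")
print(f"nodes for the 224c/704Gi peak: {peak_nodes}")
```

Under these assumptions the average load needs about five such boxes (the current three fall short on both cores and RAM), while the peak would need dozens, which is why the peak stays on AWS.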

I've looked into micros by Minisforum/GMKtec/Beelink, and I think they'd fit the bill. I'm a little worried that they might look "too unserious" for biz, so I was also considering something like a ThinkStation P5 - but properly specced up, it ain't cheap, and the power figures don't look too good either (maybe they'd comp my energy bill too? :p )

So yeah, I was hoping that the community could pitch in and help a homelabber out - I'm sure there are other competing options that I've missed!

Thanks a ton!
 

Blinky 42

Active Member
Aug 6, 2015
625
239
43
50
PA, USA
Do you just need x*100 cores across y VMs for testing, and is it not an issue if the CPUs are 2-3 generations behind what you have in AWS? Is a small pool of SATA SSDs enough to handle the whole IO load of the test across all the VMs?
If your loads are more about getting the VMs all interconnected than running a lot of CPU/memory/disk-intensive tasks, and it is fine to oversubscribe your physical CPU cores by 6-8x, then it can be entertaining to run the numbers to see if you can find a solution.
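As a rough illustration of what 6-8x oversubscription buys, the Python sketch below converts a vCPU target into physical cores. The 224 vCPU figure is the peak mentioned earlier in the thread; the `physical_cores` helper is hypothetical, and it assumes the workload is idle enough that oversubscription does not hurt, which only a real test can confirm.

```python
import math

def physical_cores(vcpus, oversubscription):
    """Physical cores needed to back `vcpus` at a given vCPU:pCPU ratio."""
    return math.ceil(vcpus / oversubscription)

peak_vcpus = 224  # high-water mark quoted in the original post

for ratio in (1, 6, 8):
    cores = physical_cores(peak_vcpus, ratio)
    print(f"{ratio}x oversubscription: {cores} physical cores")
```

At 6-8x the 224 vCPU peak fits in roughly 28-38 physical cores, i.e. within reach of a single large server, provided RAM and disk IO are not the real bottleneck.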

Don't forget:
- where are you hosting it? (In a commercial colo, office server room, empty desk in the office?)
- bandwidth and power costs
- 24x7 availability?
- internal only or do your customers need to access it?
- who gets drafted to manage and fix problems with the hardware?

If your AWS costs are only £1k/mo then off-hand I would guess you are doing pretty well: you get flexible on-demand use of resources and aren't throwing away a lot of money when it isn't "in use". We do something similar, spinning up hardware that will burn through a few K in a day, but we only need that a few times a quarter for large tests. It is still cheaper to do that on demand than to spend $125k on enough hardware to run a similar-scale test in one of our medium colos.
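The on-demand vs purchase trade-off can be sketched as a simple break-even calculation. Only the $125k hardware figure comes from the post; the $3k per test day and 3 test days per quarter are assumed stand-ins for "a few K in a day" and "a few times a quarter", and colo power, space and staff costs are ignored entirely.

```python
# Figure from the post.
hardware_cost = 125_000          # USD, up-front purchase

# Assumptions for illustration only (the post says "a few").
cost_per_test_day = 3_000        # USD of on-demand spend per large test day
test_days_per_quarter = 3

annual_on_demand = cost_per_test_day * test_days_per_quarter * 4
break_even_years = hardware_cost / annual_on_demand

print(f"on-demand spend per year: ${annual_on_demand:,}")
print(f"years to break even on hardware: {break_even_years:.1f}")
```

With these placeholder numbers the hardware would take over three years to pay for itself before counting any running costs, which is the shape of the argument being made here.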

You might get better traction if you pitch it as a small shared "dev" environment for experimenting at a scale above what you can run on local hardware today, without spinning things up in AWS.
 

stanbsky

So the urgency of the matter and the relative spend vs other company costs like CI actually dictated the solution here: I ended up convincing the leadership to fund me a top-of-the-line Hetzner server! Honestly, I didn't even think of it at first because ~£500/mo didn't seem like juice worth the squeeze of giving up AWS elasticity, but the differentiator was that Rancher, which I'm obligated to host to repro this customer's setup, isn't a big fan of smaller right-sized nodes due to all the cattle-* cruft it inserts.

So here it is, an AX162-S with maxed out RAM modules, in all its glory:
[Attached image: 1759524864152.png]

Protip: if you do this in a work setting, throw in a decent-sized subnet alongside the order. I thought, "why would I need this if I can just run HAProxy?" (hidden under LXC), but the faff of configuring that, and iptables DNAT especially, is really not worth it compared with adding another ~50/mo to the bill.
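To give a feel for the DNAT faff, here is a small Python sketch that only builds (and does not run) the kind of iptables port-forwarding rule needed for every service exposed from behind a single public IP. The interface name, container addresses, ports and service names are all made up for illustration; the real setup on this server is not described in the thread.

```python
def dnat_rule(public_port, container_ip, container_port,
              iface="eth0", proto="tcp"):
    """Build one iptables DNAT rule forwarding a host port to a container.

    Returns the command as a string; nothing is executed.
    """
    return (
        f"iptables -t nat -A PREROUTING -i {iface} -p {proto} "
        f"--dport {public_port} -j DNAT "
        f"--to-destination {container_ip}:{container_port}"
    )

# Hypothetical services: name -> (public port, container IP, container port).
services = {
    "rancher-ui": (8443, "10.0.3.10", 443),
    "k8s-api": (6443, "10.0.3.11", 6443),
    "ingress-http": (8080, "10.0.3.12", 80),
}

for name, (public_port, ip, port) in services.items():
    print(f"# {name}")
    print(dnat_rule(public_port, ip, port))
```

Every new service means another rule, another non-standard public port to remember and another thing to keep in sync, whereas with a routed subnet each container or VM can simply take its own public address.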
 

stanbsky

Blinky 42 said:
"Do you just need x*100 cores across y VMs for testing, and is it not an issue if the CPUs are 2-3 generations behind what you have in AWS? Is a small pool of SATA SSDs enough to handle the whole IO load of the test across all the VMs?
If your loads are more about getting the VMs all interconnected than running a lot of CPU/memory/disk-intensive tasks, and it is fine to oversubscribe your physical CPU cores by 6-8x, then it can be entertaining to run the numbers to see if you can find a solution."
Forgot to answer this - it's the least I can do to thank you for contributing your thoughts :)

I am in a, very broadly speaking, storage/networking startup that has just found PMF (product-market fit) and got flooded with interest from large enterprise clients. So the past approach of "swallow the infra & optimisation costs to acquire customers" no longer flies, as some of them will be self-hosting/BYOC.

More concretely, cores are less of a priority. The goal is to accommodate the huge RAM usage of the new, much larger workloads so that we can identify the biggest pain points and optimise them.

And I'm still toying with getting something truly on-prem that I can run in my own home... This is a technology flexible enough that I can find my own personal use cases to let me dogfood it, and through that, boost my own understanding and hopefully expose some limitations regular QE testing wouldn't.
 