Recommendations for Small Buisness configuration.

jamesthetechie · Apr 23, 2024

Hello All, i recently made the dumb decision to start my own company and self fund the entire thing.

All i do is host docker containers (not kubernetes). my current hardware config is as follows:
1 C6400 chassis with 4 nodes of the following specs per:
2x intel xeon 6150
512gb of DDR4 ram
HBA 330
2x 960gb sata ssd
2x 1.9tb sas SSD
1x 3.4tb sas SSD
1x integrated i350 1gb nic (single port)
1x x540 intel 10gb nic (single port)
1x cx-4 EDR nic (dual port)

My question for you fellow tech enthusiasts is what would you recommend for creating a solution that has virtualization HA and storage based redundancy?

currently I'm looking at going with Ceph and proxmox to create a 4 node redundant cluster. my concern with that is the lack of RDMA support with ceph and the performance hit id take with running Ceph so i wanted to know if theres a better solution out there.

unfortunately my budget is $0 at this time, please let me know what your thoughts are and what has worked for you and any recommendations you may have.

alaricljs · Apr 23, 2024

The lowest price option I can think of (that's more performance than ceph) is still going to cost money...

Home describes how to build your own ZFS cluster w/ dual ported SAS devices sitting in an external disk shelf. You can then do active/active w/ 2 nodes and failover everything to a surviving node.

CyklonDX · Apr 23, 2024

if you plan on doing docker HA and still use more than half your nodes - you need to go kubernetes if you want to be "free"; Else if you are ok loosing 2 nodes, you can use linux pacemaker or similar and just run mirrored nodes. (would need rhel *nix)

jamesthetechie · Apr 23, 2024

alaricljs said:
The lowest price option I can think of (that's more performance than ceph) is still going to cost money...

Home describes how to build your own ZFS cluster w/ dual ported SAS devices sitting in an external disk shelf. You can then do active/active w/ 2 nodes and failover everything to a surviving node.

Oh wow, that is an excellently written guide, ill for sure be bookmarking that one.

jamesthetechie · Apr 23, 2024

CyklonDX said:
if you plan on doing docker HA and still use more than half your nodes - you need to go kubernetes if you want to be "free"; Else if you are ok loosing 2 nodes, you can use linux pacemaker or similar and just run mirrored nodes. (would need rhel *nix)

i wont be doing Docker HA, just more need HA at the VM and storage level, i looked at that and i can only sustain about one node loss at this time.

CyklonDX · Apr 23, 2024

I'd say cheapest solution would be proxmox, else you can try oVirt but there's less usage - and cert expiring can nuke your setup.

jamesthetechie · Apr 23, 2024

CyklonDX said:
I'd say cheapest solution would be proxmox, else you can try oVirt but there's less usage - and cert expiring can nuke your setup.

Thank you for the suggestion! i cant believe i haven't heard of this one before, ill definitely be trying this out later. If theres a risk of it nuking my setup ill avoid using that, but it looks like a fun thing to play around with.

CyklonDX · Apr 23, 2024

oVirt is definitely more professional especially since its running on rhel nix vs proxmox that works on debilan; but as long as you keep it monitored, and handle stuff in respective amount of time there's nothing to worry.

alaricljs · Apr 24, 2024

Isn't Proxmox HA just Ceph or shared external storage?

i386 · Apr 24, 2024

jamesthetechie said:
1 C6400 chassis with 4 nodes of the following specs per:

There was a front page post about a 2u4node system failure a few years ago where a power surge fried the nodes at the same. I think sth was down for a few hours/days.

@Patrick do you have a link to that post? I rember it but can't find it with the forum search or google (oldest post in main sites forum is from 2016...)

jamesthetechie · Apr 24, 2024

I'm actually not concerned about the hardware as much, that's my primary domain and i know these particular systems like the back of my hand, but i was kinda forced to go with them due to the colo pricing, gotta pack as much into that 1/4 rack as humanly possible.

alaricljs · Apr 24, 2024

Since you're sort of already 'in the business' try getting a full rack and subletting

CyklonDX · Apr 24, 2024

i386 said:
There was a front page post about a 2u4node system failure a few years ago where a power surge fried the nodes at the same. I think sth was down for a few hours/days.

@Patrick do you have a link to that post? I rember it but can't find it with the forum search or google (oldest post in main sites forum is from 2016...)

I'd say cudo's for not using ups for your servers.

Patrick · Apr 24, 2024

i386 said:
There was a front page post about a 2u4node system failure a few years ago where a power surge fried the nodes at the same. I think sth was down for a few hours/days.

@Patrick do you have a link to that post? I rember it but can't find it with the forum search or google (oldest post in main sites forum is from 2016...)

That was the worst! Super over-provisioned Kingston "data center" SSDs did not have power inrush protection so a surge in the Dell C6100 took out the drives in every node.

People wonder why I have become a fan of losing performance by using mixed brands of drives in RAID 1 in the hosting cluster. This is the big reason.

tinfoil3d · Apr 25, 2024

The only inrush issues i've had was luckily a simple inability of single 500W PSU to start 28 HDDs.
What drives those were, Patrick? And what was the likely cause?

jamesthetechie · Apr 27, 2024

Patrick said:
That was the worst! Super over-provisioned Kingston "data center" SSDs did not have power inrush protection so a surge in the Dell C6100 took out the drives in every node.

People wonder why I have become a fan of losing performance by using mixed brands of drives in RAID 1 in the hosting cluster. This is the big reason.

the C6100 was a wild system that required a blood sacrifice to be stable, i cant tell you how many times i sliced my hand open on one of those things, the C6400 is MUCH better in terms of stability.

Patrick · Apr 27, 2024

tinfoil3d said:
The only inrush issues i've had was luckily a simple inability of single 500W PSU to start 28 HDDs.
What drives those were, Patrick? And what was the likely cause?

It was a power freak-out in the chassis. I figured there was a forum thread on them.

The funny part @jamesthetechie is that the C6100 was a super high-volume line before many of the the hyper-scalers went OCP.

Search

Recommendations for Small Buisness configuration.

jamesthetechie

New Member

alaricljs

Active Member

CyklonDX

Well-Known Member

jamesthetechie

New Member

jamesthetechie

New Member

CyklonDX

Well-Known Member

jamesthetechie

New Member

CyklonDX

Well-Known Member

alaricljs

Active Member

i386

Well-Known Member

jamesthetechie

New Member

alaricljs

Active Member

CyklonDX

Well-Known Member

Patrick

Administrator

tinfoil3d

QSFP28

jamesthetechie

New Member

Patrick

Administrator