Alternatives to VMware vSAN for hyperconverged environment (Home)

ecosse

Active Member
Jul 2, 2013
377
66
28
Agree, I should have elaborated on this point. My concern is not that one node is a failure domain, which is completely normal for a 2-node configuration, but that all disks in the cluster also fall under FTT=1. For example, with HPE VSA or StarWind vSAN I can configure local redundancy using a hardware RAID controller. Imagine a 2-node all-flash setup where you can lose one of the nodes entirely, plus 2 disks in the other node, with little to no impact on production availability.

I know that S2D features a self-healing capability using the "Reserved Capacity" mechanism. However, until the rebalancing is finished, production would have zero redundancy.

That's why I think a "good-enough" S2D cluster starts at 4 all-flash nodes in a mixed-resiliency configuration (mirror+parity).
I hear you. I wanted a small Storage Spaces config for a backup unit I was building in my lab, but it just didn't make sense compared to whacking in a local RAID, if one accepts a single node with local resilience as opposed to two nodes without it. I'm somewhat surprised that StarWind vSAN supports h/w RAID; I really need to look at that at some stage.
I run an HP VSA in my lab; it's RAID-10 throughout.
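To make the difference concrete, here is a rough sketch of the scenario from the quote (one node lost plus two disk failures in the survivor). It is a deliberately simplified model, not how S2D, vSAN, HPE VSA or StarWind actually place data: it assumes each node holds one full mirror copy, and that a node's copy is usable only while the node is up and its failed-disk count stays within the local RAID tolerance. The names `Node`, `copy_intact` and `data_available` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    up: bool            # is the node itself reachable?
    failed_disks: int   # number of failed local disks


def copy_intact(node: Node, local_disk_tolerance: int) -> bool:
    # Assumption: the node's mirror copy survives only if the node is up
    # and local disk failures do not exceed what local RAID can absorb.
    return node.up and node.failed_disks <= local_disk_tolerance


def data_available(nodes: list[Node], local_disk_tolerance: int) -> bool:
    # Data stays online as long as at least one full copy is intact.
    return any(copy_intact(n, local_disk_tolerance) for n in nodes)


# Scenario from the post: one node gone, two disks dead in the other one.
scenario = [Node(up=False, failed_disks=0), Node(up=True, failed_disks=2)]

no_local_raid = data_available(scenario, local_disk_tolerance=0)
dual_parity_raid = data_available(scenario, local_disk_tolerance=2)
print(no_local_raid, dual_parity_raid)
```

With no local redundancy (tolerance 0) the surviving copy counts as lost in this model; with a local RAID level that tolerates two disk failures (RAID-6, for example) it stays online. Real products track data per object or per extent, so a single disk failure usually affects only part of the data rather than the whole node copy.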
 

cheezehead

Active Member
Sep 23, 2012
717
174
43
WI
HPE VSA needs a third entity called "Failover Manager" to maintain quorum on a two-node config.

HPE StoreVirtual – Managers and Quorum

HPE VSA only needs the FOM VM on a separate box if your setup is a 2-node cluster. If it's a larger cluster, then you can play with the active managers to maintain quorum. Also worth noting: if quorum is lost, you can still manually activate a node via the CLI.
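As a rough illustration of why the third manager matters in a 2-node setup, here is a minimal sketch. It assumes quorum simply means that a strict majority of the configured managers is reachable; the function name `has_quorum` is hypothetical and not part of any HPE tooling.

```python
def has_quorum(total_managers: int, reachable_managers: int) -> bool:
    # Assumption: quorum = strict majority of configured managers reachable.
    return reachable_managers > total_managers // 2


# Two nodes, one manager each, no FOM: losing one node leaves 1 of 2.
two_node_without_fom = has_quorum(total_managers=2, reachable_managers=1)

# Two nodes plus a FOM on a separate box: losing one node leaves 2 of 3.
two_node_with_fom = has_quorum(total_managers=3, reachable_managers=2)

print(two_node_without_fom, two_node_with_fom)
```

Under that assumption, a bare 2-node cluster cannot keep a majority after losing a node, while the same cluster with a FOM elsewhere can.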
 
  • Like
Reactions: NISMO1968

dwright1542

Active Member
Dec 26, 2015
364
70
28
47
In late 2015 we went through the whole range of 2-node hyperconverged options for SBS: StarWind, StorMagic, DataCore, VSAN. StorMagic came out on top, by far, mostly for the cleanliness of the install and setup. We actually bought a copy of StarWind, and it did have somewhat better performance because of tiered caching, when Windows wasn't getting in the way. Apparently they figured that out as well, since they now have a Linux version. StorMagic now has tiered caching as well, although we REALLY liked the Pernix FVP Freedom before that bit the dust.
 

Rand__

Well-Known Member
Mar 6, 2014
4,572
910
113
StorMagic looks interesting but they seem to have dropped their "Free 2TB" offering a while ago :/
Contacted them for pricing but a bit steep for @ Home.
 

Net-Runner

Member
Feb 25, 2016
79
24
8
37
looking for info on
hardware recommendations (ie how can I reuse which existing components)
AFAIK there are no strict hardware recommendations, since StarWind is quite hardware agnostic. As long as all the drives and the OS are OK, everything should be fine. However, I would refer to their hardware offerings (Hyperconvergence for ROBO and SMB • StarWind HCA), which look to be based on DELL reference hardware.

-can I use the Linux appliance with free or do I need windows
-does the Linux appliance support IB devices /RDMA
I have tested their Linux appliance and it seems to work, but since it's a first version I am not inclined to rely on it right now. Moreover, it is not officially approved for production yet. Not sure about RDMA in the Linux version, but it works great in a Windows-based configuration.

-Which tiering do they have
-Do I need Raid controller or not ...
StarWind itself cannot combine physical drives into a single storage array, so you need either a RAID controller to do that job or Storage Spaces (not Storage Spaces Direct). The latter is basically what I've done: I have two physical servers with Storage Spaces automated tiering, and StarWind does the mirroring on top. Works totally fine.
 

Evan

Well-Known Member
Jan 6, 2016
3,123
522
113
I guess you have already decided hyper converged is also the best approach ? Did you also consider changing track and making one server a storage master and just doing replication ? It will require manual intervention to get back running if you have a failure though.
 
  • Like
Reactions: Net-Runner

Rand__

Well-Known Member
Mar 6, 2014
4,572
910
113
Have not found an easy/convenient real time replication option for next to free. Still waiting for @gea to integrate into Napp-it;)
 

Net-Runner

Member
Feb 25, 2016
79
24
8
37
I guess you have already decided hyper converged is also the best approach ? Did you also consider changing track and making one server a storage master and just doing replication ? It will require manual intervention to get back running if you have a failure though.
I do. I really like the approach and the performance it offers, along with the cost saving. And I am just crazy about automatic failover, so active-active is the only right choice for me. It's not something our business really needs, but it's something everyone really likes. Especially our IT team :)
 

gea

Well-Known Member
Dec 31, 2010
2,514
852
113
DE
Have not found an easy/convenient real time replication option for next to free. Still waiting for @gea to integrate into Napp-it;)
I am playing with a ZFS mirror over two iSCSI targets for storage and service failover between two nodes. This works in the current dev version with a manual failover, but I am not yet satisfied with performance and stability in case of a power outage on one node.

The uncertainty about OmniOS vs OpenIndiana vs SmartOS gives an additional delay.
 

NISMO1968

[ ... ]
Oct 19, 2013
78
13
8
San Antonio, TX
www.vmware.com
1) What's really interesting: the first time I played with their Linux VSA was early 2010 or so. I wonder why it took them SO LONG to come up with the obvious thing? People HATE Windows-running storage boxes! Whatever... DataCore doesn't get it :)

2) Their tiering engine is crap. It starts reasonably well for v1.0, but if you keep filling your volume with data they start moving hot and cold chunks back and forth, which effectively kills their already sparse IOPS.

3) It's a pity Nutanix swallowed them for their customer base (?) only to actually kill a great product :(

In late 2015 we went through the whole range of 2-node hyperconverged options for SBS: StarWind, StorMagic, DataCore, VSAN. StorMagic came out on top, by far, mostly for the cleanliness of the install and setup. We actually bought a copy of StarWind, and it did have somewhat better performance because of tiered caching, when Windows wasn't getting in the way. Apparently they figured that out as well, since they now have a Linux version. StorMagic now has tiered caching as well, although we REALLY liked the Pernix FVP Freedom before that bit the dust.
 

NISMO1968

[ ... ]
Oct 19, 2013
78
13
8
San Antonio, TX
www.vmware.com
Interesting! Now they actually sell something others (StarWind & HPE & NTNX & ... ) give away for free! :)

BTW, HP's free VSA is limited to 1TB (?), and if you're lucky (?) enough to own HP gear you might ask your sales rep for a favor :)

StorMagic looks interesting but they seem to have dropped their "Free 2TB" offering a while ago :/
Contacted them for pricing but a bit steep for @ Home.
 

NISMO1968

[ ... ]
Oct 19, 2013
78
13
8
San Antonio, TX
www.vmware.com
You're right about that! I'd never put anything with manual failover into production though.

HPE VSA only needs the FOM VM on a separate box if your setup is a 2-node cluster. If it's a larger cluster, then you can play with the active managers to maintain quorum. Also worth noting: if quorum is lost, you can still manually activate a node via the CLI.
 

NISMO1968

[ ... ]
Oct 19, 2013
78
13
8
San Antonio, TX
www.vmware.com
It's not about performance, it's about a) security and b) resilience. Remember, there were operating systems (QNX Neutrino? Plan9?) where even file system drivers ran in user space and an fs crash didn't bring the whole system to its knees? In our case, if the hypervisor boots from a USB stick, loads itself into a RAM disk image and uses a "boot" filesystem, why should it crash if, say, a NIC driver BSODs or PSODs? NIC driver reloaded, a few frames or pings dropped (maybe, if there's no LACP or MPIO/M/C) and that's it!

I don't think it is - direct IO path is not in kernel and yet offers excellent performance. Struggling to understand the relevance of Xbox (one).
 

ecosse

Active Member
Jul 2, 2013
377
66
28
It's not about performance, it's about a) security and b) resilience. Remember, there were operating systems (QNX Neutrino? Plan9?) where even file system drivers ran in user space and an fs crash didn't bring the whole system to its knees? In our case, if the hypervisor boots from a USB stick, loads itself into a RAM disk image and uses a "boot" filesystem, why should it crash if, say, a NIC driver BSODs or PSODs? NIC driver reloaded, a few frames or pings dropped (maybe, if there's no LACP or MPIO/M/C) and that's it!
I am really really struggling with your thought process - what point are you trying to make? Your original reply stated a Nutanix link I quoted in an HCI storage architecture was outdated. The way I read it, you're now supporting what they are saying (i.e. keep the kernel lean and run everything not critical in user mode.)
 

ecosse

Active Member
Jul 2, 2013
377
66
28
Nope;) Got away from branded stuff a while ago due to lack of flexibility. Too much tinkering on my end ;)
Thing is, my HP DL380 Gen8 is rock solid. I can never quite say the same for the Intel S2600CP2-based rigs I've built, even if they are super flexible.
 

NISMO1968

[ ... ]
Oct 19, 2013
78
13
8
San Antonio, TX
www.vmware.com
I apologize, it might be too much of the Napa red wine :) OK, let me clarify on that...

Running I/O-sensitive things inside a VM is RIGHT if you do it right: passing hardware directly to the VM, SR-IOV-ing it (with proper hypervisor support, of course). This is what Intel (SPDK and DPDK), nVidia, Microsoft (X-BOX) & Mellanox are doing now. Running everything inside a VM with paravirtualized I/O (something Nutanix is doing) is WRONG.

Did I make it any better?

I am really really struggling with your thought process - what point are you trying to make? Your original reply stated a Nutanix link I quoted in an HCI storage architecture was outdated. The way I read it, you're now supporting what they are saying (i.e. keep the kernel lean and run everything not critical in user mode.)
 

msvirtualguy

Active Member
Jan 23, 2013
473
236
43
msvirtualguy.com
Actually, Nutanix does PCI-E passthrough, where the CVM has direct access to the HBA(s) across the PCI-E bus and therefore to the disks. I'm beginning to think you either purposely do not take the time to get the right information or you just like to throw stones and FUD. Either way, I'm not playing until you put some effort into understanding what it is we do. This is the 4th post now where you have made unfounded claims and FUD. You work for HP?