Hyper-V w/Failover cluster


seang86s

Member
Feb 19, 2013
Hey everyone. I have a two-node Hyper-V cluster with Failover Clustering and S2D (two-way mirror for the storage) and I was trying some of the HA features. I have 10 Windows 10 VMs split across the two machines. When I simulate a hard failure (unplug the server), the 5 affected VMs go into an "unmonitored" state and stay down. I was hoping the VMs would move to the other node and boot up there. Instead, they stay down until the failed node returns, and then they restart on that node.

How can I get the VMs to move automatically to the node that stays up? I understand they will reboot and the running state is not preserved. Actually, is there a way the running state can be preserved in that kind of failure, so the VMs just magically slide over and continue to run on the other server?
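
For anyone following along, this is roughly how I'm checking what the cluster sees right after pulling the plug (just a sketch, assuming the FailoverClusters PowerShell module that comes with the feature):

```powershell
# List the clustered VM roles, which node currently owns them, and their state
Get-ClusterGroup | Format-Table Name, OwnerNode, State

# Node membership state as the cluster sees it (e.g. Up / Down)
Get-ClusterNode | Format-Table Name, State
```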
 

cesmith9999

Well-Known Member
Mar 26, 2013
Does the node that is up have the resources to restart the VMs?

Remember that you cannot have an active/active setup if both nodes are running at greater than 50% utilization.

You should also look at the Windows system logs. There is a lot of information there to help you figure out what happened.
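
A rough way to eyeball the headroom on the surviving node (a sketch only, run locally on that host; assumes the Hyper-V PowerShell module and CIM access):

```powershell
# Memory currently assigned to the VMs running on this host, in GB
$assignedGB = (Get-VM | Measure-Object -Property MemoryAssigned -Sum).Sum / 1GB

# Free physical memory on the host (Win32_OperatingSystem reports kilobytes)
$freeGB = (Get-CimInstance Win32_OperatingSystem).FreePhysicalMemory / 1MB

"{0:N1} GB assigned to local VMs, {1:N1} GB free on the host" -f $assignedGB, $freeGB
```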

Chris
 

seang86s

Member
Feb 19, 2013
It does. Only 10% CPU utilization and over 100 Gigs of 128 Gigs of RAM free.

The event log says: "Virtual machines on node 'NODEA' have entered an unmonitored state. The virtual machines health will be monitored again once the node returns from an isolated state or may failover if the node does not return. The virtual machine no longer being monitored are: Win10-301, Win10-303, Win10-309, Win10-307, Win10-305." Nothing else in the event log besides events about the node no longer being there.
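
That "unmonitored" wording comes from the VM compute resiliency behaviour in Windows Server 2016: the cluster treats the missing node as Isolated and waits before failing the VMs over. The relevant knobs sit on the cluster object; a sketch below (the values in the comments are the defaults as far as I know, so treat them as assumptions to verify):

```powershell
# How long VMs stay Unmonitored while their node is Isolated (default 240 seconds),
# plus the quarantine settings (default: 3 membership failures, 7200-second quarantine)
Get-Cluster | Format-List ResiliencyLevel, ResiliencyDefaultPeriod, QuarantineThreshold, QuarantineDuration

# Example: shorten the unmonitored window so failover kicks in after 30 seconds
(Get-Cluster).ResiliencyDefaultPeriod = 30
```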
 

seang86s

Member
Feb 19, 2013
The VMs were created via Failover Cluster Manager, so I believe they are already marked for HA. If I go into Configure Role | Virtual Machine, none of the VMs show up in there. If I create a VM via Hyper-V Manager and then go into Failover Cluster Manager | Configure Role | Virtual Machine, the VM I just created in Hyper-V Manager shows up and, once configured, appears in Failover Cluster Manager.

Interestingly enough, when I simulated the failure 3 times within an hour, the cluster put the failed node into Quarantine. The node that stayed up grabbed all the VMs and started them up. So it seems the node has the capacity to take all the VMs, but how do I set it up to do that as soon as a node fails?

Is System Center Virtual Machine Manager required to make these VMs truly HA? I'm from a VMware world, and vCenter is needed for this kind of functionality there.
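
For completeness, a Hyper-V Manager-created VM can also be made highly available from PowerShell without SCVMM (a sketch; "Win10-311" is just a placeholder name, and the VM's files need to already live on the cluster/S2D storage). The quarantine behaviour I hit appears to be a cluster-level default:

```powershell
# Add an existing (non-clustered) VM as a clustered role - no SCVMM required
Add-ClusterVirtualMachineRole -VirtualMachine "Win10-311"

# Quarantine kicks in after this many membership failures within an hour
Get-Cluster | Format-List QuarantineThreshold, QuarantineDuration
```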
 

seang86s

Member
Feb 19, 2013
I already have that setting set to the second option (start if it was running before), but the delay is set to 0 seconds across all 10 VMs.
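
In case it helps to compare across all 10 VMs at once, the same setting is visible from PowerShell (a sketch; "Win10-301" is one of the VM names from my event log):

```powershell
# Automatic start action / delay for every VM on this host
Get-VM | Format-Table Name, AutomaticStartAction, AutomaticStartDelay

# "Start if it was running before" corresponds to StartIfRunning; the delay is in seconds
Set-VM -Name "Win10-301" -AutomaticStartAction StartIfRunning -AutomaticStartDelay 0
```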
 

cesmith9999

Well-Known Member
Mar 26, 2013
Do the VMs fail over properly if you do a normal reboot of the host?

In my past job we set all VMs to start automatically (3rd button) with staggered startup (at least 10 seconds apart).
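
A sketch of how that staggering could be scripted per host (assumes the Hyper-V PowerShell module; adjust the 10-second step to taste):

```powershell
# Set every local VM to always start automatically, staggered 10 seconds apart
$delay = 0
Get-VM | Sort-Object Name | ForEach-Object {
    Set-VM -VM $_ -AutomaticStartAction Start -AutomaticStartDelay $delay
    $delay += 10
}
```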

Chris
 

DavidRa

Infrastructure Architect
Aug 3, 2015
Central Coast of NSW
www.pdconsec.net
Where's your quorum? Sounds like there's only 2 votes in the cluster so there's no way for the remaining node to figure out whether the failed node is failed, or just off the network. Build an Azure witness or a file share witness (on a separate physical machine).

You can't do the kind of failover you're looking for in Hyper-V (you'd have to run the VM on both nodes and be continually shipping memory state with zero latency and distributed writes - obviously that's not possible). You can do it with limitations in vSphere, but I've never seen anyone bother.
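
Checking or changing the witness is quick (a sketch; the share path is a placeholder and should be hosted on a box that is not part of the cluster):

```powershell
# Show the current quorum/witness configuration
Get-ClusterQuorum

# Point the cluster at a file share witness hosted outside the cluster
Set-ClusterQuorum -FileShareWitness "\\WITNESS01\ClusterWitness"
```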
 

seang86s

Member
Feb 19, 2013
A graceful shutdown causes the VMs to Live Migrate to the remaining node. Their running state is preserved as expected.

A file share quorum is defined. The witness shows Online during the graceful shutdown (and restart of the server) as well as when I just pull the power on NodeA.

Also, when the node that performed the graceful shutdown returns to the cluster, the VMs fail back. I do have failback enabled as well as a preferred owner defined for each VM. I originally didn't have failback or preferred owners defined and saw the same behavior: whatever VMs were on the failed node went into an "unmonitored" state and would not start until the failed node returned to the cluster.
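
For what it's worth, those per-VM settings can also be inspected and set from PowerShell (a sketch; "Win10-301" is one of my VM roles and "NODEB" is just a stand-in for the second node's name):

```powershell
# Set the preferred owners for one VM role
Get-ClusterGroup -Name "Win10-301" | Set-ClusterOwnerNode -Owners "NODEA", "NODEB"

# Allow the role to fail back to its preferred owner (0 = prevent, 1 = allow)
(Get-ClusterGroup -Name "Win10-301").AutoFailbackType = 1
```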
 

cesmith9999

Well-Known Member
Mar 26, 2013
If you have only 2 nodes then you need a quorum witness on a different device, one that is not part of the cluster. I would use an Azure cloud witness.
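
The cloud witness only needs an Azure storage account (a sketch; the account name and access key are placeholders):

```powershell
# Configure an Azure Cloud Witness (Windows Server 2016 and later)
Set-ClusterQuorum -CloudWitness -AccountName "<storageaccountname>" -AccessKey "<storage-account-access-key>"
```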

Chris