Does vSphere support any kind of hardware watchdog?

dswartz

Active Member
Jul 14, 2011
393
33
28
Running a 6.5 host, Build 6765664. It has been running fine since I updated to this build on 12/10. This morning around 10AM, it became unresponsive. I was out of town, so I was unable to check anything until just now. The IPMI console showed everything apparently okay, except for the host being unresponsive (not even to pings). I rebooted it, and it came up fine, but I'd kinda like to avoid hangs in the future. I've been searching via Google, but I don't see any kind of hardware watchdog support. Is there such a thing? It's a single host, so HA won't help me here. Thanks!
 

pricklypunter

Well-Known Member
Nov 10, 2015
1,606
470
83
Canada
I have never used it, but is that not a BIOS feature that you can then enable in ESXi? I know that HP at one time touted that feature, perhaps others too :)
 

dswartz

I don't think I was clear. Yes, it's a HW feature, but the host OS has to know how to use it, e.g. enable it and have some kind of periodic thread that resets the timer often enough that it doesn't trip. I know how to do this with various Linux distros, but have no idea for ESXi :(
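
For anyone following along, here is roughly what the "periodic thread that resets the timer" looks like on Linux. It is only a minimal sketch and says nothing about ESXi. It assumes the standard Linux watchdog character device at /dev/watchdog, where opening the device arms the timer, each write resets it, and writing the character V just before closing asks the driver to disarm it (drivers built with "nowayout" ignore that). The function name and its parameters are made up for illustration.

```python
import time


def pet_watchdog(device_path="/dev/watchdog", interval_s=10.0, beats=None,
                 sleep=time.sleep):
    """Keep a Linux watchdog device fed (hypothetical helper).

    Opening the device arms the hardware timer; every write resets it.
    beats=None loops forever; a number stops after that many keepalives.
    Returns the number of keepalive writes performed.
    """
    count = 0
    # Unbuffered, so each write reaches the driver immediately.
    with open(device_path, "wb", buffering=0) as wd:
        while beats is None or count < beats:
            wd.write(b"\0")      # any write counts as a keepalive
            count += 1
            sleep(interval_s)    # must be shorter than the watchdog timeout
        # Clean exit only: "magic close" asks the driver to disarm the timer.
        # If the loop dies with an exception, no V is written and the timer
        # keeps running, which is the whole point of a watchdog.
        wd.write(b"V")
    return count
```

If the kernel or this process hangs, the writes stop and the hardware resets the box once the timeout expires. The interval should be comfortably shorter than whatever timeout the watchdog is configured with.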
 

dswartz

I have no idea - that's why I was asking :) If there is something there, that would be helpful. I didn't know where to even start looking. Googling for this was useless :(
 

pricklypunter

I'm probably not much more help than that; I have never used it myself. If you don't have access to the hardware monitor information, I think you need to enable the CIM service, but that's about all I know, and even that's probably on the wrong track, sorry :D
 

Rand__

Well-Known Member
Mar 6, 2014
4,494
878
113
Hm, I had a quick search (ESX watchdog), but I think the CIM/WBEM hardware monitor is only for displaying hardware-specific information (storage especially) on the box.
You could play with the BIOS option to see whether it can communicate with the CIM agent in ESX, of course...

If all else fails, an external ping monitor on a Raspberry Pi would be minimal additional cost/effort.
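
A rough sketch of what such an external ping monitor could look like, in case it helps. Everything here is an assumption for illustration: the function names, the thresholds, and the idea of reacting with an IPMI power cycle (the original poster does have a working IPMI console, but an email alert would work just as well as the `on_hang` action). The `ping` flags are the Linux ones, and the `ipmitool` call uses `-E` so the BMC password is read from the `IPMI_PASSWORD` environment variable.

```python
import subprocess
import time


def host_is_up(host, timeout_s=2):
    """Return True if a single ICMP echo to `host` succeeds (Linux ping flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def power_cycle_via_ipmi(bmc_host, user):
    """Ask the BMC to power cycle the host; password comes from IPMI_PASSWORD."""
    return subprocess.run(
        ["ipmitool", "-I", "lanplus", "-H", bmc_host, "-U", user, "-E",
         "chassis", "power", "cycle"],
    )


def monitor(host, on_hang, max_failures=5, interval_s=30.0,
            check=host_is_up, sleep=time.sleep, max_checks=None):
    """Call on_hang(host) after max_failures consecutive failed checks.

    max_checks=None runs forever. Returns how many times on_hang fired.
    """
    failures = 0
    checks = 0
    triggers = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if check(host):
            failures = 0          # any success clears the streak
        else:
            failures += 1
            if failures >= max_failures:
                on_hang(host)
                triggers += 1
                failures = 0      # start counting again while the host reboots
        sleep(interval_s)
    return triggers
```

Usage might look like `monitor("esxi.example.lan", lambda h: power_cycle_via_ipmi("bmc.example.lan", "admin"))`, with both hostnames being placeholders. Requiring several consecutive failures avoids rebooting the host over a single dropped ping, and the defaults should be tuned so a normal ESXi boot finishes before the counter can trip again.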
 

Dawg10

Associate
Dec 24, 2016
217
112
43
I believe the hardware monitoring functions you're looking for are included in the VMware Operations Manager module, but it's designed to run as a VM and look outward; if the host goes down you're SOL. My books on ESXi 5.5 and 6.0 make no mention of watchdog functions, and the only mention of heartbeat is in reference to datastores and HA.

I use the web service uptimerobot.com to check the status of my domain, home IP and ISP gateway. The service is free for 50 monitors at 5 minute intervals and will email me when there's an issue.
 

pricklypunter

From reading a bit on this today (confusing as hell), both Dell and HP appear to be using IPMI and the BMC WDT in some way, but I'm at a loss to understand exactly what they are doing. I can only guess that they are using a VM or specific scripts in their distribution to start/restart or reset the BMC WDT using IPMI commands. I don't think the BMC interface itself is available to ESXi, at least not without some third-party intervention.
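
To make "reset the BMC WDT using IPMI commands" a bit more concrete, here is a small sketch built around `ipmitool`, whose `mc watchdog` subcommand accepts `get`, `reset` and `off`. This is not what Dell or HP actually do (nobody in this thread knows that); it only shows the generic IPMI calls, issued over the LAN from some other machine. The helper names, the BMC address and the user are hypothetical, and `-E` again takes the password from `IPMI_PASSWORD`.

```python
import subprocess

# Subcommands of `ipmitool mc watchdog` used in this sketch.
WATCHDOG_ACTIONS = ("get", "reset", "off")


def watchdog_cmd(action, bmc_host, user):
    """Build the ipmitool command line for a BMC watchdog action."""
    if action not in WATCHDOG_ACTIONS:
        raise ValueError(f"unsupported watchdog action: {action!r}")
    return ["ipmitool", "-I", "lanplus", "-H", bmc_host, "-U", user, "-E",
            "mc", "watchdog", action]


def run_watchdog_cmd(action, bmc_host, user, runner=subprocess.run):
    """Run the command and return the completed process (stdout as text)."""
    return runner(watchdog_cmd(action, bmc_host, user),
                  capture_output=True, text=True)
```

`get` prints the current timer state, `reset` restarts the countdown using whatever settings the timer was last configured with, and `off` stops it. Note that `reset` only restarts an already configured timer, so the timeout and action would still need to be set up some other way first, for example through a BIOS option if the board offers one.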

I can see potential issues with the whole idea of having a hard WDT on the host anyway. If the particular VM doing the WDT reset had hung while the rest of the system was fine and up and running, it would cause the host to reboot, taking down however many VMs in the process, perhaps critical ones, depending on what you have running. So I would think its use has as much potential to wreak havoc as it has to help :)
 

Rand__

I would think that Dell/HP use their own management system, which accesses the local CIM agent on the box so that they can query it from their management console. Don't think this is intended as a one-box show...
 

pricklypunter

Yea, I agree. Whatever the big guys are doing to implement this feature, it's going to be a proprietary solution for sure :)