Good hell, my rear end hurts...just took a massive chewing from 'the Mrs.' for introducing performance degradation/instability to her VDI while she was mid-edit on a client's session.
Here's the backstory of my boneheaded misadventures over the last couple of hours.
FedEx arrives, delivering a new SFF-8643 to SFF-8087 cable to hook the free port on my 9300-8i up to four more slots on my SC216...ok, 'maintenance time'.
Begin to sVMotion roughly 700GB worth of VMs, about 40-50 of them, across my 3-node cluster...all goes well for quite some time; I am watching 'zpool iostat -v poolname 2' and things are cruising along. Skip ahead 30-45 mins, and the sVMotions mostly wrap up, except my wife's VDI session times out and a VDI RDS server stalls at 72% migrated; all the others finish. Cannot pull up a console on the wife's VDI VM, no RDP either. Attempt a reboot...go away, vSphere tells me...ok, reset...go away...shutdown...no dice. I start to panic and think, hell, maybe a services.sh restart from an SSH session into the ESXi host that the two stalled VMs are on may fix it...no fix...try to log in directly to the host and attempt a shutdown...no luck. Walk upstairs to get back to some 'real work' and hear the horrific sound none of us wanna hear...
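For anyone stuck in the same spot: before bouncing the whole management stack with services.sh, it's usually worth trying to kill just the hung VM's world from the ESXi shell. A rough sketch from an SSH session on the host (the Vmid/World ID values are placeholders, not from my setup):

```
vim-cmd vmsvc/getallvms                      # find the Vmid of the stuck VM
vim-cmd vmsvc/power.off <Vmid>               # orderly power-off attempt
esxcli vm process list                       # still hung? grab its World ID
esxcli vm process kill --type=soft --world-id=<WorldID>
esxcli vm process kill --type=force --world-id=<WorldID>   # last resort
```

Escalating soft -> force only on the one stuck VM would have left the other guests (and the wife's session) alone instead of restarting host services out from under everything.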
VVVVRRRRRROOOOOMMMMMMMMM as the server stack starts to spin up LOUD. FARK, I think to myself, how in the HELL did my stack just reboot...small power blip or just bad luck? Wife says nothing else blipped in the house, and I didn't notice anything either. I walk away knowing vSphere autostart/HA will take care of me; back 15 mins later and I am up. The sVMotions failed back gracefully as they should have, but left some cruft on the destination NFS storage; cleaned up, no biggie. Look at the hosts expecting to see all three with uptimes of 15-20 mins, and it's ONLY the two ESXi hosts running the AIO FreeNAS VMs and doing the HEAVY sVMotion I/O that had a heart attack.
WTF, I think to myself, that was freaking weird; the other host, also a FreeNAS AIO, has been up for 49 days...the EX4300 switch clearly rebooted as well.
Any ideas??? Just a bad IT juju day? CREEPY in my book! NEVER had that happen to me.
Good news is I am fully up and running. I was scrambling to look at snapshots/replications to see where I was gonna restore from if needed; I had them from a week back locally (kicks self that I did not snap prior to today's sVMotion activities) but also had them on another remote system with the same date stamp (double CYA). Unnecessary, thank goodness!
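Lesson learned for next time: a recursive snapshot right before the maintenance window is cheap insurance. A minimal sketch, assuming a pool named 'tank' and a remote box called 'backupbox' (both hypothetical names; the zfs commands are echoed here as a dry run rather than executed):

```shell
# Hypothetical pool name -- substitute your own.
SNAP="tank@pre-svmotion-$(date +%Y%m%d)"

# Recursive snapshot of the whole pool before shuffling VMs around:
echo "zfs snapshot -r ${SNAP}"

# Replicate the same snapshot to the remote box for the double-CYA copy:
echo "zfs send -R ${SNAP} | ssh backupbox zfs recv -Fu backup/tank"
```

Thirty seconds of typing up front would have turned "restore from a week-old snap" into "roll back to five minutes ago."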