Slow power on of VMs...

JimPhreak

Active Member
Oct 10, 2013
553
55
28
Technically you can, but the datastore browser doesn't behave the way a shared log file might. There's nothing to stop you downloading it after the fact and reading it at your leisure, but I generally find it helps to view these things in realtime.

SSH into the host and open the log file with something like `tail -f /vmfs/volumes/somevol/somevm/vmware.log` and you'll get almost-realtime log output in your terminal window (I think it refreshes every 1s by default), which can be a big help in spotting issues as they occur. VMware logs are hideous enough that examining them after the fact is a great way to convince yourself that you need to drown that pack of ibuprofen in cheap gin.
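
If the raw `tail -f` output is too noisy, the same idea can be sketched in Python with a keyword filter on top. This is only a rough sketch: `follow` and `matches` are made-up helper names, the keyword list is a guess at what's worth watching, and a line caught mid-write may come through in two pieces.

```python
import os
import time


def matches(line, keywords=("warning", "failed", "error")):
    """Return True if the line contains any keyword (case-insensitive)."""
    lowered = line.lower()
    return any(keyword in lowered for keyword in keywords)


def follow(path, poll_seconds=1.0, max_idle_polls=None):
    """Yield lines appended to `path`, roughly like `tail -f`.

    max_idle_polls stops the generator after that many consecutive
    empty polls; None means keep following forever.
    """
    idle = 0
    with open(path, "r", errors="replace") as handle:
        handle.seek(0, os.SEEK_END)  # skip what is already in the file
        while max_idle_polls is None or idle < max_idle_polls:
            line = handle.readline()
            if line:
                idle = 0
                yield line.rstrip("\n")
            else:
                idle += 1
                time.sleep(poll_seconds)
```

Usage would be something like `for line in follow("/vmfs/volumes/somevol/somevm/vmware.log"): ...`, printing only the lines where `matches(line)` is true.
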
I can't tell you how many times this doc has methodically and logically resolved issues for me.

http://communities.vmware.com/servl...952/vsphere41-performance-troubleshooting.pdf

Hope that link works; if not, just google 'vSphere performance Troubleshooting Guide' and focus in on the flow chart (starting on page 15) for whatever resource you feel, or find evidence, is being constrained. I know the guide is a bit old (it focuses on vSphere 4.1) but I could not find an updated one and the troubleshooting methodology is still spot on.

These may be useful as well...kiss your day goodbye. Good stuff w/in, I GUARANTEE it!

http://pubs.vmware.com/vsphere-51/t...er-server-51-monitoring-performance-guide.pdf

https://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.1.pdf

VMware KB: Troubleshooting ESX/ESXi virtual machine performance issues
Thank you both for providing great resources for me to look into. Looks like I've got some homework to do. I'll check back in once I have some time to dive into this and get some useful data to go off of.
 

EffrafaxOfWug

Radioactive Member
Feb 12, 2015
1,394
511
113
Everyone's a n00b at first...! :) Reading the docs is a great way to start but the only way you ever really get to geek out on a product is when you get weird "this shouldn't be happening" problems like this - something that VMware gets more than its fair share of simply because of its fairly mind-boggling complexity.

But yeah... this is interesting because you've pretty much got a KISS setup with some very fast (albeit fairly bleeding edge) IO so I'm kinda in the dark about why ESX would be waiting for 45s to bring a guest up. Actually, just to rule that out, have you got any other storage presented to this box that you can try moving a VM to, to see if the issue occurs there?
 

JimPhreak

That was my first thought when I noticed this issue, since I had to manually install the drivers for the SATA AHCI controller in order to get the Samsung SM951 M.2 drive to show up as a device in VMware. However, I've already tried moving VMs to the Intel 730 480GB SSD I've got as a separate datastore and the issue still exists.
 

JimPhreak

Couple of things I just noticed in my vmkernel.log. First off, the timestamps in all my logs are 6 hours ahead of my ESXi host, which is configured with NTP. I can't get the log times to change no matter what I do with the ESXi host (disable NTP, change NTP servers, restart the service, etc.)
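
For what it's worth, the trailing `Z` on those timestamps is the ISO 8601 marker for UTC, so a fixed offset from local time is what you'd expect to see in the logs rather than an NTP fault. A minimal Python sketch for shifting a log timestamp into local time, assuming a UTC-6 zone (inferred from the "6 hours ahead" above, not confirmed) and using made-up helper names:

```python
from datetime import datetime, timedelta, timezone


def parse_esxi_timestamp(stamp):
    """Parse a stamp like '2015-06-11T16:14:41.552Z' as timezone-aware UTC."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def to_local(stamp, utc_offset_hours):
    """Convert a UTC log stamp to a fixed-offset local time.

    utc_offset_hours is an assumption supplied by the caller
    (-6 here), not something read from the host.
    """
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return parse_esxi_timestamp(stamp).astimezone(local_zone)


print(to_local("2015-06-11T16:14:41.552Z", -6).isoformat())
# prints 2015-06-11T10:14:41.552000-06:00
```
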

But anyway, in the logs I'm seeing these messages every time I have a USB device passed through to a VM. They stop the moment I shut off that VM (this goes for my unRAID VM or a Windows VM).

2015-06-11T16:14:41.552Z cpu10:717671)<6>usb 2-1.2: Device is allocated for USB passthrough use; not available for VMkernel use
2015-06-11T16:14:41.668Z cpu0:717671)<6>usb 2-1.2: reset high speed USB device number 10 using ehci_hcd
2015-06-11T16:14:42.284Z cpu7:717544)<6>usb 2-1.2: device is available for passthrough


Then I just noticed this in the log (there are multiple of these relating to more than one SSD):

2015-06-11T16:20:00.049Z cpu1:32902)WARNING: LinScsi: SCSILinuxQueueCommand:1207: queuecommand failed with status = 0x1056 Unknown status vmhba0:0:0:0 (driver name: ahci) - Message repeated 1 time
2015-06-11T16:20:00.049Z cpu15:35120)ScsiDeviceIO: 2324: Cmd(0x412e85e7e080) 0x28, CmdSN 0x39c1c from world 32902 to dev "t10.ATA_____SATA_SSD________________________________96D707531A2400148195" failed H:0x0 D:0x8 P:0x0 Possible sense data: 0x0 0x0 0x0.
2015-06-11T16:20:00.467Z cpu2:32799)ScsiDeviceIO: 2324: Cmd(0x412e80812640) 0x85, CmdSN 0x4b7a from world 0 to dev "t10.ATA_____SATA_SSD________________________________96D707531A2400148195" failed H:0x0 D:0x8 P:0x0 Possible sense data: 0x0 0x0 0x0.


Not sure what these mean or what, if anything, they're affecting.
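
For anyone trying to read lines like these: in the ScsiDeviceIO messages the value right after `Cmd(...)` is the SCSI opcode, and the `H:`/`D:`/`P:` triplet is the host, device, and plugin status. `D:0x8` is the standard SCSI BUSY status, `0x28` is READ(10), and `0x85` is ATA PASS-THROUGH(16). Below is a rough Python sketch that pulls those fields out of a log line. The lookup tables are deliberately partial, the regex is only fitted to the two lines quoted above, and `decode` is a made-up helper name.

```python
import re

# Partial tables of standard SCSI values; anything missing is shown as raw hex.
DEVICE_STATUS = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x08: "BUSY",
    0x18: "RESERVATION CONFLICT",
    0x28: "TASK SET FULL",
}
OPCODES = {
    0x28: "READ(10)",
    0x2A: "WRITE(10)",
    0x85: "ATA PASS-THROUGH(16)",
}

PATTERN = re.compile(
    r"Cmd\(0x[0-9a-f]+\) (0x[0-9a-f]+),.*?"
    r"failed H:(0x[0-9a-f]+) D:(0x[0-9a-f]+) P:(0x[0-9a-f]+)",
    re.IGNORECASE,
)


def decode(line):
    """Return the decoded opcode and status fields, or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    opcode, host, device, plugin = (int(value, 16) for value in match.groups())
    return {
        "opcode": OPCODES.get(opcode, hex(opcode)),
        "host_status": host,  # 0 means the host side reported no error
        "device_status": DEVICE_STATUS.get(device, hex(device)),
        "plugin_status": plugin,
    }
```

Feeding it the first ScsiDeviceIO line above gives an opcode of READ(10) and a device status of BUSY, i.e. a read that the device answered with "busy, try again".
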
 

JimPhreak

Just want to report back that my issues with power on seem to have been resolved. It looks like simply re-seating my M1015 PCIe card did the trick. Who knew?!
 

EffrafaxOfWug

Aha, sorry Jim, I'd missed your post from Thursday - and yes, as you rightly surmised, those error messages would have given rise to freezes on your local datastores; even if you don't have owt plugged into the M1015, I suspect it would still cause freezes on the whole SCSI stack with a knock-on effect for AHCI gubbins, which would explain the long power-up time whilst vmkernel waits to make sure the device is reachable. Glad you've got it sorted with a simple fix anyhoo.
 

whitey

Moderator
Jun 30, 2014
2,766
868
113
41
Funny, but glad it was an easy fix... kinda odd that the issue did not manifest/surface in the esxtop stats. Hmm, color me perplexed.
 

JimPhreak

Well, none of the disks attached to my M1015 were in use by VMware. The only disks I set up datastores on were connected to my onboard controllers.
 

whitey

So did you move them to the M1015, or just simply re-seat the HBA and the onboard controllers started playing nice/becoming responsive to VM power-on operations?
 

JimPhreak

All I did was re-seat the PCIe card. When I first got the card I only had a standard PCI bracket for it, not realizing the SuperMicro chassis was low profile. So I actually had the M1015 in the slot with no bracket. Once I got the low profile bracket I re-seated the M1015 and now everything is working.
 

whitey

Wow and cool... that's a SMH moment for sure... just what is the M1015 in there for then? Future expansion? AIO? Pile o' SSDs to form an all-flash array? Do tell.
 

JimPhreak

Attached to the M1015 are 4 x 8TB Seagate shingled drives + 1 x Intel 730 SSD that I'm passing through to my unRAID storage VM.
 