Hello everyone! I came across your forum trying to find a solution to a problem I am having with my server and I thought I'd make a post and see if anyone here has any ideas.
Here is my setup:
Motherboard - ASRock 970 Extreme4
CPU - AMD FX-8320
Memory - 32GB G.Skill Ripjaws X Series, 4 x 8GB DDR3-1600 PC3-12800 (F3-1600C9Q-32GXM)
Drive Controller - IBM M1015 flashed to LSI SAS2008
NIC - Intel PRO/1000 Dual Port Server Adapter (disabled the onboard Realtek NIC)
On the M1015, I'm running three 4TB Seagate NAS ST4000VN000 drives, two 3TB Seagate drives (not sure the model), and two 2TB Western Digital Green drives. I have a couple of 256GB SSDs and a 750GB WD plugged into the motherboard's onboard controllers.
Here's how the VMs break out:
WHS 2011 - 8GB RAM, 4 cores (1 socket), passed through the M1015 (with its 8 disks), plus two virtual disks (one on an SSD for the OS, one on the 750GB drive for torrents). I am running StableBit DrivePool. I was running FlexRAID (and tried SnapRAID too), but I'll get to that later.
Windows 7 VM 1 - hosts an Emby server and is what I use for playing around - 8GB RAM, 4 cores (1 socket), 1 virtual disk on an SSD
Windows 7 VM 2 - hosts a Plex server - 8GB RAM, 4 cores (1 socket), 1 virtual disk on an SSD
Windows XP VM - 1GB RAM, 1 core, 1 virtual disk on an SSD (not usually running)
All three of the regularly running VMs use the VMXNET 3 network adapter.
So here is what happens: occasionally, the server completely locks up. I can't see it on my network (neither the host nor any of the VMs). If I plug a monitor into the box, the host still seems to be working (I can plug in a keyboard and do the few things you can do that way), but I have no way of telling whether the VMs are still running and just disconnected from the network, or whether all of them (or only some of them) are locked up. The only way to get back up and running is a hard reboot of the whole box.
I initially thought this was a networking issue, so I tried a couple of things. I tried different virtual network adapter types, and then went out and bought the Intel NIC I'm using now (I had been using the onboard Realtek despite warnings that ESXi doesn't play nice with them). Neither change made any difference.
I was using FlexRAID for parity redundancy. I noticed the server often locked up when I made big changes (like ripping a new Blu-ray to the array), so I figured it might be something with FlexRAID. I ditched it for SnapRAID for a time, but the problem remained. I've since turned off both and just run FlexRAID manually once in a while to make sure I'm covered. It's hit and miss whether that locks things up.
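For reference, when I was on SnapRAID my config was along these lines (the drive letters and paths here are examples, not my exact layout):

```
# snapraid.conf sketch -- example paths, not my exact drive letters
parity E:\snapraid.parity               # parity file on one of the 4TB disks
content C:\snapraid\snapraid.content    # content list on the OS disk
content F:\snapraid.content             # second copy on a data disk
disk d1 F:\                             # 4TB Seagate
disk d2 G:\                             # 3TB Seagate
disk d3 H:\                             # 2TB WD Green
exclude \$RECYCLE.BIN\
exclude \System Volume Information\
```

I'd then just run `snapraid sync` inside the WHS VM after big changes instead of leaving anything scheduled, which is roughly what I do with FlexRAID now.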
I did go to the VMware forums for help, but never really got anywhere. I posted my logs, but no one could point me to what I should be looking for. The one thing I took away was that it looked like an I/O issue related to the drives, not a problem with the NICs (physical or virtual). I have noticed since then that it does seem to happen more when there are lots of reads and writes (like when I'm running a bunch of torrents at the same time).
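The main thing I've learned from that thread is to copy vmkernel.log off the host after a hang and grep it for storage errors separately from NIC errors. Here's a sketch of the filtering I do; the sample log lines below are made up just to show the patterns, not pulled from my real logs:

```shell
# Stand-in for a vmkernel.log copied off the host (real one lives at
# /var/log/vmkernel.log on ESXi; I scp it to my desktop after a hang).
cat > /tmp/vmkernel.sample.log <<'EOF'
2015-08-01T02:11:43Z cpu3:33400)ScsiDeviceIO: 2338: Cmd(0x413980808c80) 0x2a failed H:0x5 D:0x0 P:0x0
2015-08-01T02:11:44Z cpu1:32784)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x2a to dev "naa.5000c50066211111" failed
2015-08-01T02:12:02Z cpu0:32790)WARNING: vmnic0: transmit watchdog timed out
EOF

# Storage-side trouble: SCSI command failures and path errors
# (H:0x5 is a host-status code that, as far as I understand it, means
# the command was aborted -- exactly the "I/O issue" symptom).
grep -Ei 'scsi|nmp|abort|failed h:' /tmp/vmkernel.sample.log

# NIC-side trouble, to rule the network in or out separately:
grep -Ei 'watchdog|vmnic' /tmp/vmkernel.sample.log
```

If the first grep lights up around the time of a lockup and the second doesn't, that lines up with what the VMware forum folks told me (drives, not NICs).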
Beyond that, I'm stumped. I upgraded from ESXi 5.5 to 6.0 hoping that might make a difference, but it hasn't. Outside of this problem I've really liked ESXi, but my inability to solve it has me thinking about ditching it. I'm by no means a power user. I looked at Windows Server with Hyper-V, but frankly I find it too confusing to get straight. I was also looking tonight at setting up Windows 10 as the host with Hyper-V enabled and just doing that, but I'm not sure that would be a good solution.
So, any ideas on how I can fix my ESXi setup? Any help would be greatly appreciated!