Hi,
Has anyone encountered stability issues with ESXi 6.0 using an NFS datastore on OmniOS+Napp-it, where in one very specific situation the datastore drops into the APD (All Paths Down) state? Everything else works fine: powering up a VM, restarting the VM guest, even a high-IOPS load test. The trigger is powering off any VM that has CBT turned on.
It was very difficult to troubleshoot. At first I thought it was my HBA (an LSI), then I suspected my SSD pool (a combo of Samsung 850 + Crucial MX200), and finally I narrowed it down (albeit not with 100% certainty) to the ESXi 6.0 NFS code being incompatible with OmniOS's NFS (and potentially other variants) under very specific conditions.
The symptom in vmkernel.log is shown below. These lines appear immediately after issuing a Stop-VM command:
2015-06-22T00:35:30.788Z cpu0:65098)CBT: 1468: Disconnecting the cbt device 1e01de-cbt with filehandle 1966558
2015-06-22T00:35:45.158Z cpu1:32860)StorageApdHandler: 1204: APD start for 0x4304dc32efa0 [12860b4d-5de423fb]
2015-06-22T00:35:45.158Z cpu0:32907)StorageApdHandler: 421: APD start event for 0x4304dc32efa0 [12860b4d-5de423fb]
2015-06-22T00:35:45.158Z cpu0:32907)StorageApdHandlerEv: 110: Device or filesystem with identifier [12860b4d-5de423fb] has entered the All Paths Down state.
2015-06-22T00:36:02.158Z cpu0:32819)WARNING: SwapExtend: vm 65098: 426: Failed to truncate swapfile /vmfs/volumes/12860b4d-5de423fb/wlc1/vmx-wlc1-1912640390-1.vswp to 0 bytes: I/O error
2015-06-22T00:36:04.158Z cpu1:33213)NFSLock: 612: Stop accessing fd 0x4301b5496bd0 3
2015-06-22T00:36:04.158Z cpu1:33213)NFSLock: 612: Stop accessing fd 0x4301b549fff0 3
201
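In case it helps anyone check their own logs for the same pattern, here is a rough Python sketch that scans vmkernel.log lines for a CBT disconnect followed shortly by an APD start. The function name `find_cbt_then_apd` and the 60-second window are my own assumptions for illustration, and the matching is based only on the message text in the excerpt above, so other ESXi builds may word these lines differently.

```python
import re
from datetime import datetime

# Timestamp prefix as it appears in the vmkernel.log excerpt above.
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})Z\s")


def parse_ts(line):
    """Return the line's timestamp as a datetime, or None if it has none."""
    m = TS_RE.match(line)
    if not m:
        return None
    return datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S.%f")


def find_cbt_then_apd(lines, window_seconds=60):
    """Hypothetical helper: list (cbt_line, apd_line, gap_seconds) for every
    CBT disconnect that is followed by an APD start within window_seconds."""
    hits = []
    last_cbt = None  # (timestamp, line) of the most recent CBT disconnect
    for line in lines:
        line = line.rstrip("\n")
        ts = parse_ts(line)
        if ts is None:
            continue  # skip truncated or non-timestamped lines
        if "CBT:" in line and "Disconnecting the cbt device" in line:
            last_cbt = (ts, line)
        elif "StorageApdHandler:" in line and "APD start for" in line and last_cbt:
            gap = (ts - last_cbt[0]).total_seconds()
            if 0 <= gap <= window_seconds:
                hits.append((last_cbt[1], line, gap))
            last_cbt = None  # pair each disconnect with at most one APD start
    return hits
```

Assuming you have copied vmkernel.log off the host, something like `find_cbt_then_apd(open("vmkernel.log"))` should return one entry per suspicious power-off. On the excerpt above, the gap between the CBT disconnect and the APD start works out to roughly 14.4 seconds.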
Has anyone else encountered this?
regards,
Uto