HP SL230s, H220i HBA, ESXi 6, shitty I/O

msg7086

Active Member
May 2, 2017
423
148
43
36
Hi Guys!

Just finished putting the S6500+SL230s together, and installed ESXi 6 on some of those boxes. However, I noticed that the I/O speed in ESXi is terrible: running dd on just 10MB of data never finishes.

Looking at vmkernel.log, I got these:

Code:
2017-06-15T01:57:18.839Z cpu6:33157)<3>mpt2sas0: fault_state(0x7c22)!                                                                                                           
2017-06-15T01:57:18.839Z cpu6:33157)<6>mpt2sas0: sending diag reset !!                                                                                                          
2017-06-15T01:57:20.185Z cpu0:33363)WARNING: LinScsi: SCSILinuxQueueCommand:1260: queuecommand failed with status = 0x1055 Host Busy vmhba0:0:0:0 (driver name: Fusion MPT SAS Host) - Message repeated 1 time
2017-06-15T01:57:20.185Z cpu5:33157)NMP: nmp_ThrottleLogForDevice:3298: Cmd 0x2a (0x43b580be3940, 32786) to dev "naa.5000c50063a51266" on path "vmhba0:C0:T0:L0" Failed: H:0x0 D:0x8 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:NONE
2017-06-15T01:57:20.185Z cpu5:33157)ScsiDeviceIO: 2613: Cmd(0x43b580be3940) 0x2a, CmdSN 0xa5 from world 32786 to dev "naa.5000c50063a51266" failed H:0x0 D:0x8 P:0x0 Possible sense data: 0x0 0x0 0x0.                              
2017-06-15T01:57:21.563Z cpu4:33143)ScsiDeviceIO: 2613: Cmd(0x43b580bc1480) 0x28, CmdSN 0x615 from world 32915 to dev "naa.5000c50063a51266" failed H:0x0 D:0x8 P:0x0 Possible sense data: 0x0 0x0 0x0.                             
2017-06-15T01:57:21.615Z cpu8:33151)NMP: nmp_ThrottleLogForDevice:3231: last error status from device naa.5000c50063a51266 repeated 10 times                                                                                        
2017-06-15T01:57:22.656Z cpu8:35420)NMP: nmp_ThrottleLogForDevice:3231: last error status from device naa.5000c50063a51266 repeated 20 times                                                                                        
2017-06-15T01:57:23.185Z cpu8:35420)ScsiDeviceIO: 2613: Cmd(0x43b580badf80) 0x2a, CmdSN 0xae from world 32786 to dev "naa.5000c50063a51266" failed H:0x0 D:0x8 P:0x0 Possible sense data: 0x0 0x0 0x0.                              
2017-06-15T01:57:23.480Z cpu4:33157)<6>mpt2sas0: diag reset: SUCCESS                                                                                                                                                                
2017-06-15T01:57:23.480Z cpu8:35420)NMP: nmp_ThrottleLogForDevice:3248: last error status from device naa.5000c50063a51266 repeated 20 times                                                                                        
2017-06-15T01:57:23.480Z cpu8:35420)NMP: nmp_ThrottleLogForDevice:3298: Cmd 0x93 (0x43b580be2140, 32914) to dev "naa.5000c50063a51266" on path "vmhba0:C0:T0:L0" Failed: H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2017-06-15T01:57:23.480Z cpu8:35420)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.5000c50063a51266" state in doubt; requested fast path state update...                                                       
2017-06-15T01:57:23.487Z cpu8:35420)NMP: nmp_ThrottleLogForDevice:3298: Cmd 0x28 (0x43b580bc1480, 32915) to dev "naa.5000c50063a51266" on path "vmhba0:C0:T0:L0" Failed: H:0x0 D:0x8 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:NONE
2017-06-15T01:57:23.689Z cpu8:35420)NMP: nmp_ThrottleLogForDevice:3231: last error status from device naa.5000c50063a51266 repeated 10 times                                                                                        
2017-06-15T01:57:23.689Z cpu8:35420)ScsiDeviceIO: 2613: Cmd(0x43b580c5ed00) 0x2a, CmdSN 0xb5 from world 32786 to dev "naa.5000c50063a51266" failed H:0x0 D:0x8 P:0x0 Possible sense data: 0x0 0x0 0x0.                              
2017-06-15T01:57:23.761Z cpu8:35420)ScsiDeviceIO: 2613: Cmd(0x43b580c5eb80) 0x2a, CmdSN 0xb6 from world 32786 to dev "naa.5000c50063a51266" failed H:0x0 D:0x8 P:0x0 Possible sense data: 0x0 0x0 0x0.                              
2017-06-15T01:57:24.164Z cpu8:33148)NMP: nmp_ThrottleLogForDevice:3231: last error status from device naa.5000c50063a51266 repeated 20 times                                                                                        
2017-06-15T01:57:24.898Z cpu4:33157)<6>mpt2sas0: LSISAS2308: FWVersion(11.10.07.00), ChipRevision(0x01), BiosVersion(07.20.10.00)                                                                                                   
2017-06-15T01:57:24.898Z cpu4:33157)<6>mpt2sas0: HP H220i Host Bus Adapter                                                                                                                                                          
<6>mpt2sas0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ2017-06-15T01:57:24.898Z cpu4:33157))                                                                           
2017-06-15T01:57:24.898Z cpu4:33157)<6>mpt2sas0: sending port enable !!                                                                                                                                                             
2017-06-15T01:57:25.204Z cpu5:32789)NMP: nmp_ThrottleLogForDevice:3231: last error status from device naa.5000c50063a51266 repeated 40 times                                                                                        
2017-06-15T01:57:26.762Z cpu5:32789)ScsiDeviceIO: 2613: Cmd(0x43b580c033c0) 0x2a, CmdSN 0xbf from world 32786 to dev "naa.5000c50063a51266" failed H:0x0 D:0x8 P:0x0 Possible sense data: 0x0 0x0 0x0.                              
2017-06-15T01:57:27.021Z cpu5:32789)NMP: nmp_ThrottleLogForDevice:3231: last error status from device naa.5000c50063a51266 repeated 80 times                                                                                        
2017-06-15T01:57:29.185Z cpu8:33363)WARNING: HBX: 1266: Lease has expired on vol datastore1 [c: 187622 m: 187622]                                                                                                                   
2017-06-15T01:57:29.206Z cpu7:32802)ScsiDeviceIO: 2595: Cmd(0x43b580be2140) 0x93, CmdSN 0xa4 from world 32914 to dev "naa.5000c50063a51266" failed H:0x8 D:0x0 P:0x0                                                                
2017-06-15T01:57:29.206Z cpu7:32802)ScsiDeviceIO: 3695: Restricting cmd 0x8a (1048576 bytes) from WID 32914 to quiesced dev naa.5000c50063a51266:3 (vmkCmd=0x43b580be2140 cmdId: 0x4301c6116800, 0xc0)
2017-06-15T01:57:29.206Z cpu7:32802)ScsiDeviceIO: 2595: Cmd(0x43b580be2140) 0x8a, CmdSN 0xc0 from world 32914 to dev "naa.5000c50063a51266" failed H:0x8 D:0x0 P:0x0                                 
2017-06-15T01:57:29.206Z cpu3:32914)FS3DM: 2767: status IO was aborted by VMFS via a virt-reset on the device zeroing 1 extents (1048576 each)                                                       
2017-06-15T01:57:29.206Z cpu3:32914)J3: 3302: Aborting txn (0x4305d667ab20) callerID: 0xc1d00006 due to failure pre-committing: IO was aborted by VMFS via a virt-reset on the device                
2017-06-15T01:57:29.206Z cpu3:33379)HBX: 2851: 'datastore1': HB at offset 3346432 - Waiting for timed out HB:                                                                                        
2017-06-15T01:57:29.206Z cpu3:33379)  [HB state abcdef02 offset 3346432 gen 11 stampUS 175622045 uuid 5941e8ea-1475f50b-3818-2c4138ebd654 jrnl <FB 2315600> drv 14.61 lockImpl 3]                    
2017-06-15T01:57:29.287Z cpu7:33511)ScsiDeviceIO: 2613: Cmd(0x43b580bae100) 0x16, CmdSN 0x5fe from world 0 to dev "naa.5000c50063a51266" failed H:0x0 D:0x8 P:0x0 Possible sense data: 0x0 0x0 0x0.  
2017-06-15T01:57:29.815Z cpu10:33011)NMP: nmp_ThrottleLogForDevice:3231: last error status from device naa.5000c50063a51266 repeated 160 times                                                       
2017-06-15T01:57:31.208Z cpu3:32914)HBX: 2851: 'datastore1': HB at offset 3346432 - Waiting for timed out HB:                                                                                        
2017-06-15T01:57:31.208Z cpu3:32914)  [HB state abcdef02 offset 3346432 gen 11 stampUS 175622045 uuid 5941e8ea-1475f50b-3818-2c4138ebd654 jrnl <FB 2315600> drv 14.61 lockImpl 3] 
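
For anyone reading along, here is a minimal sketch (not an official VMware tool) that pulls the SCSI opcode and the H:/D:/P: status fields out of log lines like the ones above. The opcode names come from the standard SCSI command set. The status names are my assumption based on the usual SCSI status codes and VMware's documented host status values: D:0x8 should be a device BUSY status, and H:0x8 should be a host-side RESET, which would line up with the "sending diag reset" messages from mpt2sas0. Only the values that appear in this log are mapped, and `decode` is a hypothetical helper name.

```python
import re

# Standard SCSI opcodes that show up in the log above.
SCSI_OPCODES = {
    0x16: "RESERVE(6)",
    0x28: "READ(10)",
    0x2A: "WRITE(10)",
    0x8A: "WRITE(16)",
    0x93: "WRITE SAME(16)",
}

# Assumption: only the status values seen in this log are mapped here.
HOST_STATUS = {0x0: "OK", 0x8: "RESET"}
DEVICE_STATUS = {0x0: "GOOD", 0x8: "BUSY"}

# Matches both the NMP form ("Cmd 0x2a (0x..., world)") and the
# ScsiDeviceIO form ("Cmd(0x...) 0x2a, CmdSN ...").
LINE_RE = re.compile(
    r'(?:Cmd 0x(?P<op1>[0-9a-f]+) \(|Cmd\(0x[0-9a-f]+\) 0x(?P<op2>[0-9a-f]+),)'
    r'.*?dev "(?P<dev>[^"]+)"'
    r'.*?H:0x(?P<h>[0-9a-f]+) D:0x(?P<d>[0-9a-f]+) P:0x(?P<p>[0-9a-f]+)'
)


def decode(line):
    """Return a dict describing a failed SCSI command, or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    opcode = int(m.group("op1") or m.group("op2"), 16)
    host = int(m.group("h"), 16)
    device = int(m.group("d"), 16)
    return {
        "dev": m.group("dev"),
        "command": SCSI_OPCODES.get(opcode, hex(opcode)),
        "host": HOST_STATUS.get(host, hex(host)),
        "device": DEVICE_STATUS.get(device, hex(device)),
    }


# One line copied from the log above.
sample = (
    '2017-06-15T01:57:23.480Z cpu8:35420)NMP: nmp_ThrottleLogForDevice:3298: '
    'Cmd 0x93 (0x43b580be2140, 32914) to dev "naa.5000c50063a51266" on path '
    '"vmhba0:C0:T0:L0" Failed: H:0x8 D:0x0 P:0x0 Possible sense data: '
    '0x0 0x0 0x0. Act:EVAL'
)
print(decode(sample))
```

Read this way, most of the failures are reads and writes bounced with a BUSY status while the HBA firmware is in a fault state, and the H:0x8 entries are commands killed by the controller reset itself.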

I've tried ESXi 6U2 and ESXi 6U3 HP ver, and the mpt2sas module versions are:
vmkload_mod module information
input file: /usr/lib/vmware/vmkmod/mpt2sas
Version: Version 15.10.06.00, Build: 1331820, Interface: 9.2 Built on: Jul 23 2015

and

vmkload_mod module information
input file: /usr/lib/vmware/vmkmod/mpt2sas
Version: Version 19.00.00.00.1vmw, Build: 2494585, Interface: 9.2 Built on: Feb 5 2015


Tried switching between SATA Legacy and SATA AHCI, and tried turning the B320i on and off (the B320i doesn't work anyway).

Also tried 3 different blade nodes and all behave the same.

Disk speed is perfect when booting off a Linux LiveCD.

Any ideas?
 

msg7086

Works fine on ESXi 6.5.

Drivers used:
vmkload_mod -s lsi_msgpt2
vmkload_mod module information
input file: /usr/lib/vmware/vmkmod/lsi_msgpt2
Version: 20.00.01.00-3vmw.650.0.0.4564106
Build Type: release
License: Proprietary

Instead of mpt2sas, it uses the lsi_msgpt2 driver. Is there any way to port the lsi_msgpt2 driver back to 6.0?
 

msg7086

The controller happened to have firmware 11.10.07.00.
I updated it to 15.10.10.00 and it's running much better.

However, the I/O is still not optimal: dd of 1GB of data takes 1 minute, which is too slow for the 4TB Seagate drive.

The lsi_msgpt2 driver gives me 100MB/s while this one barely does 16MB/s.
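
As a quick sanity check on those numbers, here is a small sketch of the arithmetic. It assumes "1GB" means either 10^9 bytes or 2^30 bytes (dd's own reporting depends on how the size was specified), and `throughput_mb_s` is just a hypothetical helper name.

```python
def throughput_mb_s(num_bytes, seconds):
    """Average throughput in MB/s, with 1 MB taken as 10**6 bytes."""
    return num_bytes / seconds / 1e6


# 1 GB in one minute, under both readings of "GB".
decimal_gb = throughput_mb_s(10**9, 60)
binary_gib = throughput_mb_s(2**30, 60)

print(round(decimal_gb, 1))  # 16.7
print(round(binary_gib, 1))  # 17.9
```

Either way, one minute per gigabyte works out to roughly 16 to 18 MB/s, in line with the figure quoted for mpt2sas.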
 

msg7086

Installing async drivers in ESXi 5.x/6.x using esxcli and offline bundle (2137853) | VMware KB

Looks like you can install drivers from the upgrade bundle which should have the 6.5 drivers in it.
Thank you. The async driver is interesting.

Well, I just did more tests: I put a Linux guest on it and ran some dd inside the guest OS. I got 20MB/s on a lazy-zeroed disk and 135MB/s on an eager-zeroed disk.

Code:
VIBs Installed: Avago_bootbank_scsi-mpt2sas_15.10.07.00-1OEM.550.0.0.1331820
VIBs Removed: Avago_bootbank_scsi-mpt2sas_15.10.06.00-1OEM.550.0.0.1331820
Indeed they are both async drivers.
 

msg7086

Tried another installation: the ESXi 6.0 inbox 19.x driver gives me 182MB/s on a lazy-zeroed disk.

Also FYI, the reported temperature of the HD-Controller went down from 68C to 47C after I flashed the latest firmware.
 

msg7086

Alright, things are getting worse. I deployed a vCenter on the box, and the disk image got corrupted before its first boot. I ran fsck on the volume and saw tons of errors.

Any ideas are welcome!
 

nthu9280

Well-Known Member
Feb 3, 2016
1,628
498
83
San Antonio, TX
I know this is an old thread, but I was troubleshooting something else (my post from a couple of days ago) and ran across the following.
***
Troubleshooting native drivers in ESXi 5.5 or later

VMware Knowledge Base
ESXi 5.5 introduces a new Native Device Driver Architecture Part 2

If you are running ESXi 5.5 or higher AND the mpt3sas driver was installed, it’s required to disable the lsi_msgpt3 native driver in order to use the mpt3sas driver. Run the following command then reboot the system:

esxcli system module set --enabled=false --module=lsi_msgpt3
***

In my case I needed mpt2sas, so I had to disable lsi_msgpt2 instead.