I have two LSI SAS2008 9212-4i4e HBAs in my server, both with firmware 20.00.04.00. One is in IR mode and used for an ESXi datastore; the other is in IT mode, passed through to a VM (FreeNAS).
Both Intel SSD DC S3700 drives are on the latest firmware, 5DV12270. When the array is working properly, I can reach close to 100k IOPS at 100% 4 kB reads and 50k IOPS at 100% 4 kB writes; when it is misbehaving, Microsoft's diskspd benchmarking tool often shows no more than 300-800 IOPS at 4 kB random I/O, with roughly 75% of latencies below 1-2 ms and the remaining 25% at 180+ ms.
The LSI card in IR mode has two Intel SSD DC S3700 400GB drives in a RAID 1 array.
I did a secure erase of the Intel drives on a separate PC, then constructed the array from MSM (MegaRAID Storage Manager) in Windows. When initialization was completed, I moved the LSI card and the SSDs to the server.
After the server booted into ESXi, the array showed up as Enabled/Optimal with no operation ongoing. Creating the VMFS partition took longer than expected, perhaps a minute or two. The latency of the storage array is abysmal: average read and write latency is in the range of 80-200 ms!
Code:
$ /opt/lsi/bin/sas2ircu 0 status
LSI Corporation SAS2 IR Configuration Utility.
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved.
Background command progress status for controller 0...
IR Volume 1
Volume ID : 286
Current operation : None
Volume status : Enabled
Volume state : Optimal
Volume wwid : 084ba55b27b4b25b
Physical disk I/Os : Not quiesced
SAS2IRCU: Command STATUS Completed Successfully.
SAS2IRCU: Utility Completed Successfully.
This was continuous for several hours, after which I left the array with only a single Windows 10 VM for a day and a half. After letting it sit, the array had stabilized, and I now had reasonable latency with averages of 1-3 ms. Worst-case latency during 4 and 8 kB read/write random IOPS benchmarking using diskspd was 20 ms, as expected with S3700 drives.
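For reference, the 4 kB random results above came from diskspd; an invocation along these lines reproduces that kind of workload (the target file, its size, and the 30% write mix are illustrative, not my exact parameters):

```shell
:: 4 kB blocks, random I/O, 4 threads, 32 outstanding I/Os per thread,
:: 60 s run, 30% writes, OS/hardware caching disabled (-Sh),
:: latency percentiles reported (-L). File path and -c size are illustrative.
diskspd.exe -c1G -b4K -d60 -t4 -o32 -r -w30 -Sh -L D:\test.dat
```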
Performance was great for several days, until I restarted the server yesterday. After the warm restart, the array is once more delivering terrible performance, even though the status still shows Optimal. I would equate it to the performance I saw during background initialization.
Code:
$ /opt/lsi/bin/sas2ircu 0 display
LSI Corporation SAS2 IR Configuration Utility.
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved.
Read configuration has been initiated for controller 0
------------------------------------------------------------------------
Controller information
------------------------------------------------------------------------
Controller type : SAS2008
BIOS version : 7.39.00.00
Firmware version : 20.00.04.00
Channel description : 1 Serial Attached SCSI
Initiator ID : 0
Maximum physical devices : 255
Concurrent commands supported : 1720
Slot : 4
Segment : 0
Bus : 1
Device : 0
Function : 0
RAID Support : Yes
------------------------------------------------------------------------
IR Volume information
------------------------------------------------------------------------
IR volume 1
Volume ID : 286
Volume Name : Intel_RAID1
Status of volume : Okay (OKY)
Volume wwid : 084ba55b27b4b25b
RAID level : RAID1
Size (in MB) : 380516
Physical hard disks :
PHY[0] Enclosure#/Slot# : 1:4
PHY[1] Enclosure#/Slot# : 1:5
------------------------------------------------------------------------
Physical device information
------------------------------------------------------------------------
Initiator at ID #0
Device is a Hard disk
Enclosure # : 1
Slot # : 4
SAS Address : 4433221-1-0400-0000
State : Optimal (OPT)
Size (in MB)/(in sectors) : 381554/781422767
Manufacturer : ATA
Model Number : INTEL SSDSC1NA40
Firmware Revision : 2270
Serial No : BTTV3172006Y400BGN
GUID : 50015178f3610b3b
Protocol : SATA
Drive Type : SATA_SSD
Device is a Hard disk
Enclosure # : 1
Slot # : 5
SAS Address : 4433221-1-0500-0000
State : Optimal (OPT)
Size (in MB)/(in sectors) : 381554/781422767
Manufacturer : ATA
Model Number : INTEL SSDSC1NA40
Firmware Revision : 2270
Serial No : BTTV31720202400BGN
GUID : 50015178f3611312
Protocol : SATA
Drive Type : SATA_SSD
------------------------------------------------------------------------
Enclosure information
------------------------------------------------------------------------
Enclosure# : 1
Logical ID : 500605b0:04640ee0
Numslots : 8
StartSlot : 0
------------------------------------------------------------------------
SAS2IRCU: Command DISPLAY Completed Successfully.
SAS2IRCU: Utility Completed Successfully.
Looking at the SAS-2 Integrated RAID Solution User Guide it lists a number of automatic background tasks.
SAS-2 Integrated RAID Solution User Guide (975 KB)
Perhaps I should not have, but after the most recent reboot and the high latency, I waited a few minutes and then manually triggered a consistency check. It started, but it runs incredibly slowly, at a mere 2% progress per hour, just over 2 MB/s! Amazing! The array is almost unused by ESXi according to esxtop, with perhaps 5-10 operations per second on average (more like bursts of 20-100 every 3-5 seconds). Latency averages have increased 2-3x, to 200-800 ms.

From the user guide:

2.4.5 Media Verification
The Integrated RAID firmware supports a background media verification feature that runs at regular intervals when the mirrored volume is in the Optimal state. If the verification command fails for any reason, the firmware reads the other disk’s data for this segment and writes it to the failing disk in an attempt to refresh the data. The firmware periodically writes the current media verification logical block address to nonvolatile memory so the media verification can continue from where it stopped prior to a power cycle.
2.4.10 Make Data Consistent
If it is enabled in the Integrated RAID firmware, the make data consistent (MDC) process starts automatically and runs in the background when you move a redundant volume from one LSI SAS-2 controller to another LSI SAS-2 controller. MDC compares the data on the primary and secondary disks. If MDC finds inconsistencies, it copies data from the primary disk to the secondary disk.
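As a sanity check on the 2% per hour figure for the consistency check: taking the 380516 MB volume size reported by sas2ircu, that rate does work out to just over 2 MB/s:

```shell
# 2% per hour of a 380516 MB volume, expressed in MB/s
awk 'BEGIN { printf "%.2f MB/s\n", 380516 * 0.02 / 3600 }'
# prints "2.11 MB/s"
```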
I've tried two different VMware drivers, and both seem to behave the same.
Code:
$ esxcli software vib update -v /vmfs/volumes/vmstore0_ssd0/scsi-mpt2sas-20.00.01.00-1OEM.550.0.0.1331820.x86_64.vib
Installation Result
Message: The update completed successfully, but the system needs to be rebooted for the changes to be effective.
Reboot Required: true
VIBs Installed: Avago_bootbank_scsi-mpt2sas_20.00.01.00-1OEM.550.0.0.1331820
VIBs Removed: Avago_bootbank_scsi-mpt2sas_20.00.00.00.1vmw-1OEM.550.0.0.1331820
VIBs Skipped:
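After the reboot, it is worth confirming which driver is actually active. Standard esxcli queries like these list the installed VIB and the loaded module (I'm assuming the module name matches the mpt2sas VIB above):

```shell
# List the installed mpt2sas VIB and check whether the module is loaded.
# (Assumes the driver module is named mpt2sas, as in the VIB name above.)
esxcli software vib list | grep -i mpt2sas
esxcli system module list | grep -i mpt2sas
```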
Would these background tasks be triggered after a warm reboot without showing up as ongoing in the status read by sas2ircu? For example, I don't believe I've previously seen background initialization appear in the status output when creating an array from sas2ircu.
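One way to catch a short-lived background task right after a warm reboot would be to poll sas2ircu and filter the two fields a background task would change. The grep below is shown against the status output quoted earlier; in practice, pipe the live output of `/opt/lsi/bin/sas2ircu 0 status` into the same filter from a sleep loop:

```shell
# Extract just the fields a background task would change. The here-document
# stands in for the live output of `/opt/lsi/bin/sas2ircu 0 status`.
grep -E 'Current operation|Volume state' <<'EOF'
IR Volume 1
Volume ID : 286
Current operation : None
Volume status : Enabled
Volume state : Optimal
EOF
# prints:
# Current operation : None
# Volume state : Optimal
```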
As I was not booting from them, I had not previously activated the Option ROM for the LSI cards in the BIOS. Could this be a possible culprit?
Is there any way to configure this array for better performance, especially with SSDs?
I'm leaning towards the LSI card being useful only in IT mode; perhaps I should just run the drives without RAID mirroring, or go with an all-in-one setup where the FreeNAS VM supplies the storage via iSCSI.