Storage Spaces Uneven Disk Idle Time

The Gecko

Active Member
Jan 4, 2015
107
81
28
43
I have a Windows Server 2016 machine with 2x 2TB Intel P3700 SSDs + 36x 8TB Spinning Rust. They were configured as a tiered RAID6 using Storage Spaces (not Storage Spaces Direct), using code similar to (but not exactly the same as) this:

PowerShell:
$ErrorActionPreference = "Stop";
$StoragePoolFriendlyName = 'Middle Chassis 36 Bays';

$StoragePool = Get-StoragePool -FriendlyName $StoragePoolFriendlyName;
$PhysicalDisks = $StoragePool | Get-PhysicalDisk;

# NOTE:  Two of the $PhysicalDisks were marked as hot spares.

New-VirtualDisk -StoragePoolFriendlyName $StoragePoolFriendlyName `
                -FriendlyName "RAID6" `
                -UseMaximumSize `
                -ProvisioningType Fixed `
                -ResiliencySettingName "Parity" `
                -PhysicalDiskRedundancy 2;


# Create Storage Tiers
$SpinningRust = New-StorageTier -StoragePoolFriendlyName $StoragePoolFriendlyName `
                                -FriendlyName            SpinningRust `
                                -MediaType               HDD `
                                -ResiliencySettingName   Parity `
                                -PhysicalDiskRedundancy  2;
$NVMe       = New-StorageTier -StoragePoolFriendlyName $StoragePoolFriendlyName `
                                -FriendlyName            NVMe `
                                -MediaType               SSD `
                                -ResiliencySettingName   Mirror `
                                -PhysicalDiskRedundancy  1 `
                                -NumberOfDataCopies      2;

# Create Virtual Disk (2-way NVMe mirror + Dual-Parity HDD)
# Put the Virtual Disk creation code inside a loop so it finds the max size allowable.
$SizeTB = 190;
Do {
    $FailureEncountered = $False;
    Try {
        Write-Host "Trying $SizeTB";
        $VirtualDisk = New-Volume -StoragePoolFriendlyName   $StoragePoolFriendlyName `
                                  -FriendlyName              TieredRAID6 `
                                  -StorageTierFriendlyNames  $NVMe.FriendlyName,$SpinningRust.FriendlyName `
                                  -StorageTierSizes          900GB,($SizeTB*1TB) `
                                  -WriteCacheSize            900GB `
                                  -FileSystem                ReFS `
                                  -AccessPath                "D:" `
                                  -AllocationUnitSize        64KB;
    }
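    Catch {
        # Creation failed (most likely not enough capacity): shrink the HDD tier by 1 TB and retry.
        $Error.Clear();
        $SizeTB = $SizeTB - 1;
        $FailureEncountered = $True;
    }

} While ($FailureEncountered -and $SizeTB -gt 0)  # give up at 0 TB instead of looping forever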
The virtual disk is formatted with ReFS.
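
For anyone comparing against a similar build, here is a minimal sketch of how the resulting layout can be inspected. The pool and volume names are the ones from the script above; the cmdlets are from the standard Storage module. Looking at column count and per-disk allocation first is my own assumption about where uneven load might show up, not something established in this thread.

```powershell
$StoragePoolFriendlyName = 'Middle Chassis 36 Bays'

# Tiers actually attached to the tiered virtual disk: resiliency, columns, interleave.
Get-VirtualDisk -FriendlyName 'TieredRAID6' |
    Get-StorageTier |
    Format-Table FriendlyName, MediaType, ResiliencySettingName, NumberOfColumns, Interleave, PhysicalDiskRedundancy, Size

# Per-disk view: role (Auto-Select / HotSpare), media type, and allocated space.
Get-StoragePool -FriendlyName $StoragePoolFriendlyName |
    Get-PhysicalDisk |
    Sort-Object AllocatedSize -Descending |
    Format-Table DeviceId, FriendlyName, MediaType, Usage,
        @{ Name = 'SizeTB';      Expression = { [math]::Round($_.Size / 1TB, 2) } },
        @{ Name = 'AllocatedTB'; Expression = { [math]::Round($_.AllocatedSize / 1TB, 2) } }
```

If the HDD tier reports far fewer columns than there are HDDs, or a couple of disks show much more allocated space than the rest, that would be worth comparing against the idle-time graph.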

1663689496999.png

This configuration works fine accepting incoming writes until the SSD tier fills up. After that performance tanks. I've spent some time looking at PerfMon counters trying to find something that stands out as wrong/unusual. I found it, but I don't understand it, and therefore need your help.

This is a picture of Physical Disk Idle Times (%) for the disks used in the Storage Pool. The two on the right-hand side I believe are the SSDs. Everything else is Spinning Rust. When the SSD tier fills up, it starts destaging/detiering data from SSD --> HDD. Since this is a RAID6/Double Parity volume that consumes all the available space (250TB), I would have expected all the Spinning Rust to show similar Idle Times. Instead, it looks like two of the drives are getting pounded while the other drives are mostly idle.

1663690832973.png

This suggests that I have misconfigured something and now I'm stuck with a performance-tanking bottleneck. What did I do wrong?
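
The same idle-time numbers can be pulled without the PerfMon GUI. This is a rough sketch: the 5-second interval, the 6 samples, and the disk number `12` in the lookup are placeholders, and it assumes the PerfMon instance name starts with the same disk number that `Get-PhysicalDisk` reports as `DeviceId`, which is usually but not always the case.

```powershell
# Sample % Idle Time for every PhysicalDisk instance (about 30 seconds in total).
$samples = Get-Counter -Counter '\PhysicalDisk(*)\% Idle Time' -SampleInterval 5 -MaxSamples 6

# Average per disk instance and list the five least idle (busiest) disks.
$samples.CounterSamples |
    Where-Object { $_.InstanceName -ne '_total' } |
    Group-Object InstanceName |
    ForEach-Object {
        [pscustomobject]@{
            Instance   = $_.Name
            AvgIdlePct = [math]::Round(($_.Group | Measure-Object CookedValue -Average).Average, 1)
        }
    } |
    Sort-Object AvgIdlePct |
    Select-Object -First 5

# Look up one of the busy disks by its disk number ('12' is a placeholder).
Get-PhysicalDisk |
    Where-Object DeviceId -eq '12' |
    Format-List FriendlyName, SerialNumber, MediaType, Usage
```

Knowing the `MediaType` and `Usage` of the two busy disks would answer whether they are HDDs, SSDs, or hot spares.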
 


cesmith9999

Well-Known Member
Mar 26, 2013
1,361
459
83
Do you know what those disks are?

In the Storage Spaces RAID 6-like setup, one disk is global parity.

Another thing to look at: are those disks throwing errors? Usually when disks start looking like this, they are starting to throw errors.

There is an event log for disks that are not completing writes in the time allotted.

Chris
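
A hedged sketch of how those two checks could be scripted. The reliability counters come from `Get-StorageReliabilityCounter`. Which event log Chris has in mind is not stated, so the Storage Spaces driver log used here is an assumption.

```powershell
# Cumulative error and latency counters per disk, worst write latency first.
Get-PhysicalDisk |
    ForEach-Object {
        $disk = $_
        $disk | Get-StorageReliabilityCounter |
            Select-Object @{ Name = 'DeviceId';     Expression = { $disk.DeviceId } },
                          @{ Name = 'FriendlyName'; Expression = { $disk.FriendlyName } },
                          ReadErrorsTotal, WriteErrorsTotal, ReadLatencyMax, WriteLatencyMax
    } |
    Sort-Object WriteLatencyMax -Descending |
    Format-Table -AutoSize

# Recent errors and warnings from the Storage Spaces driver log (log choice is an assumption).
Get-WinEvent -FilterHashtable @{
        LogName = 'Microsoft-Windows-StorageSpaces-Driver/Operational'
        Level   = 2, 3   # 2 = Error, 3 = Warning
    } -MaxEvents 50 -ErrorAction SilentlyContinue |
    Format-Table TimeCreated, Id, LevelDisplayName, Message -Wrap
```

A disk whose maximum latency is far above its neighbours can be struggling even when its error totals are still zero.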
 

The Gecko

The disks are HGST Ultrastar He8 8TB 3.5" 128MB 7200RPM SAS.
No errors are being thrown by the disks. I spent a lot of time inspecting them with the latest copy of Smartmontools.
There are zero entries in any of the Event Logs indicating a problem with the disks or SAS controllers.
 

Tayschrenn

New Member
Sep 22, 2020
6
0
1
What is your destage set to? Especially with double parity, you're going to have horrible writes if you're filling the SSD tier and not giving it time to destage. Use case varies too: I run my backup servers at 60% destage so I always have enough room for incrementals to hit SSD.

Also, what is your memory configuration? If you're not optimizing memory bandwidth, you're going to hurt on Storage Spaces, as it's already horrifically bad SDS code (just try a RAID 0 on SAS SSDs and see how bad).

Also, make sure you're marking the pool as power protected, or it won't leverage the cache.
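
A minimal sketch of checking two of these points, the power-protected flag and the DIMM population. The pool name is taken from the original script. The `Set-StoragePool` line is left commented out because it is only safe when the hardware really does protect in-flight writes.

```powershell
# Is the pool flagged as power protected?
Get-StoragePool -FriendlyName 'Middle Chassis 36 Bays' |
    Format-List FriendlyName, IsPowerProtected, WriteCacheSizeDefault

# Only if a UPS or power-loss-protected drives cover in-flight writes:
# Set-StoragePool -FriendlyName 'Middle Chassis 36 Bays' -IsPowerProtected $true

# Which DIMM slots are populated, and at what speed.
Get-CimInstance -ClassName Win32_PhysicalMemory |
    Format-Table DeviceLocator, BankLabel, Speed,
        @{ Name = 'CapacityGB'; Expression = { $_.Capacity / 1GB } }
```

Comparing the populated slots against the motherboard manual shows whether every memory channel is in use.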
 

Tayschrenn

Quote:
"Try to configure the dual parity with Storage Pool through the GUI. Here is a guide: Create 1 Storage Pool with 3-Way Mirror Virtual Disk and attach to the Server

If you can upgrade to 2019 or 2022, Storage Spaces (aka Pools) are way better. Especially on 2022, you will enjoy the benefits of the new storage bus cache system: Storage bus cache on Storage Spaces"

I just threw up in my mouth a little bit.

Please, don't advise someone to use the GUI.

That's horrible for most MAP (mirror-accelerated parity) builds.

And I'm... not sure it's a good idea to use that storage bus cache. It looks like just another way for Microsoft to suck at SDS.

I'd highly recommend OP check IsPowerProtected, and I'd also make sure they're using full memory channels. Microsoft software-defined storage is balls slow even doing simple (RAID 0), and double parity does worse. We moved to a full mirror/mirror for ours instead of MAP because of the shitty offload to just single parity.
 

BloodKnight7

New Member
Nov 15, 2022
12
2
3
Get some Tums for your reflux, man... Tools are there to be used depending on the conditions. GUI or PS scripts, both have their uses depending on the user's skill level and the scope of what needs to be done. Keep it simple and stupid, especially for Microsoft. In simple scenarios with a single server, it is way more efficient to use the storage pool wizard or do it through Windows Admin Center than to write a PS script with values that a novice user can mess up. The GUI will apply most of the "general" recommended best practices that Microsoft directs, and it can be used as a baseline for further troubleshooting.
 

The Gecko

I'm a bit rushed for time, so I cannot post the code I used to create this setup. I'll return to add more details later.

I had an opportunity to create a 3-node Windows Server 2022 failover cluster and host Storage Spaces Direct (S2D) upon it. All nodes were interconnected via dual-ported 100GbE NICs to a 100GbE switch. Each server had 10 (9 + 1 spare) NVMe disks. The storage pool was configured to be a two-copy mirror. If you squint, it kinda looks like a RAID 10. The clustered disk was exposed to clients via Windows scale-out file server (SMB). I used FIO on six different machines to simultaneously blast the SMB share with a variety of writes.

In S2D, only one server owns any particular clustered virtual disk. At the time of this snapshot, the clustered virtual disk was owned by the blue server. The blue server was receiving all the incoming writes. I find it extremely interesting that, again, one drive was getting pounded by the incoming writes even though this was essentially a RAID 10.

1669167912391.png
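
A sketch of how the owner node and the per-node disk load could be captured together. It assumes it is run on a cluster node with the FailoverClusters module available. The single 5-second sample is an arbitrary choice.

```powershell
# Which node currently owns each Cluster Shared Volume.
Get-ClusterSharedVolume | Format-Table Name, OwnerNode, State

# One 5-second idle-time sample per disk on every node, busiest disks first.
$nodes = (Get-ClusterNode).Name
(Get-Counter -ComputerName $nodes -Counter '\PhysicalDisk(*)\% Idle Time' -SampleInterval 5 -MaxSamples 1).CounterSamples |
    Where-Object InstanceName -ne '_total' |
    Sort-Object CookedValue |
    Select-Object -First 10 Path, CookedValue
```

The counter path includes the node name, so the output shows whether the busiest disk sits on the owner node or elsewhere.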
 

BloodKnight7

What type of cards were you using? QLogic? Mellanox? Did you use jumbo frames, and did you follow the iWARP or RoCE best practices (depending on the adapter used for RDMA) on the switch and hosts?
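
A rough sketch of how the host side of those questions can be checked with built-in cmdlets. It says nothing about the switch configuration. The `*JumboPacket` keyword is the standardized one, but how each vendor's driver displays the value varies.

```powershell
# Is RDMA enabled on each adapter?
Get-NetAdapterRdma | Format-Table Name, InterfaceDescription, Enabled

# Jumbo frame setting as exposed by the driver.
Get-NetAdapterAdvancedProperty -RegistryKeyword '*JumboPacket' -ErrorAction SilentlyContinue |
    Format-Table Name, DisplayName, DisplayValue

# Does SMB see the interfaces as RDMA capable, and is Multichannel actually using them?
Get-SmbClientNetworkInterface | Format-Table FriendlyName, RdmaCapable, RssCapable, Speed
Get-SmbMultichannelConnection | Format-Table ServerName, ClientIpAddress, ServerIpAddress, ClientRdmaCapable
```

The Multichannel output is only populated while there is an active SMB connection, so it is best run during the FIO test.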