I have a strange issue on my ZFS Napp-it back-up storage.
When my back-up jobs are running, the pool on the storage is busy at or near 100% most of the time, but the individual disks are only around 50% busy.
I am using SATA spinning disks, which are slow to begin with, and I don't want them to be the bottleneck.
Can someone please advise me on what is wrong with my setup, or how to troubleshoot it? (I had a single LUN and split it into 2 LUNs, but that didn't help.)
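To narrow down where the queueing happens, it may help to capture per-vdev and ARC statistics while a job runs. A sketch of what I'd collect with the standard illumos/OmniOS tools (exact flags can differ per build):

```shell
#!/bin/sh
# Sample per-vdev throughput and queue depth once a second, 5 samples;
# if the pool-level queue is deep while per-disk queues stay short,
# the wait is above the disks (ZIO pipeline / COMSTAR), not in them.
zpool iostat -v Tank1 1 5

# Pool fill level and fragmentation -- a pool past ~80% full or heavily
# fragmented slows allocation even when the disks look half idle.
zpool list Tank1

# ARC size and hit/miss counters, to see whether the 17 GB ARC cap
# is limiting read caching during the backup window.
kstat -p zfs:0:arcstats:size zfs:0:arcstats:hits zfs:0:arcstats:misses
```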
Specs:
Supermicro 4U chassis 846E16-R1200B
X8DTN+-F board, 24 GB RAM, Xeon CPU
LSI 2008 HBA in IT mode
2x 1 TB (rpool)
16x 8 TB: two 8-disk RAIDZ2 vdevs striped (Tank1)
Dual-port 10 GbE NIC, single link in use
Pool settings:
recordsize 128K
compression lz4
dedup off
zfs_arc_max 17 GB
2x ZFS volumes (128K volblocksize) served as iSCSI LUNs
connected to an ESXi server with a single Server 2016 VM (Veeam Backup): 8 cores, 88 GB RAM, 10 GbE VMXNET3 for iSCSI
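Since the Veeam traffic arrives over iSCSI, it's also worth confirming that the LUN block size matches the 128K volblocksize and checking the zvol's sync policy. A sketch using the stock COMSTAR/ZFS tools:

```shell
# Show the COMSTAR logical units, including Block Size and
# Writeback Cache settings, to compare against the zvol geometry.
stmfadm list-lu -v

# The zvol's own block size and sync behaviour; sync=standard with
# latency logbias means sync writes from the initiator hit the pool
# directly, since there is no separate log device in this pool.
zfs get volblocksize,sync,logbias Tank1/ZFS2_Veeam2
```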
# zpool status
pool: Tank1
config:
NAME STATE READ WRITE CKSUM
Tank1 ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
c0t5000C500A45A4389d0 ONLINE 0 0 0
c0t5000C500A45A73EAd0 ONLINE 0 0 0
c0t5000C500A45A75A2d0 ONLINE 0 0 0
c0t5000C500A45B6EB5d0 ONLINE 0 0 0
c0t5000C500A45D5DFFd0 ONLINE 0 0 0
c0t5000C500A45E6E92d0 ONLINE 0 0 0
c0t5000C500A45E8218d0 ONLINE 0 0 0
c0t5000C500A45EB962d0 ONLINE 0 0 0
raidz2-1 ONLINE 0 0 0
c0t5000C500A52BBA32d0 ONLINE 0 0 0
c0t5000C500A52BD754d0 ONLINE 0 0 0
c0t5000C500A52C33E7d0 ONLINE 0 0 0
c0t5000C500A52C80CAd0 ONLINE 0 0 0
c0t5000C500A52C9BD5d0 ONLINE 0 0 0
c0t5000C500A52F1CF1d0 ONLINE 0 0 0
c0t5000C500A5D84CFDd0 ONLINE 0 0 0
c0t5000C500A60E9679d0 ONLINE 0 0 0
errors: No known data errors
# iostat -x 1
device r/s w/s kr/s kw/s wait actv svc_t %w %b
rpool 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
Tank1 1220.0 1557.0 18759.8 83255.0 36.1 9.6 16.5 15 96
sd0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
sd1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
sd2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
sd3 79.0 107.0 1180.0 6587.9 0.0 0.6 3.0 0 36
sd4 80.0 109.0 1316.0 6567.9 0.0 0.5 2.6 0 32
sd5 79.0 107.0 1108.0 6563.9 0.0 0.8 4.2 0 42
sd6 88.0 114.0 1452.0 6567.9 0.0 0.7 3.3 0 42
sd7 33.0 107.0 320.0 6579.9 0.0 0.5 3.2 0 30
sd8 80.0 109.0 1368.0 6543.9 0.0 0.6 3.1 0 36
sd9 87.0 110.0 1468.0 6567.9 0.0 0.5 2.6 0 36
sd10 34.0 120.0 332.0 6547.9 0.0 0.3 2.1 0 23
sd11 74.0 87.0 828.0 3836.0 0.0 0.6 3.9 0 39
sd12 81.0 86.0 1396.0 3880.0 0.0 0.5 3.1 0 39
sd13 78.0 86.0 1296.0 3852.0 0.0 0.7 4.0 0 38
sd14 79.0 86.0 1308.0 3872.0 0.0 0.5 3.1 0 36
sd15 92.0 90.0 1560.0 3800.0 0.0 0.8 4.6 0 47
sd16 79.0 84.0 812.0 3844.0 0.0 0.7 4.4 0 43
sd18 94.0 83.0 1612.0 3808.0 0.0 0.6 3.6 0 37
sd19 83.0 88.0 1404.0 3836.0 0.0 0.6 3.5 0 38
# sar -d 1 10
device %busy avque r+w/s blks/s avwait avserv
stmf_lu_ 91 2.3 643 74852 0.2 3.4
stmf_lu_ 10 1.1 211 36170 0.0 5.3
stmf_tgt 92 3.5 854 111021 0.2 3.9
Tank1 89 47.6 3935 328682 8.7 3.3
busiest disk at that moment:
sd10 43 1.2 293 20429 0.0 4.1
# mpstat 5
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 102 0 40 512 107 3926 5 504 1057 21 82 0 8 0 92
1 183 0 46 157 11 3862 6 566 1296 24 178 1 8 0 92
2 271 0 35 141 8 2749 5 420 1161 24 236 1 7 0 93
3 278 0 43 147 9 2780 6 508 1110 24 222 1 7 0 92
4 293 0 38 3926 3803 2749 29 459 1339 20 251 1 9 0 90
5 363 0 33 1906 1766 2448 6 380 1081 23 316 1 8 0 91
6 355 0 45 901 761 3638 6 505 1250 23 273 1 8 0 92
7 207 0 43 188 10 3079 4 346 1754 17 123 0 9 0 90
# smartctl -a /dev/rdsk/c0t5000C500A60E9679d0
smartctl 6.6 2017-11-05 r4594 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: ST8000NM0055-1RM112
Serial Number: ZA19TZPY
LU WWN Device Id: 5 000c50 0a60e9679
Firmware Version: SN04
User Capacity: 8,001,563,222,016 bytes [8.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-3 T13/2161-D revision 5
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
# zfs get all Tank1
NAME PROPERTY VALUE SOURCE
Tank1 type filesystem -
Tank1 creation Wed Nov 22 21:41 2017 -
Tank1 used 33.3T -
Tank1 available 46.6T -
Tank1 referenced 188K -
Tank1 compressratio 1.35x -
Tank1 mounted yes -
Tank1 quota none default
Tank1 reservation none default
Tank1 recordsize 128K local
Tank1 mountpoint /Tank1 default
Tank1 sharenfs off default
Tank1 checksum on default
Tank1 compression lz4 local
Tank1 atime off local
Tank1 devices on default
Tank1 exec on default
Tank1 setuid on default
Tank1 readonly off default
Tank1 zoned off default
Tank1 snapdir hidden default
Tank1 aclmode discard default
Tank1 aclinherit restricted default
Tank1 canmount on default
Tank1 xattr on default
Tank1 copies 1 default
Tank1 version 5 -
Tank1 utf8only off -
Tank1 normalization none -
Tank1 casesensitivity sensitive -
Tank1 vscan off default
Tank1 nbmand off default
Tank1 sharesmb off default
Tank1 refquota none default
Tank1 refreservation none local
Tank1 primarycache all local
Tank1 secondarycache all local
Tank1 usedbysnapshots 119K -
Tank1 usedbydataset 188K -
Tank1 usedbychildren 33.3T -
Tank1 usedbyrefreservation 0 -
Tank1 logbias latency default
Tank1 dedup off default
Tank1 mlslabel none default
Tank1 sync standard local
Tank1 refcompressratio 1.00x -
Tank1 written 0 -
Tank1 logicalused 41.4T -
Tank1 logicalreferenced 36.5K -
Tank1 filesystem_limit none default
Tank1 snapshot_limit none default
Tank1 filesystem_count none default
Tank1 snapshot_count none default
Tank1 redundant_metadata all default
# zfs get all Tank1/ZFS2_Veeam2
NAME PROPERTY VALUE SOURCE
Tank1/ZFS2_Veeam2 type volume -
Tank1/ZFS2_Veeam2 creation Tue Apr 21 9:10 2020 -
Tank1/ZFS2_Veeam2 used 1.10T -
Tank1/ZFS2_Veeam2 available 46.6T -
Tank1/ZFS2_Veeam2 referenced 1.10T -
Tank1/ZFS2_Veeam2 compressratio 1.05x -
Tank1/ZFS2_Veeam2 reservation none default
Tank1/ZFS2_Veeam2 volsize 20T local
Tank1/ZFS2_Veeam2 volblocksize 128K -
Tank1/ZFS2_Veeam2 checksum on default
Tank1/ZFS2_Veeam2 compression lz4 inherited from Tank1
Tank1/ZFS2_Veeam2 readonly off default
Tank1/ZFS2_Veeam2 copies 1 default
Tank1/ZFS2_Veeam2 refreservation none default
Tank1/ZFS2_Veeam2 primarycache all inherited from Tank1
Tank1/ZFS2_Veeam2 secondarycache all inherited from Tank1
Tank1/ZFS2_Veeam2 usedbysnapshots 0 -
Tank1/ZFS2_Veeam2 usedbydataset 1.10T -
Tank1/ZFS2_Veeam2 usedbychildren 0 -
Tank1/ZFS2_Veeam2 usedbyrefreservation 0 -
Tank1/ZFS2_Veeam2 logbias latency default
Tank1/ZFS2_Veeam2 dedup off default
Tank1/ZFS2_Veeam2 mlslabel none default
Tank1/ZFS2_Veeam2 sync standard inherited from Tank1
Tank1/ZFS2_Veeam2 refcompressratio 1.05x -
Tank1/ZFS2_Veeam2 written 1.10T -
Tank1/ZFS2_Veeam2 logicalused 1.15T -
Tank1/ZFS2_Veeam2 logicalreferenced 1.15T -
Tank1/ZFS2_Veeam2 snapshot_limit none default
Tank1/ZFS2_Veeam2 snapshot_count none default
Tank1/ZFS2_Veeam2 redundant_metadata all default