Strange behavior of ZFS pool


TRACKER

Active Member
Jan 14, 2019
180
56
28
Hello,

I am seeing a strange issue with one of my pools.

I am running Solaris 11.4 on a physical machine (my home lab).

The pool consists of 4 x Samsung 860 QVO (1 TB drives).
I use it as a datastore for a couple of VMs (shared via iSCSI to an ESXi host).

Recently, whenever the pool is in use (e.g. a couple of VMs are running), one of the drives gets marked "(slow)".
One time it is drive 3, after the next boot it is drive 4, and so on. It is a different drive every time.

There are no errors in the system logs and no errors in the SMART data.
During reads, the disk marked "(slow)" is not reading data, but it is still writing data.

Do you have any idea what could cause this?

P.S. I saw that there is an Oracle document describing the issue, but I cannot access it because I don't have a support account.

Thanks a lot!
 


gea

Well-Known Member
Dec 31, 2010
3,161
1,195
113
DE
I cannot comment on the Oracle document, but I would first suggest checking:

- fill rate of the pool
- performance (if you have napp-it, run Pool > Benchmark
= a series of filebench benchmarks with sync enabled vs. disabled)
- iostat: check load/busy of the SSDs; are they roughly similar?
- HBA (type and firmware)

In general, desktop SSDs perform poorly under sustained load and when fairly full.
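
To make the iostat check concrete, here is a minimal Python sketch that compares the %b (busy) column across devices. It assumes output shaped like Solaris `iostat -xn` (eleven columns ending in %w, %b and the device name). The sample text, the device names, and the helper names `parse_iostat` and `outliers` are hypothetical and only for illustration; the numbers are invented, not measurements from this pool.

```python
from statistics import median

# Hypothetical sample shaped like Solaris `iostat -xn` output (invented numbers).
SAMPLE = """\
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
  310.0   45.0 9920.0 1440.0  0.0  0.4    0.0    1.1   0  31 c0t1d0
  305.0   44.0 9760.0 1408.0  0.0  0.4    0.0    1.2   0  30 c0t2d0
    0.0   46.0    0.0 1472.0  0.0  0.1    0.0    0.9   0   4 c0t3d0
  312.0   45.0 9984.0 1440.0  0.0  0.4    0.0    1.1   0  32 c0t4d0
"""


def parse_iostat(text):
    """Return {device: busy_percent} from extended iostat output."""
    busy = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 11:
            continue  # banner or blank line
        try:
            values = [float(f) for f in fields[:10]]
        except ValueError:
            continue  # column header line
        busy[fields[10]] = values[9]  # %b is the tenth column
    return busy


def outliers(busy, tolerance=0.5):
    """Devices whose %b differs from the median by more than tolerance * median."""
    mid = median(busy.values())
    return sorted(d for d, b in busy.items() if abs(b - mid) > tolerance * mid)


if __name__ == "__main__":
    print(outliers(parse_iostat(SAMPLE)))
```

In a mirror or RAID-Z pool under a steady workload, the SSDs would normally be expected to show roughly similar busy values, so a device that stands out is worth a closer look. The 50% tolerance is an arbitrary starting point.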
 

TRACKER

Hello gea,

thanks for the advice :)

- Fill rate of the pool: ~60%
- I don't have any plugins installed; it is pure Solaris 11.4 GA
- iostat does not show anything unusual. If I run a scrub, it reads data from all disks; it just does not read from the (slow) disk during normal operation
- HBA: IBM 1110 (4-port SAS/SATA, based on LSI SAS 2004)

Actually, after some time the (slow) marker disappeared and the pool began to read data from that drive again.
I haven't changed anything; the OS "fixed" it automatically.
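
Since the marker comes and goes on its own, one way to catch it is to log per-device read and write rates and flag the pattern described here: a disk that keeps writing while its peers serve nearly all the reads. The following is a minimal Python sketch of that check only. The function name `read_starved`, the thresholds, and the sample rates are assumptions for illustration; how the rates are collected (for example from periodic iostat samples) is left open.

```python
def read_starved(stats, min_peer_reads=50.0, ratio=0.05):
    """Flag devices that are writing but barely reading compared to their peers.

    stats maps device name -> (reads_per_s, writes_per_s).
    """
    top = max((r for r, _ in stats.values()), default=0.0)
    if top < min_peer_reads:
        return []  # pool is nearly idle for reads, nothing to compare against
    return sorted(d for d, (r, w) in stats.items() if w > 0 and r < ratio * top)


# Hypothetical per-device rates (invented numbers, not taken from this pool).
stats = {
    "disk1": (310.0, 45.0),
    "disk2": (305.0, 44.0),
    "disk3": (0.0, 46.0),
    "disk4": (312.0, 45.0),
}

if __name__ == "__main__":
    print(read_starved(stats))
```

The `min_peer_reads` guard avoids false alarms when the whole pool is idle; both thresholds would need tuning for a real workload.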

Yes, I know desktop SSDs are not a very good choice, but I got 4 TB for 80 euro, so... it was a really "good deal" :)
I also know that the 860 QVO are QLC drives with a 40 GB SLC cache. From using these drives for the last 8 months, I can tell they behave decently well
during heavy loads (e.g. SAP HANA consistency checks for two DBs running in parallel).

I forgot to mention: I use the pool with a zvol and iSCSI (via COMSTAR), connected to two ESXi hosts running 6.7 U3.
 

Stephan

Well-Known Member
Apr 21, 2017
929
706
93
Germany
Is your ZFS new enough to support TRIM for the underlying devices? Something like zpool trim, or zpool set autotrim=on? Maybe your SSDs are feeling a little choked and need some hints for garbage collection.
 

TRACKER

Hello Stephan, I have to dig deeper into that, as I am not sure how to check whether TRIM is enabled (or even supported on SATA disks in Solaris 11.4; I assume it is).
My understanding is that "choking" could happen when there are lots of writes, but in my case there were mostly reads rather than writes.