Long time lurker, first time poster, yadda yadda. I'll do my best to get all the pertinent data in here, sorry in advance for the length.
I've built a test rig to evaluate using ZFS as a backup and archival storage target for those items which do not make sense to store in one of our various enterprise SANs or our hyperconverged environments. The primary goal is capacity, secondarily performance, and finally some semblance of resilience.
The current build is a collection of lab hardware, not necessarily indicative of the final design goal BUT enough to test some of our workload targets:
HP DL380 gen 7 chassis
Dual socket 6c/12t 2.2GHz Xeon v3's
144GB DDR3 1066MHz ECC memory
Dual LSI 9200's (HP rebranded), each with dual 6Gbps Mini-SAS external ports
HP i420 SmartArray card, BBU and caching disabled, 8 x SATA interface (four per controller)
Booting from a mirrored pair of 32GB Samsung USB3 keys for now
Four HP SAS drive trays, can't remember the model, each one holds Qty 12 x 3.5" SAS drives (each tray connected by one 6Gbps SAS cable)
Qty 48 2TB 7200rpm HP SAS drives strewn about the four trays
Qty 2 Intel S3710 400GB SATA drives, each configured as a standalone single-drive RAID0 device in the SmartArray card, since the controller doesn't support JBOD natively. Thus, both drives are exposed individually, even though I lose SMART that way. Yeah, I know, it's not indicative of final. Write caching is disabled on them.
Oh, and running FreeNAS 11.2u1 (the current build as of this writing.)
Built the main pool out of eight RAIDZ2 vdevs of 6 drives each (keeping with the old 2^n data drives + 2 parity recommendation, so 4 + 2 here) and did NOT configure a SLOG yet. I kept all the ZFS defaults: record size = 128k, lz4 compression, no dedupe, atime on, sync=default.
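In case anyone wants to sanity-check the topology, here's the shape of the create command that layout corresponds to. This is a sketch only: the da0..da47 device names are placeholders for whatever the OS actually enumerates, and the script just prints the command rather than running anything destructive.

```shell
# Build and print (not execute) the zpool create for the
# 8 x 6-drive RAIDZ2 layout described above. da* names are placeholders.
cmd="zpool create tank"
for v in 0 1 2 3 4 5 6 7; do
  cmd="$cmd raidz2"
  for d in 0 1 2 3 4 5; do
    cmd="$cmd da$((v * 6 + d))"
  done
done
echo "$cmd"
```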
From the shell interface of FreeNAS itself (I'm not remote) I ran a simple dd if=/dev/zero of=/mnt/tank/blahddfile bs=4k count=102400
Performance was great, picked up something like 2.8Gbytes/sec. Between the RAM and compressing zeroes being really easy, this isn't a realistic figure. But hey, it's a data point.
Went back to storage config, disabled compression, disabled atime, ran the test again. Now we're seeing 800MBytes/sec -- which is more in line with what I expected to see. Obviously streaming zeros still isn't indicative of anything truly useful, but I'm really just sanity testing for now.
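One caveat for anyone reproducing this: bs=4k count=102400 is only 400MiB total, which fits entirely in RAM on this box, so the async numbers are largely measuring cache. The same test against a scratch path (the /tmp path below is a placeholder, not the pool):

```shell
# Same throughput sanity test against a scratch file.
# 4k * 102400 blocks = 419430400 bytes = 400 MiB total.
dd if=/dev/zero of=/tmp/ddtest bs=4k count=102400 2>/dev/null
# Size check works on both GNU (stat -c) and BSD/FreeNAS (stat -f):
stat -c %s /tmp/ddtest 2>/dev/null || stat -f %z /tmp/ddtest
rm -f /tmp/ddtest
```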
Then I added the pair of s3710's as a mirrored SLOG device and set the pool to sync=always. I re-ran the DD command. And I waited. And I waited.
And I waited.
WTF, is it broken? I control-C'd the task, and it barfed out 13MBytes/sec.
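For the record, the SLOG add and the sync change were just the stock commands, roughly as below. Again a sketch: da48/da49 are placeholders for the two S3710s, and the commands are printed rather than executed here so nothing destructive happens.

```shell
# Sketch of the step above; da48/da49 stand in for the S3710 pair.
echo "zpool add tank log mirror da48 da49"
echo "zfs set sync=always tank"
```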
Used the command line to rip the SLOG out of the pool. Used dd to write zeros to the SSDs, built new partitions, and built a new pool with just the two SSDs in a mirrored set. This new pool is again using the defaults: record size = 128k, sync=default, lz4, atime on, no dedupe. Reran the dd command against the new SSD-only pool, and BOOM: 2.8GBytes/sec. Yay, RAM and compression.
Turned off compression, turned off atime, ran the test again. 550MBytes/sec.
Forced sync=always. Reran the test. 330MBytes/sec. Let's be honest: these S3710s aren't Optanes, and 330MBytes/sec is absolutely OK in my book. And since I'm forcing sync on a pair of mirrored drives without a SLOG, I'm incurring double writes on these two drives (ZIL plus the data itself) to get to that 330MBytes/sec number. Honestly, this is probably better than I deserve from these drives.
OK, the drives are good. Maybe it was just a gremlin in the original pool config somehow? I blew away the SSD pool, re-added the drives to the main tank pool as a mirrored SLOG, and retested with sync=always.
Can you guess? Did you guess 13MBytes/sec? I didn't, but that's what it was.
What.
In.
The.
Eff.
Recap: the same pair of S3710s as a mirrored pool on their own, with sync forced, will knock down 330MBytes/sec. Literally the same drives, in the same mirror configuration, but now acting as a SLOG for the larger pool, wallow in at 13MBytes/sec. If I rip out the SLOG entirely and test that main pool, it will knock down >800MBytes/sec.
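One more data point from the back-of-envelope math (assuming MiB-style units): at bs=4k, those two rates imply wildly different numbers of sync writes per second, which makes this look latency-bound per write rather than bandwidth-bound.

```shell
# Implied 4k sync writes per second at each measured rate (MiB/s assumed).
echo $(( 13  * 1048576 / 4096 ))  # SLOG-attached pool: 3328 writes/sec
echo $(( 330 * 1048576 / 4096 ))  # SSD-only pool: 84480 writes/sec
```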
Something is broken. Thoughts?