Hi,
I'm wondering why I get very slow read speeds for large files (basically any file larger than my RAM), compared to top-notch async write speeds.
Here is my setup:
Up-to-date OmniOS 151028
Supermicro X10SRL-F board with a Xeon E5-1620 v3
32GB of DDR4 ECC
LSI 3008 controller
12 HGST DC HC510 SATA drives (10TB each) configured as one RAID-Z2 pool with recordsize set to 256K
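For reference, the pool and recordsize were set up roughly like this (the device names below are placeholders, not the real ones):
zpool create hdd12z2 raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 c1t8d0 c1t9d0 c1t10d0 c1t11d0
zfs set recordsize=256K hdd12z2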
Basically, as soon as I read a file that cannot fit in the ARC, I get very slow read speeds, whereas my async write speed is very good.
On an AJA speed test over a 10Gb connection with a 16GB test file, I get 950MB/s write and 900MB/s read.
If I set the file size to 64GB, I still get 950MB/s write but only around 100MB/s read.
I get the same result when copying large files through Windows Explorer.
I see the exact same effect if I disable the read cache when benchmarking (the property changes involved are sketched after the results):
Benchmark filesystem: /hdd12z2/_Pool_Benchmark
Read: filebench+dd, Write: filebench_sequential, date: 11.23.2018
time dd if=/dev/zero of=/hdd12z2/_Pool_Benchmark/syncwrite.tst bs=500000000 count=10
5000000000 bytes transferred in 3.431949 secs (1456898318 bytes/sec)
hostname XST24BA Memory size: 32661 Megabytes
pool hdd12z2 (recsize=256k, compr=off, readcache=none)
slog -
remark
Fb3                        sync=always         sync=disabled
Fb4 singlestreamwrite.f    sync=always         sync=disabled
                           246 ops             7426 ops
                           49.197 ops/s        1483.963 ops/s
                           10802us cpu/op      2042us cpu/op
                           20.2ms latency      0.7ms latency
                           49.0 MB/s           1483.8 MB/s
____________________________________________________________________________
read fb 7-9 + dd (opt)     randomread.f    randomrw.f    singlestreamr    dd
pri/sec cache=none         0.4 MB/s        0.8 MB/s      81.2 MB/s        119.9 MB/s
____________________________________________________________________________
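For the cache-disabled runs, the readcache=none / pri/sec cache=none lines above mean the read caches were switched off on the benchmark dataset; as far as I understand, that is the equivalent of:
zfs set primarycache=none hdd12z2/_Pool_Benchmark
zfs set secondarycache=none hdd12z2/_Pool_Benchmark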
If I set the recordsize to 1M, I get a little over 200MB/s in read, but that's still far less than expected (naively, ten data disks streaming in parallel should manage far more).
Benchmark filesystem: /hdd12z2/_Pool_Benchmark
Read: filebench+dd, Write: filebench_sequential, date: 11.23.2018
time dd if=/dev/zero of=/hdd12z2/_Pool_Benchmark/syncwrite.tst bs=500000000 count=10
5000000000 bytes transferred in 1.931137 secs (2589148332 bytes/sec)
hostname XST24BA Memory size: 32661 Megabytes
pool hdd12z2 (recsize=1M, compr=off, readcache=none)
slog -
remark
Fb3                        sync=always         sync=disabled
Fb4 singlestreamwrite.f    sync=always         sync=disabled
                           212 ops             9971 ops
                           42.397 ops/s        1994.163 ops/s
                           12158us cpu/op      1552us cpu/op
                           23.4ms latency      0.5ms latency
                           42.2 MB/s           1994.0 MB/s
__________________________________________________________________________
read fb 7-9 + dd (opt)     randomread.f    randomrw.f    singlestreamr    dd
pri/sec cache=none         0.4 MB/s        0.8 MB/s      200.0 MB/s       243.9 MB/s
__________________________________________________________________________
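For completeness, the recordsize change itself is just a dataset property (and it only applies to blocks written after the change, so test files have to be rewritten afterwards):
zfs set recordsize=1M hdd12z2/_Pool_Benchmark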
While I can somewhat understand the result when the cache is disabled, why do I get the same result when the cache is enabled, but only with files bigger than my RAM?
What's even stranger to me is that it's slow right from the beginning of the copy/AJA read test, as if the cache were never used.
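For reference, the ARC can be watched while the read runs using the standard illumos kstats (I'm assuming the names are unchanged on 151028):
kstat -p zfs:0:arcstats:size    # current ARC size in bytes
kstat -p zfs:0:arcstats:c       # ARC target size
kstat -p zfs:0:arcstats:hits
kstat -p zfs:0:arcstats:misses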
I used to work with XFS shares over SMB on a dedicated LSI MegaRAID card with BBU, and I never saw such a disparity between read and write performance at any file size.
I'm quite new to ZFS, so I'm trying to understand its strengths (and there are plenty!) and limits.