Windows SMB performance issues

creamelectricart · Feb 25, 2019

Hi all,

We recently set up a Solaris 11.4 server based on the following hardware;

SuperMicro SSG-6049P-E1CR45L
2x XEON Silver 4114 10 Core 2.2GHz
(LSI 3008 IT mode)
512GB DDR4-2666 ECC RAM
2 x Intel S4600 240GB
2 x Intel Optane 905P 480GB U.2
24 x HGST 10TB SAS
2 x Intel X540T2 10Gbe NIC

Our main storage pool is setup like so;

4 groups of 6 physical 10TB drives in RAID-Z2
1 x Intel Optane 905P 480GB U.2 - cache
1 x Intel Optane 905P 480GB U.2 - log

lz4 compression is enabled on the shared filesystem.

Sharing is via the Solaris CIFS service, sharing to about 32 total clients. Almost all the clients would have only small periods of reading or writing to the server when they save files or do renders, usually files between 500mb - 4GB in size.

The problem we are having is that Windows clients with SMB have random performance - when it works, the Windows SMB clients will copy files at up to a GB/sec over 10Gbe, usually hovering around 500-700MB/s. The problem is is that at frequent and random times, the copy will just completely stall and drop to 0 bytes/s. It will remain stalled often for a few minutes and then resume, ramping up to the previous speed. Sometimes it will stall again.

These stalls only seem to happen on Windows SMB clients and not macOS clients.

The stalls only seem to happen when writing to the server, not reading from it.

Clients are running Windows 10 1709. Solaris version is 11.4.5.3.0.

I can't see any correspondence to anything happening on the server that would seem to cause this.

If anyone has any thoughts about what might be causing this, it would be greatly appreciated. I can't seem to figure out what might cause this behaviour.

Tristan

acquacow · Feb 25, 2019

I had similar issues once and bookmarked these two pages in relation to it. I did both and my problems went away, not sure which really was the fix:

A word about AutoTuningLevel – TCP Receive Window Auto-Tuning Level explained

Fix: Slow Internet After Windows 10 Creators Update

I also maxed all my TCP buffers in my NIC drivers and enabled jumbo frames on the nics and my switch ports.

m4r1k · Feb 25, 2019

Solaris 11.4 is still quite young and many changes were made. Have you tried with the latest SRU6? And also how about oracle support? Maybe it’s something specific in your environment but I would double check with them as well.

Also, you can initially check the pool performance while copying data with a simple zpool iostat -v 1.
If you do that, while I/O hang, do you see like a flush on every disk?
I know it sounds annoying but you should also verify every single step in your chain. Network bandwidth, individual disk I/O, etc.
MacOS smb client is really bad, try with a Linux distro which will use latest SMB version (macOS is still at 3.02).

gea · Feb 26, 2019

Basically I would asume that the problem is the write throtteling in Solaris where every write goes first to the rambased writecache and is flushed around every 5s. The size of the cache is as large as the writes in 5s (1GB x 5s = 5GB). So when your write rate goes to zero with intervals of 5s this is propably cache related.

The main question seems now if the overall writes are higher than the pool write capacity. This would declare the writes at full performance to the writecache and zero when the cache is full and must be committed to pool.

Question:
You have an Slog. Have you enabled sync?
This would limit your max possible write performance to around 600 MB/s - even with an Optane as Slog.
Disable sync to try, If this is a pure filer you propably do not want sync. ZFS is always consistent, you need sync mainly fot databases or foreign filesystems (VM storage). For a pure filer the security advantage is quite small.

With 512 GB Ram I would check arcstat for l2arc.
Possibly the L2Arc is not used at all - may even slow down as it requires additional work.

Increasing tcp buffers and Jumboframes helps for overall throughput as well as int-throtteling=off on Windows. You may also update Windows and try the newest Windows nic drivers ex thos from Intel directly. This may jump Windows to 700-900 MB/s

creamelectricart · Feb 26, 2019

Hi all, thanks for the quick replies.

m4r1k said:
Solaris 11.4 is still quite young and many changes were made. Have you tried with the latest SRU6? And also how about oracle support? Maybe it’s something specific in your environment but I would double check with them as well.

I just tried updating to 11.4.6.4.0 then and it made no difference unfortunately. We did not see these issues in Solaris 11.3 so maybe this issue is something that they have changed in 11.4.

I am also checking with Oracle support.

m4r1k said:
Also, you can initially check the pool performance while copying data with a simple zpool iostat -v 1.
If you do that, while I/O hang, do you see like a flush on every disk?

Forgive my ignorance, but how would I see a flush on each disk? I ran the command and can see all the disks in the pool initially writing at a good speed to all drives, then when the copy stalls, the reads/writes all drop off to zero.

m4r1k said:
MacOS smb client is really bad, try with a Linux distro which will use latest SMB version (macOS is still at 3.02).

Yeah we've learnt that the macOS client is pretty bad, we have a different set of issues with it, but this stalling of file transfers only seems to occur under Windows for some reason, ironically macOS does not have this issue.

gea said:
Basically I would asume that the problem is the write throtteling in Solaris where every write goes first to the rambased writecache and is flushed around every 5s. The size of the cache is as large as the writes in 5s (1GB x 5s = 5GB). So when your write rate goes to zero with intervals of 5s this is propably cache related.

How would I determine if the two are correlated? At the moment the stalls don't necessarily happen every five seconds, and the stall may last minutes. I am doing testing now where I only have one client writing to the server after a reboot and it's stalling almost immediately after starting the copy.

gea said:
You have an Slog. Have you enabled sync?

Currently sync is enabled on the filesystem. Would this cause these kinds of stalls? I have never seen this before with 11.3. Overall performance is not really the issue - I can get consistently 1GB/sec transfer speeds on some clients (except of course for when it stalls!).

I tried anyway turning sync off for this filesystem and unfortunately it made no difference.

gea said:
With 512 GB Ram I would check arcstat for l2arc.
Possibly the L2Arc is not used at all - may even slow down as it requires additional work.

Do you have a link for a version of arcstat that will work with Solaris? The only one I could find was for Illumnos and it doesn't work for Solaris unfortunately. All the old links in the Oracle blogs for the original version are dead unfortunately.

gea said:
Increasing tcp buffers and Jumboframes helps for overall throughput as well as int-throtteling=off on Windows. You may also update Windows and try the newest Windows nic drivers ex thos from Intel directly. This may jump Windows to 700-900 MB/s

We have the latest drivers from Intel installed on the Windows clients and we're getting very good performance apart from the stalling.

I've just changed nbmand=off for the filesystem and that seems to have fixed the problem. Now getting consistently between 1.01-1.10GB/sec on large file transfers to the server on the fastest clients.

Will wait to see if it comes back again (sometimes it's hard to reproduce and other times it's almost constant) but seems better for now. Thanks for all your help.

gea · Feb 26, 2019

Another locking related problem may be the oplock behaviour
see OmniOS/ZFS/Windows 7: "Save as" from within applications lags 5 seconds for all file sizes over CIFS/SMB
SMB Shares - Managing SMB File Sharing and Windows Interoperability in Oracle® Solaris 11.4

I have not tried in Solaris yet (setting is propably different to OmniOS) but this may describe the difference between OSX and Windows especially when you use SMB1 on OSX (via cifs://solaris)

creamelectricart · Feb 26, 2019

I spoke too soon. The problem persists even after turning nbmand=off on the filesystem.

Tried disabling oplocks with;

zfs set share.smb.oplocks=disabled

and that didn't make any difference either.

One thing I have noticed is that it only seems to happen when the client is connected at 10Gb/s. If I force the connection to 1Gb/s then I can't reproduce the issue.

So far I have tried;

Client side
• Disable Windows AutoTune
• Update to latest Intel ethernet drivers
• Force NIC negotiation to 1Gb/s - no stalling. Speeds are obviously greatly reduced but at least consistent

Server side
• Update to latest Solaris SRU
• Disable sync on the filesystem
• Disable nbmand on the filesystem
• Disable oplocks on the share
• Removed the log device

With the exception of downgrading the NIC speed, none of the above has made any impact to the issue.

Observations;

• The issue happens more during heavier server usage (making it harder to reproduce when testing outside of work hours)
• The issue does not occur on macOS 10.13+ connected via smb:// on either 1Gbe or 10Gbe
• The issue only seems to occur on writes to the server, not reading from it
• When one Windows client is stalling, other Windows clients stall in a similar fashion
• Sometimes the stalls result in errors like the one attached
• If Windows client NIC is forced to negotiate at 1Gb/s instead of 10Gb/s the issue is mitigated (however at greatly reduced overall speed)

Did anyone have a link for arcstat that would work on Solaris 11.4?

creamelectricart · Feb 26, 2019

I also see several entries like this in /var/adm/messages;

WARNING: smb_kdoor_chkhdr[async_response]: call failed: 3

creamelectricart · Feb 26, 2019

Output of 'kstat -pn arcstats'

zfs:0:arcstats:buf_size 1120487808
zfs:0:arcstats:c 473409246464
zfs:0:arcstats:c_max 548324048896
zfs:0:arcstats:c_min 5493977907
zfs:0:arcstats:class misc
zfs:0:arcstats:crtime 2369981.5340904
zfs:0:arcstats:data_freed 6402593728
zfs:0:arcstats:data_size 473603307264
zfs:0:arcstats:ddt_bufs 0
zfs:0:arcstats:ddt_hits 0
zfs:0:arcstats:ddt_lsize 0
zfs:0:arcstats:ddt_misses 0
zfs:0:arcstats:ddt_raw_size 0
zfs:0:arcstats:ddt_size 0
zfs:0:arcstats:demand_data_hits 37744691
zfs:0:arcstats:demand_data_misses 6865861
zfs:0:arcstats:demand_metadata_hits 89363923
zfs:0:arcstats:demand_metadata_misses 1226324
zfs:0:arcstats:evict_l2_cached 59705736704
zfs:0:arcstats:evict_l2_eligible 0
zfs:0:arcstats:evict_l2_ineligible 1373967429632
zfs:0:arcstats:evict_prefetch 1036779520
zfs:0:arcstats:evicted_mfu 373135591424
zfs:0:arcstats:evicted_mru 1060537574912
zfs:0:arcstats:ghosts_deleted 7579711
zfs:0:arcstats:hash_chain_max 17
zfs:0:arcstats:hash_chains 7191564
zfs:0:arcstats:hash_collisions 31602726
zfs:0:arcstats:hash_elements 28842014
zfs:0:arcstats:hash_elements_max 29081381
zfs:0:arcstats:hits 131446408
zfs:0:arcstats:l2_abort_lowmem 0
zfs:0:arcstats:l2_cksum_bad 0
zfs:0:arcstats:l2_feeds 77408
zfs:0:arcstats:l2_hdr_size 1837516976
zfs:0:arcstats:l2_hits 661333
zfs:0:arcstats:l2_imports 0
zfs:0:arcstats:l2_io_error 0
zfs:0:arcstats:l2_misses 963233
zfs:0:arcstats:l2_persistence_hits 976012
zfs:0:arcstats:l2_read_bytes 9653356032
zfs:0:arcstats:l2_rw_clash 0
zfs:0:arcstats:l2_size 166005263872
zfs:0:arcstats:l2_write_bytes 2415584256
zfs:0:arcstats:l2_writes_done 397
zfs:0:arcstats:l2_writes_error 0
zfs:0:arcstats:l2_writes_sent 373
zfs:0:arcstats:lowmem_delay_count 7
zfs:0:arcstats:memory_throttle_count 0
zfs:0:arcstats:meta_limit 0
zfs:0:arcstats:meta_max 7759004560
zfs:0:arcstats:meta_used 4816286224
zfs:0:arcstats:mfu_ghost_hits 12280
zfs:0:arcstats:mfu_hits 119713291
zfs:0:arcstats:misses 8092185
zfs:0:arcstats:mru_ghost_hits 1710212
zfs:0:arcstats:mru_hits 23599542
zfs:0:arcstats:mutex_miss 33012
zfs:0:arcstats

ther_size 1604943408
zfs:0:arcstats

138287919927
zfs:0:arcstats

refetch_behind_prefetch 342737
zfs:0:arcstats

refetch_data_hits 4213357
zfs:0:arcstats

refetch_joins 446380
zfs:0:arcstats

refetch_meta_size 33565104
zfs:0:arcstats

refetch_metadata_hits 124437
zfs:0:arcstats

refetch_reads 4667150
zfs:0:arcstats

refetch_size 366870528
zfs:0:arcstats:rawdata_size 219772928
zfs:0:arcstats:size 473603307264
zfs:0:arcstats:snaptime 2447075.1182666

gea · Feb 27, 2019

Just to decide if its a network/other or a pool problem.
Can you create a pool on the Optane 905 and check if behaviour is the same.

If its ok there, check iostat during writes. Look if all disks behave similar regarding busy and wait. A single weak disk may also result in such a behaviour.

creamelectricart · Feb 27, 2019

Hi gea, thanks for all your help and suggestions so far.

gea said:
Just to decide if its a network/other or a pool problem.
Can you create a pool on the Optane 905 and check if behaviour is the same.

I just created a pool on one of the Optane drives and shared it and I'm still having the same problem with that one drive. Would that suggest it's a network problem?

creamelectricart · Feb 27, 2019

I ran snoop on the server when the copy problem was present - there are lots of 'Unknown Length' messages - not sure if this is an issue or not;

548 0.00001 192.168.1.131 -> 192.168.1.2 NBT Type=SESSION MESSAGE Length=1456
549 0.00001 192.168.1.131 -> 192.168.1.2 NBT Type=Unknown Length=1456
550 0.00001 192.168.1.2 -> 192.168.1.131 SMB R port=55748
551 0.00005 192.168.1.131 -> 192.168.1.2 NBT Type=SESSION MESSAGE Length=1456
552 0.00001 192.168.1.131 -> 192.168.1.2 NBT Type=Unknown Length=1456
553 0.00001 192.168.1.131 -> 192.168.1.2 NBT Type=Unknown Length=1456
554 0.00002 192.168.1.131 -> 192.168.1.2 NBT Type=Unknown Length=1456
555 0.00001 192.168.1.131 -> 192.168.1.2 NBT Type=Unknown Length=1456

I also used tcpdump and compared a problem free copy to the server with the stalled copy - the main difference was the presence of several lines like this in the stalled copy;

10:56:49.811973 IP saltgum.microsoft-ds > tr2990wx-01.xxxxx.net.56853: Flags [.], ack 775249932, win 32804, options [nop,nop,sack 1 {775251392:775252852}], length 0

gea · Feb 27, 2019

At this stage, it may be a support case with Oracle (maybe your thread on the Oracle forum gives an answer from Oracle devs, Windows SMB client performance issues | Oracle Community).

Maybe you can limit SMB to protocol from 3.1 to 2 or 1 to check if it is related to the new smb 3.1

creamelectricart · Feb 28, 2019

Hi gea, limiting the procotol sounds like a good idea. I've been looking through the documentation but have yet been able to find a way to do this unfortunately.

I've already opened a service request with Oracle and provided their required info, hoping to hear back from them soon.

Thanks for your help.

gea · Feb 28, 2019

sharectl get smb
shows all smb properties that you can set via sharectrl

(like server_maxprotocol=3)
see also napp-it menu Services > SMB > Properties

creamelectricart · Mar 1, 2019

Thanks gea, that is super useful. I got lost in a maze of Oracle documents and could not find this. Thanks again!

creamelectricart · Mar 1, 2019

So I tried all of the following with no success unfortunately;

• Turn off oplocks with - sharectl set -p oplock_enable=false smb
• Turn off multichannel with - sharectl set -p multichannel_enable=false smb
• Turn off ipv6 with - sharectl set -p ipv6_enable=false smb
• Try SMB 2 - sharectl set -p server_maxprotocol=2 smb
• Try SMB 1 - sharectl set -p server_maxprotocol=1 smb

I can easily reproduce this issue every time - I simply start a large sequential file copy on one Windows client to the server, then as soon as I start a copy from a different client, both clients stall. So basically the server can only seem to handle one SMB client at a time.

creamelectricart · Mar 1, 2019

Here you can see copy performance from two different clients when copying individually;

SingleMachineCopy.mov

And then here you can see what happens to the performance when a copy is started on one client, and then as soon as another copy is started on a different client, both clients stall. A copy that should take about 5 seconds takes 10 minutes instead.

TwoMachineCopy.mov

(the second link takes quite a while for Jumpshare to process - it is probably easier to download it to view).

creamelectricart · Mar 4, 2019

I was able to reproduce this issue over NFS. I can reproduce this over NFS and SMB in Windows 10. I cannot reproduce the issue over SFTP.

creamelectricart · Mar 11, 2019

Just to update. I did a test install of Solaris 11.3 on the same server, and set up a test pool using an Intel Optane device. I was unable to reproduce the issue under 11.3.

Single device pool

Intel Optane 905P 480GB (Read 2600MB/s Write 2200MB/s 550000 IOPS)

compression=lz4

Solaris 11.3.1.5.0

Copy from server - ~700MB/s
Copy to server - ~370MB/s

--> No stalling when copying to the server from multiple clients

Solaris 11.3.35.6.0

Copy from server - ~670MB/s
Copy to server - ~820MB/s

-->No stalling when copying to the server from multiple clients

Solaris 11.4

Copy from server - ~1000MB/s
Copy to server - ~1000MB/s

--> Stalling when copying to the server from multiple clients. Performance slows to a crawl. Copies take up to 132 times longer than expected.

Windows SMB performance issues

New Member

Attachments

Well-Known Member

Member

Well-Known Member

New Member

Well-Known Member

New Member

Attachments

New Member

New Member

Well-Known Member

New Member

New Member

Well-Known Member

New Member

Well-Known Member

New Member

New Member

New Member

New Member

New Member