Windows SMB performance issues


creamelectricart

New Member
Feb 5, 2019
Hi all,

We recently set up a Solaris 11.4 server based on the following hardware:

SuperMicro SSG-6049P-E1CR45L
2x XEON Silver 4114 10 Core 2.2GHz
(LSI 3008 IT mode)
512GB DDR4-2666 ECC RAM
2 x Intel S4600 240GB
2 x Intel Optane 905P 480GB U.2
24 x HGST 10TB SAS
2 x Intel X540-T2 10GbE NIC


Our main storage pool is set up as follows:

4 groups of 6 physical 10TB drives in RAID-Z2
1 x Intel Optane 905P 480GB U.2 - cache
1 x Intel Optane 905P 480GB U.2 - log

lz4 compression is enabled on the shared filesystem.


Sharing is via the Solaris CIFS (SMB) service, to about 32 clients in total. Almost all clients read or write to the server only for short periods, when they save files or do renders, usually with files between 500 MB and 4 GB in size.


The problem we are having is that Windows SMB clients show erratic performance. When it works, the Windows clients will copy files at up to 1 GB/s over 10GbE, usually hovering around 500-700 MB/s. The problem is that at frequent and random times, the copy will completely stall and drop to 0 bytes/s. It often remains stalled for a few minutes and then resumes, ramping back up to the previous speed. Sometimes it will stall again.


These stalls only seem to happen on Windows SMB clients and not macOS clients.

The stalls only seem to happen when writing to the server, not reading from it.

Clients are running Windows 10 1709. Solaris version is 11.4.5.3.0.

I can't correlate the stalls with anything happening on the server at the time.

If anyone has any thoughts about what might be causing this, it would be greatly appreciated. I can't seem to figure out what might cause this behaviour.

Tristan
 


m4r1k

Member
Nov 4, 2016
Solaris 11.4 is still quite young and many changes were made. Have you tried the latest SRU (SRU 6)? And what about Oracle support? Maybe it's something specific to your environment, but I would double-check with them as well.

Also, you can initially check pool performance while copying data with a simple zpool iostat -v 1.
If you do that while the I/O hangs, do you see something like a flush on every disk?
I know it sounds annoying, but you should also verify every single step in your chain: network bandwidth, individual disk I/O, etc.
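
For example, run these in separate terminals while a client copies (pool and interface names are placeholders for your own):

# per-vdev throughput once per second; watch whether every vdev drops to zero together
zpool iostat -v tank 1
# link-level throughput on the 10GbE interface
dladm show-link -s -i 1 net0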
The macOS SMB client is really bad; try a Linux distro, which will use the latest SMB version (macOS is still at 3.0.2).
 

gea

Well-Known Member
Dec 31, 2010
Basically I would assume the problem is the write throttling in Solaris: every write goes first to the RAM-based write cache, which is flushed roughly every 5 s. The cache is sized to hold about 5 s worth of writes (1 GB/s x 5 s = 5 GB). So if your write rate drops to zero at roughly 5 s intervals, this is probably cache related.

The main question now is whether the incoming writes exceed the pool's sustained write capacity. That would explain full-speed writes while they land in the write cache, and zero while the full cache is committed to the pool.
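
Rough arithmetic, and how to inspect the flush interval (I assume the tunable keeps its usual name in 11.4; run as root):

# At ~1 GB/s ingest and a 5 s transaction-group interval:
#   1 GB/s x 5 s = ~5 GB buffered in RAM per txg
# If the pool only sustains ~600 MB/s, committing 5 GB takes ~8 s,
# during which incoming writes are throttled toward zero.
echo "zfs_txg_timeout/D" | mdb -k    # print the txg interval in seconds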

Question:
You have an Slog. Have you enabled sync?
That would limit your maximum possible write performance to around 600 MB/s, even with an Optane as Slog.
Disable sync to test. If this is a pure filer you probably do not want sync: ZFS itself is always consistent, and you need sync mainly for databases or foreign filesystems (VM storage). For a pure filer the security advantage is quite small.
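
To test (dataset name is an example):

zfs get sync tank/share              # 'standard' = honour client sync/flush requests
zfs set sync=disabled tank/share     # acknowledge writes from RAM only, bypassing the Slog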

With 512 GB RAM I would check arcstat for the L2ARC.
Possibly the L2ARC is not used at all; it may even slow things down, as it requires additional work.
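
Even without arcstat you can read the raw L2ARC counters:

kstat -p zfs:0:arcstats:l2_hits zfs:0:arcstats:l2_misses zfs:0:arcstats:l2_size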

Increasing the TCP buffers and enabling jumbo frames helps overall throughput, as does turning interrupt throttling off on the Windows side. You may also update Windows and try the newest Windows NIC drivers, e.g. those from Intel directly. That may take Windows to 700-900 MB/s.
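
On the Solaris side something like this (values and link name are examples only):

ipadm set-prop -p max_buf=4194304 tcp     # raise the TCP buffer ceiling
ipadm set-prop -p send_buf=1048576 tcp
ipadm set-prop -p recv_buf=1048576 tcp
dladm set-linkprop -p mtu=9000 net0       # jumbo frames; switch and clients must match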
 

creamelectricart

New Member
Feb 5, 2019
Hi all, thanks for the quick replies.

Solaris 11.4 is still quite young and many changes were made. Have you tried the latest SRU (SRU 6)? And what about Oracle support? Maybe it's something specific to your environment, but I would double-check with them as well.
I just tried updating to 11.4.6.4.0 and unfortunately it made no difference. We did not see these issues on Solaris 11.3, so maybe it is something that changed in 11.4.

I am also checking with Oracle support.

Also, you can initially check pool performance while copying data with a simple zpool iostat -v 1.
If you do that while the I/O hangs, do you see something like a flush on every disk?
Forgive my ignorance, but how would I see a flush on each disk? I ran the command and can see all the disks in the pool writing at a good speed initially; then, when the copy stalls, reads and writes on all drives drop to zero.

The macOS SMB client is really bad; try a Linux distro, which will use the latest SMB version (macOS is still at 3.0.2).
Yeah, we've learnt that the macOS client is pretty bad (we have a different set of issues with it), but this stalling of file transfers only seems to occur under Windows; ironically, macOS does not have this issue.

Basically I would assume the problem is the write throttling in Solaris: every write goes first to the RAM-based write cache, which is flushed roughly every 5 s. The cache is sized to hold about 5 s worth of writes (1 GB/s x 5 s = 5 GB). So if your write rate drops to zero at roughly 5 s intervals, this is probably cache related.
How would I determine whether the two are correlated? At the moment the stalls don't necessarily happen every five seconds, and a stall may last minutes. I am testing now with only one client writing to the server after a reboot, and it stalls almost immediately after the copy starts.

You have an Slog. Have you enabled sync?
Currently sync is enabled on the filesystem. Would that cause these kinds of stalls? I never saw this on 11.3. Overall performance is not really the issue: I get a consistent 1 GB/s transfer speed on some clients (except, of course, when it stalls!).

I tried turning sync off for this filesystem anyway, and unfortunately it made no difference.

With 512 GB RAM I would check arcstat for the L2ARC.
Possibly the L2ARC is not used at all; it may even slow things down, as it requires additional work.
Do you have a link to a version of arcstat that works with Solaris? The only one I could find was for illumos, and it doesn't work on Solaris. All the old links to the original version in the Oracle blogs are dead, unfortunately.
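
In the meantime I can at least pull the raw counters directly:

kstat -pn arcstats | egrep 'l2_(hits|misses|size)'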

Increasing the TCP buffers and enabling jumbo frames helps overall throughput, as does turning interrupt throttling off on the Windows side. You may also update Windows and try the newest Windows NIC drivers, e.g. those from Intel directly. That may take Windows to 700-900 MB/s.
We have the latest drivers from Intel installed on the Windows clients, and we're getting very good performance apart from the stalling.


I've just set nbmand=off on the filesystem, and that seems to have fixed the problem. I'm now getting a consistent 1.01-1.10 GB/s on large file transfers to the server from the fastest clients.
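
For reference, the change was just this (with our dataset name in place of the placeholder):

zfs set nbmand=off <filesystem>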

I'll wait and see if it comes back (sometimes it's hard to reproduce, other times it's almost constant), but it seems better for now. Thanks for all your help.
 


creamelectricart

New Member
Feb 5, 2019
I spoke too soon. The problem persists even after setting nbmand=off on the filesystem.

Tried disabling oplocks with:

zfs set share.smb.oplocks=disabled <filesystem>

and that didn't make any difference either.

One thing I have noticed is that it only seems to happen when the client is connected at 10Gb/s. If I force the connection to 1Gb/s, I can't reproduce the issue.

So far I have tried:

Client side
• Disable Windows AutoTune
• Update to latest Intel ethernet drivers
• Force NIC negotiation to 1Gb/s - no stalling. Speeds are obviously greatly reduced but at least consistent

Server side
• Update to latest Solaris SRU
• Disable sync on the filesystem
• Disable nbmand on the filesystem
• Disable oplocks on the share
• Removed the log device

With the exception of downgrading the NIC speed, none of the above has made any impact on the issue.

Observations:

• The issue happens more during heavier server usage (making it harder to reproduce when testing outside of work hours)
• The issue does not occur on macOS 10.13+ connected via smb:// at either 1GbE or 10GbE
• The issue only seems to occur on writes to the server, not reading from it
• When one Windows client is stalling, other Windows clients stall in a similar fashion
• Sometimes the stalls result in errors like the one attached
• If the Windows client NIC is forced to negotiate at 1Gb/s instead of 10Gb/s, the issue is mitigated (albeit at greatly reduced overall speed)

Does anyone have a link to an arcstat that works on Solaris 11.4?
 

Attachments


creamelectricart

New Member
Feb 5, 2019
Output of 'kstat -pn arcstats':

zfs:0:arcstats:buf_size 1120487808
zfs:0:arcstats:c 473409246464
zfs:0:arcstats:c_max 548324048896
zfs:0:arcstats:c_min 5493977907
zfs:0:arcstats:class misc
zfs:0:arcstats:crtime 2369981.5340904
zfs:0:arcstats:data_freed 6402593728
zfs:0:arcstats:data_size 473603307264
zfs:0:arcstats:ddt_bufs 0
zfs:0:arcstats:ddt_hits 0
zfs:0:arcstats:ddt_lsize 0
zfs:0:arcstats:ddt_misses 0
zfs:0:arcstats:ddt_raw_size 0
zfs:0:arcstats:ddt_size 0
zfs:0:arcstats:demand_data_hits 37744691
zfs:0:arcstats:demand_data_misses 6865861
zfs:0:arcstats:demand_metadata_hits 89363923
zfs:0:arcstats:demand_metadata_misses 1226324
zfs:0:arcstats:evict_l2_cached 59705736704
zfs:0:arcstats:evict_l2_eligible 0
zfs:0:arcstats:evict_l2_ineligible 1373967429632
zfs:0:arcstats:evict_prefetch 1036779520
zfs:0:arcstats:evicted_mfu 373135591424
zfs:0:arcstats:evicted_mru 1060537574912
zfs:0:arcstats:ghosts_deleted 7579711
zfs:0:arcstats:hash_chain_max 17
zfs:0:arcstats:hash_chains 7191564
zfs:0:arcstats:hash_collisions 31602726
zfs:0:arcstats:hash_elements 28842014
zfs:0:arcstats:hash_elements_max 29081381
zfs:0:arcstats:hits 131446408
zfs:0:arcstats:l2_abort_lowmem 0
zfs:0:arcstats:l2_cksum_bad 0
zfs:0:arcstats:l2_feeds 77408
zfs:0:arcstats:l2_hdr_size 1837516976
zfs:0:arcstats:l2_hits 661333
zfs:0:arcstats:l2_imports 0
zfs:0:arcstats:l2_io_error 0
zfs:0:arcstats:l2_misses 963233
zfs:0:arcstats:l2_persistence_hits 976012
zfs:0:arcstats:l2_read_bytes 9653356032
zfs:0:arcstats:l2_rw_clash 0
zfs:0:arcstats:l2_size 166005263872
zfs:0:arcstats:l2_write_bytes 2415584256
zfs:0:arcstats:l2_writes_done 397
zfs:0:arcstats:l2_writes_error 0
zfs:0:arcstats:l2_writes_sent 373
zfs:0:arcstats:lowmem_delay_count 7
zfs:0:arcstats:memory_throttle_count 0
zfs:0:arcstats:meta_limit 0
zfs:0:arcstats:meta_max 7759004560
zfs:0:arcstats:meta_used 4816286224
zfs:0:arcstats:mfu_ghost_hits 12280
zfs:0:arcstats:mfu_hits 119713291
zfs:0:arcstats:misses 8092185
zfs:0:arcstats:mru_ghost_hits 1710212
zfs:0:arcstats:mru_hits 23599542
zfs:0:arcstats:mutex_miss 33012
zfs:0:arcstats:other_size 1604943408
zfs:0:arcstats:p 138287919927
zfs:0:arcstats:prefetch_behind_prefetch 342737
zfs:0:arcstats:prefetch_data_hits 4213357
zfs:0:arcstats:prefetch_joins 446380
zfs:0:arcstats:prefetch_meta_size 33565104
zfs:0:arcstats:prefetch_metadata_hits 124437
zfs:0:arcstats:prefetch_reads 4667150
zfs:0:arcstats:prefetch_size 366870528
zfs:0:arcstats:rawdata_size 219772928
zfs:0:arcstats:size 473603307264
zfs:0:arcstats:snaptime 2447075.1182666
 

gea

Well-Known Member
Dec 31, 2010
Just to decide whether it is a network (or other) problem or a pool problem:
can you create a pool on the Optane 905 and check whether the behaviour is the same?

If it is OK there, check iostat during the writes. Look at whether all disks behave similarly regarding busy and wait; a single weak disk can also produce this kind of behaviour.
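
A minimal sketch (pool and device names are examples only):

zpool create optest c0t1d0                 # single Optane as a throwaway test pool
zfs create -o compression=lz4 optest/share
zfs set share.smb=on optest/share          # share it over SMB

iostat -xn 1                               # during a copy: compare %b (busy) and wait across disks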
 

creamelectricart

New Member
Feb 5, 2019
Hi gea, thanks for all your help and suggestions so far.

Just to decide if its a network/other or a pool problem.
Can you create a pool on the Optane 905 and check if behaviour is the same.
I just created a pool on one of the Optane drives and shared it, and I'm still having the same problem with that one drive. Would that suggest it's a network problem?
 

creamelectricart

New Member
Feb 5, 2019
I ran snoop on the server while the copy problem was present. There are lots of 'Unknown Length' messages; I'm not sure whether this is an issue or not:

548 0.00001 192.168.1.131 -> 192.168.1.2 NBT Type=SESSION MESSAGE Length=1456
549 0.00001 192.168.1.131 -> 192.168.1.2 NBT Type=Unknown Length=1456
550 0.00001 192.168.1.2 -> 192.168.1.131 SMB R port=55748
551 0.00005 192.168.1.131 -> 192.168.1.2 NBT Type=SESSION MESSAGE Length=1456
552 0.00001 192.168.1.131 -> 192.168.1.2 NBT Type=Unknown Length=1456
553 0.00001 192.168.1.131 -> 192.168.1.2 NBT Type=Unknown Length=1456
554 0.00002 192.168.1.131 -> 192.168.1.2 NBT Type=Unknown Length=1456
555 0.00001 192.168.1.131 -> 192.168.1.2 NBT Type=Unknown Length=1456

I also used tcpdump and compared a problem-free copy to the server with a stalled copy. The main difference was the presence of several lines like the one below in the stalled copy: the server ACKing with SACK blocks, which as I understand it means it is missing segments from the client, i.e. packet loss on the write path:

10:56:49.811973 IP saltgum.microsoft-ds > tr2990wx-01.xxxxx.net.56853: Flags [.], ack 775249932, win 32804, options [nop,nop,sack 1 {775251392:775252852}], length 0
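
If it really is packet loss, I would expect the TCP retransmission/duplicate-segment counters to move during a stall, so I will watch these on both ends (counter names may vary by release):

netstat -s -P tcp | egrep -i 'retrans|dup'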
 

creamelectricart

New Member
Feb 5, 2019
Hi gea, limiting the protocol sounds like a good idea. I've been looking through the documentation but haven't yet been able to find a way to do this, unfortunately.

I've already opened a service request with Oracle and provided their required info, hoping to hear back from them soon.

Thanks for your help.
 

gea

Well-Known Member
Dec 31, 2010
sharectl get smb shows all the SMB properties that you can set via sharectl (like server_maxprotocol=3).
See also the napp-it menu Services > SMB > Properties.
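
For example:

sharectl get smb                             # list all current SMB service properties
sharectl set -p server_maxprotocol=2 smb     # cap the negotiated dialect, e.g. at SMB2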
 

creamelectricart

New Member
Feb 5, 2019
So I tried all of the following, with no success unfortunately:

• Turn off oplocks with - sharectl set -p oplock_enable=false smb
• Turn off multichannel with - sharectl set -p multichannel_enable=false smb
• Turn off ipv6 with - sharectl set -p ipv6_enable=false smb
• Try SMB 2 - sharectl set -p server_maxprotocol=2 smb
• Try SMB 1 - sharectl set -p server_maxprotocol=1 smb

I can easily reproduce this issue every time: I start a large sequential file copy from one Windows client to the server, then as soon as I start a copy from a different client, both clients stall. So the server seems able to handle only one SMB client at a time.
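
For repeatable tests I kick off the copies with something like this on each Windows client (paths and share name are placeholders):

:: /J uses unbuffered I/O, recommended for large sequential files
robocopy C:\testdata \\server\share bigfile.bin /J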
 

creamelectricart

New Member
Feb 5, 2019
Here you can see copy performance from two different clients when copying individually:

SingleMachineCopy.mov

And here you can see what happens when a copy is running on one client and another copy is started on a second client: both clients stall. A copy that should take about 5 seconds takes 10 minutes instead.

TwoMachineCopy.mov

(The second link takes quite a while for Jumpshare to process; it is probably easier to download it to view.)
 

creamelectricart

New Member
Feb 5, 2019
I was able to reproduce the issue over NFS as well as SMB from Windows 10, but I cannot reproduce it over SFTP.
 

creamelectricart

New Member
Feb 5, 2019
Just to update: I did a test install of Solaris 11.3 on the same server and set up a test pool on an Intel Optane device. I was unable to reproduce the issue under 11.3.

Single-device pool: Intel Optane 905P 480GB (read 2600 MB/s, write 2200 MB/s, 550,000 IOPS), compression=lz4

Solaris 11.3.1.5.0
Copy from server - ~700 MB/s
Copy to server - ~370 MB/s
--> No stalling when copying to the server from multiple clients

Solaris 11.3.35.6.0
Copy from server - ~670 MB/s
Copy to server - ~820 MB/s
--> No stalling when copying to the server from multiple clients

Solaris 11.4
Copy from server - ~1000 MB/s
Copy to server - ~1000 MB/s
--> Stalling when copying to the server from multiple clients. Performance slows to a crawl; copies take up to 132 times longer than expected.
 