SMB issues with osx - help request


dragonme

Active Member
Apr 12, 2016
282
25
28
Another weird issue with napp-it and mac

I have several ZFS file systems in an ESXi all-in-one, some for media shares.. all has been working well

I created a new folder in one of the SMB shares, and now I can't seem to SMB from my Mac and move files over to that file system

I can still transfer fine to the others.

I have tried to reset ACL and @everyone, tried setting permissions to match my other ZFS file systems that are still working fine.. no luck

the only way I can get it to work is by connecting with CIFS from the Mac instead of SMB for that share, which connects as SMB1 instead of 2.1

I can connect to the share, click into the subfolder, but as soon as I make a change or try to move a file, it just beachballs and hangs the Mac.. I have to either reset the SMB server or turn off sharing on that filesystem to break the hang.

is there any way to remove a corrupted 'sharing' config in the .zfs folder if something there got jacked up? I have also tried other folders on that share, no luck, so it looks like something got screwed up in the kernel SMB config..

again.. stumped...
 

gea

Well-Known Member
Dec 31, 2010
3,163
1,195
113
DE
Can you add some information such as the OSX and OmniOS release, ZFS settings like aclinherit and nbmand, and the SMB settings (Services > SMB > Properties)?

btw
There are no config files like smb.conf with SAMBA. All relevant settings of the Solaris kernel-based SMB server are either set via sharectl (oplock, encrypt, signing or SMB version; napp-it menu Services > SMB > Properties), or are pure ZFS properties (the shared folder itself, aclmode, aclinherit, nbmand), or are ACL settings. The only file in .zfs/shares with settings is the share control file. ACL settings on this file reflect the SMB share ACL.
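For reference, the server-wide settings can also be read back on the command line. A minimal sketch, assuming a root shell on the OmniOS VM; the helper name `smb_server_props` is made up for illustration, and the three property names are simply ones that show up in the `sharectl` output posted further down in this thread:

```shell
# Hypothetical helper: print a few server-wide properties of the
# illumos kernel SMB server via sharectl (run as root on the OmniOS box).
smb_server_props() {
    for prop in signing_enabled signing_required max_protocol; do
        # sharectl prints each property as name=value
        sharectl get -p "$prop" smb
    done
}

# Usage: smb_server_props
```

Each property comes back as one `name=value` line. An empty `max_protocol=` should mean that no explicit protocol cap has been set, so the server offers whatever that OmniOS release supports.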

SMB is usually 3.1.1 not 2.1

You can advertise SMB/Time Machine services to Bonjour-capable clients like OSX in menu Services - 'Bonjour and Autostart'.
For SMB locking problems, switch the ZFS nbmand property or the SMB oplock setting (switch back if that does not help).

You can disable Apple OSX SMB extensions (kernelbased Illumos SMB server) in /etc/system (reboot required):
set smbsrv:smb2_aapl_use_file_ids=1 # locking problems ex in Avid Media Composer
set smbsrv:smb2_aapl_extensions=0 # disable all Apple extensions
 

dragonme

I felt like the release info was not important, as all the other file systems on the same server, in multiple pools, are working

aclinherit and aclmode are both passthrough; I tried discard, no change, so back to passthrough like the other file systems
nbmand is off as it's only shared by SMB; guest and that other setting are not checked.
SMB properties are all stock except that I have forced signing off - that generally helps OSX with issues and performance, and it's a small home network so the risk is low
I am using an older OmniOS as it was working fine and later releases seem to be more issue/bug prone, hence SMB 2.1
Bonjour is on, but it was working fine with it off too... later OSX does not use AFP so I never bothered to go down that road; all shares are SMB

all the OSX-specific tuning I know about, but again, other file systems are working fine. I am only having an issue with one file system.. again I am thinking that somehow the ACL got corrupted or that .zfs/share file got corrupted ...

I can connect to the share from OSX, but once I make a change to a file name or try to copy a file in, it just locks up and beachballs the Mac, and the only way to stop the beachball is to turn off the SMB share on the napp-it server, or restart SMB on the napp-it server..

so, how do I delete or fully reset all info / ACL on that file system, delete .zfs/share?
what if it is a corrupt oplock or some other kind of flag? I am at wits' end.. I have not tried file transfer to that file system from a Windows client.
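Since the other file systems in the same pool behave, one way to narrow this down is to diff the SMB-relevant ZFS properties of a working file system against the failing one. A rough sketch, assuming a root shell on OmniOS; `compare_fs_props` and the dataset names are placeholders, and the property list is only a guess at what matters for SMB:

```shell
# Hypothetical helper: show SMB-relevant ZFS properties that differ
# between a working filesystem and the failing one.
compare_fs_props() {
    good="$1"   # e.g. tank/media_ok   (placeholder name)
    bad="$2"    # e.g. tank/media_bad  (placeholder name)
    props="sharesmb,aclmode,aclinherit,nbmand,casesensitivity,utf8only,normalization"
    tmp_good=$(mktemp) || return 1
    tmp_bad=$(mktemp) || { rm -f "$tmp_good"; return 1; }
    # -H: no header, tab-separated; -o: only property name and value
    zfs get -H -o property,value "$props" "$good" > "$tmp_good"
    zfs get -H -o property,value "$props" "$bad"  > "$tmp_bad"
    # diff exits non-zero when the files differ; that is the interesting case
    diff "$tmp_good" "$tmp_bad" || true
    rm -f "$tmp_good" "$tmp_bad"
}

# Usage: compare_fs_props tank/media_ok tank/media_bad
```

No output means the listed properties are identical on both file systems, which would point away from the ZFS properties and towards the ACLs on the files and folders themselves.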
 

gea

I would set nbmand to on (default ZFS property). Only netatalk (EoL, the old Apple sharing protocol) required off.

In SMB properties you should disable encrypt and signing.
You can also try switching the oplock setting, see "What is OpLock (opportunistic lock)?" (TechTarget).

You cannot delete in .zfs. The share control file is deleted automatically when you disable a share and recreated with the default everyone@=full when you enable the share again. Current napp-it remembers and can re-enable former share ACL settings. To reset all file-based ACLs recursively, use "reset ACL" (menu ZFS filesystems > ACL on folders, select a filesystem/share).

A problem with Windows and OSX from v11 on together with older OmniOS is 256-bit SMB encryption. Current OmniOS supports 256-bit encryption and fixes some other OSX sharing problems (among many other ongoing security and bug fixes).

I would suggest updating to the current OmniOS 151046 LTS. Thanks to boot environments you can go back at any time.
 

dragonme

to me it looks like the permissions on that particular file system are all jacked up... either the gui did something unexpected, or something.

I created a new test file system, shared it, mounted the SMB share, connected to it from the Mac, created a folder and subfolder using Finder, and copied files into it, no problem.

only question is where are the permission and ACL issues, the zfs file system level, folder level.. ACL and permissions with nappit free sucks since both hands are tied behind your back

and I think that doing a combination of manual and nappit gui is what screws things up

this is a fairly large file system, so copying everything over to the new test file system, deleting the old one and renaming the test to the old.. would be time consuming, plus I would lose snapshots and the backup job, so I would have to delete that filesystem on the backup array and start from scratch..

I am considering promoting the last filesystem snapshot but honestly have never had to do that so I would have to read up..
 

dragonme

spoke too soon.. I dragged two files into the share, the first one copied quickly and finished, then it hung again on the second..
something is definitely jacked.. I don't create new filesystems often, just using my existing pools and filesystems .. so this may be a regression in nappit gui 21.06a10
 

dragonme

Code:
system_comment=
max_workers=1024
netbios_enable=false
netbios_scope=
lmauth_level=4
keep_alive=5400
wins_server_1=
wins_server_2=
wins_exclude=
signing_enabled=true
signing_required=false
restrict_anonymous=false
pdc=
ads_site=
ddns_enable=false
autohome_map=/etc
ipv6_enable=false
print_enable=false
traverse_mounts=true
map=
unmap=
disposition=
max_protocol=
 

dragonme

moving to OmniOS 151046 would require me to set up an all new VM.. my current napp-it is too old to update

which I may have to do, but if the corruption is in the ACL/sharing that resides on the POOL/FILESYSTEM, that won't fix anything as the pool import will just bring back the same corruption
 

gea

What is your OmniOS release?
Updating is very easy with OmniOS from 151038 on, and every step can be undone by booting into a prior boot environment.
Log in via PuTTY as root and copy/paste the commands (paste with a right mouse click)
Code:
pkg unset-publisher omnios
pkg unset-publisher extra.omnios

pkg set-publisher -g https://pkg.omnios.org/r151046/core omnios
pkg set-publisher -g https://pkg.omnios.org/r151046/extra extra.omnios

pkg update pkg
pkg update

reboot

If your OmniOS is older than 151038, you must update in steps over the LTS releases in between (e.g. 151030, then 151038),
using the same commands with the respective LTS release number instead of 151046.

If your napp-it is too old (gives errors in menu Users), update via About > Update to at least 21.06 free.

An unnoticed file corruption on ZFS is extremely unlikely.
I would update first and check other items later if the problem remains.
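For the stepwise path, the publisher commands above can be wrapped so that only the release number changes per step. This is just a sketch: the function name `omnios_switch_release` is made up, and it assumes the repository URL pattern is the same for the older LTS releases, as described above.

```shell
# Hypothetical wrapper around the commands above: point the two OmniOS
# publishers at release $1 and update. Reboot manually afterwards.
omnios_switch_release() {
    rel="$1"   # LTS release number, e.g. 151030, 151038 or 151046
    pkg unset-publisher omnios
    pkg unset-publisher extra.omnios
    pkg set-publisher -g "https://pkg.omnios.org/r${rel}/core" omnios
    pkg set-publisher -g "https://pkg.omnios.org/r${rel}/extra" extra.omnios
    pkg update pkg   # update the package client itself first
    pkg update       # then update the OS
}

# Usage, one release per run with a reboot in between:
#   omnios_switch_release 151038
```

After each reboot, `beadm list` shows the available boot environments, so the previous release stays selectable if a step goes wrong.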
 

dragonme

@gea
I can almost guarantee that will fail coming from OmniOS 5.11 omnios-r151022-f9693432c2
 

gea

151022 is indeed very old (2017). While it is possible to update 022 -> 030 -> 038 -> 046, you may run into problems in the step to 030 due to the switch from SunSSH to OpenSSH with different settings, or because an older gcc must be removed prior to the update.

It is indeed probably easier and faster to deploy a current napp-it OVA template based on 151046 (ESXi 6.7 and up) and re-register the VMs (right mouse click on the .vmx file)
 

dragonme

Yeah that is all well and good but you still can’t explain why my other datasets work fine in the same pool

so it is not

nappit version
osx version
SMB configuration on server or client

its a permission issue or corruption on that dataset.

I have tried everything

sometimes the first file will transfer and finish; any other file or even a folder create will hang, then beachball, until server-side SMB is turned off and the SMB server reset
 

gea

I have no explanation for the different behaviours, but what I can say is that OSX SMB is quite critical, and if you google OSX SMB problems you will find many tips and tricks. What I can also say is that a newer OSX with an older SMB server gives more problems than a current combo. From Win 11 or OSX 11 on, SMB doesn't even work at all without some tweaks.

In the end you can either update OmniOS SMB or SAMBA to a current release and use the faster SMB3 (smb://ip), or stay with the old SMB1 (cifs://ip) instead. Time Machine over SMB only works with SMB3 and a newer OmniOS with OSX extensions.
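To see which dialect the Mac actually negotiated for an smb:// or cifs:// mount, macOS ships `smbutil`. A small sketch for the Mac side; the wrapper name `smb_dialects` is only for illustration:

```shell
# Hypothetical helper for the Mac side: show attributes of all mounted
# SMB shares, including the negotiated dialect (SMB_VERSION).
smb_dialects() {
    smbutil statshares -a
}

# Usage: mount the share in Finder first, then run smb_dialects in Terminal.
```

Comparing the SMB_VERSION line of the failing share with one of the working shares would show whether both are really using the same protocol version.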
 

dragonme

again.. upgrading to a newer version of napp-it is not the answer.. it seems like later versions of OpenZFS, regardless of platform, have many more issues than what I am currently using.

I think its the nappit gui that has done something to the acl/permissions of this file system.. its the only explanation on why the same Mac can have no issues with the other file systems in the same pool

if I upgrade zfs, its going to import that pool and file systems, and since the kernel smb saves its settings in that pool/filesystem's .zfs/share its just going to have the same issues.

I have found another thread here where another user had a similar issue where acl/permissions became unusable when using a combination of command line and nappit gui so I am investigating that to see if it is a potential fix

I am also seriously considering promoting a snapshot, investigating what files might have updated so I can get local copies of those before losing them in the rollback.

I am unsure if the rollback would also remediate .zfs/share file so I need to investigate that
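Before any rollback, ZFS itself can list what has changed since a snapshot, and a clone allows testing the older state without touching the original. A minimal sketch, assuming a root shell on OmniOS; the helper names and all dataset/snapshot names are placeholders. It makes no claim about whether the share control file in .zfs/shares is affected by a rollback.

```shell
# Hypothetical helpers; dataset and snapshot names are placeholders.

# List files created, modified, renamed or deleted since a snapshot,
# i.e. what a rollback to that snapshot would throw away.
changes_since() {
    zfs diff "$1"            # $1 like tank/media@daily-1
}

# Clone the snapshot into a new writable filesystem so SMB can be
# tested against the older state without touching the original.
clone_for_test() {
    zfs clone "$1" "$2"      # $1 snapshot, $2 new dataset like tank/media_test
}

# Usage:
#   changes_since tank/media@daily-1
#   clone_for_test tank/media@daily-1 tank/media_test
```

If the clone can be shared and written to from the Mac without hanging, that would suggest the problem arrived after that snapshot; if it hangs too, a rollback to that snapshot is unlikely to help.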

still looking for some assistance from someone that actually understands and knows the more advanced manipulations of zfs acl to assist in a complete reset of permissions

I just can't understand how napp-it can allow the first file to transfer fine and finish, but any other activity after that beachballs and hangs, while the same server, same pool, different filesystem does not have that issue...
 

gea

In the file .zfs/shares/'sharename' there are no SMB settings besides the ACL, which you can check in menu ZFS filesystems > ACL on shares. And as said, this file is deleted and then recreated with the setting everyone@=full when you re-enable a share.

To reset all file ACLs, either connect via SMB from Windows as root and reset the ACL recursively, or use menu ZFS filesystems > ACL on folders, then click on a filesystem. Below the menu there is a button "reset ACL". Select modify and recursive to allow everyone to modify files (with many files, wait a few seconds for the menu).

reset-acl.png
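For a command-line route to a similar result, the illumos `chmod` can replace the whole ACL recursively. This is only a sketch of what such a reset could look like, not necessarily what the napp-it button runs; the helper name `reset_acl` and the path are placeholders, and it prints the command by default because the change is recursive (take a snapshot first).

```shell
# Hypothetical helper: reset all file and folder ACLs below a path to a
# single inheritable everyone@=modify entry. Prints the command unless
# DRY_RUN=0 is set, because the change is recursive.
reset_acl() {
    path="$1"   # mountpoint of the filesystem, e.g. /tank/media (placeholder)
    acl='everyone@:modify_set:file_inherit/dir_inherit:allow'
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "/usr/bin/chmod -R A=$acl $path"
    else
        # /usr/bin/chmod is the illumos chmod, which understands A= ACL syntax
        /usr/bin/chmod -R "A=$acl" "$path"
    fi
}

# Usage:
#   reset_acl /tank/media              # dry run, only prints the command
#   DRY_RUN=0 reset_acl /tank/media    # actually applies it
```

The resulting ACLs, and the ACL on the share control file, can be inspected with `/usr/bin/ls -V` on the folder and on the file in .zfs/shares.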
 

oneplane

Well-Known Member
Jul 23, 2021
845
484
63
@dragonme : Have you checked what is actually happening on the protocol level yet? Wireshark is free. If you measure facts, you can make progress. Everything else is just guessing.
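One way to get such a capture on the Mac side is `tcpdump`, which writes a file Wireshark can open. A minimal sketch, assuming the Mac talks to the server over interface en0 and SMB runs on TCP port 445; the helper name, interface and IP address are placeholders:

```shell
# Hypothetical helper for the Mac side: capture SMB traffic (TCP port 445)
# to and from the server into a file that Wireshark can open.
# Run as root (e.g. via sudo -s); stop with Ctrl-C after reproducing the hang.
capture_smb() {
    iface="$1"    # network interface, e.g. en0 (assumption, check ifconfig)
    server="$2"   # IP address of the napp-it/OmniOS VM
    outfile="${3:-smb-hang.pcap}"
    tcpdump -i "$iface" -w "$outfile" host "$server" and port 445
}

# Usage: capture_smb en0 192.0.2.10
```

Starting the capture, copying two files until the second one hangs, and then looking at the last SMB2 request without a response should show which operation the server stops answering.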
 

gea

I am not involved in OS development (on Solaris and Illumos, storage services like iSCSI, NFS and SMB are part of the OS);
this is more a question for Topicbox.

But in general they will also say that they do not care about years-old releases. Current OmniOS is too different, as is current OSX.